Summary: | [igb] [panic] RELENG_9 panics on boot in IGB driver - [regression] from 8.2 | | |
---|---|---|---|
Product: | Base System | Reporter: | Frank Terhaar-Yonkers <fty> |
Component: | kern | Assignee: | freebsd-net (Nobody) <net> |
Status: | Closed FIXED | | |
Severity: | Affects Only Me | CC: | sbruno, shurd |
Priority: | Normal | Keywords: | IntelNetworking |
Version: | Unspecified | | |
Hardware: | Any | | |
OS: | Any | | |
Description
Frank Terhaar-Yonkers
2011-10-28 20:50:08 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-net

reclassify.

On Fri, Oct 28, 2011 at 07:43:28PM +0000, Frank Terhaar-Yonkers wrote:
F>
F> >Number: 162110
F> >Category: kern
F> >Synopsis: Releng_9 panics on boot in IGB driver - regression from 8.2
F> >Confidential: no
F> >Severity: critical
F> >Priority: high
F> >Responsible: freebsd-bugs
F> >State: open
F> >Quarter:
F> >Keywords:
F> >Date-Required:
F> >Class: sw-bug
F> >Submitter-Id: current-users
F> >Arrival-Date: Fri Oct 28 19:50:08 UTC 2011
F> >Closed-Date:
F> >Last-Modified:
F> >Originator: Frank Terhaar-Yonkers
F> >Release: Releng_9 CVSUP 2011-October-28
F> >Organization:
F> Cisco
F> >Environment:
F> FreeBSD fty-zfs-01 9.0-RC1 FreeBSD 9.0-RC1 #1: Fri Oct 28 06:50:23 EDT 2011 toot@fty-zfs-01:/usr/obj/usr/src/sys/GENERIC amd64
F> >Description:
F> if_igb driver panics during bootup.
F>
F> The IGB driver probes the device at line 591 of if_igb.c and punts:
F>         if (e1000_validate_nvm_checksum(&adapter->hw) < 0) {
F>                 device_printf(dev,
F>                     "The EEPROM Checksum Is Not Valid\n");
F>                 error = EIO;
F>                 goto err_late;
F>         }
F>
F> The kernel immediately panics with a page fault. The trace-back show it's in the if_igb driver as the console messages suggest.
F>
F> Releng_8 did not panic, so this is a regression. The IGB NIC most likely has some sort of problem which is properly diagnosed.
F>
F> Email me if you want the screen shot of the panic, or have a fix to try out.

To reproduce your problem, I've put a '|| 1)' conditional into the code quoted above, so that the checksum failure branch is always taken. It turned out that calling igb_detach() after an igb_attach() failure is full of landmines. The attached patch fixes a lot of them, and at least the kernel doesn't panic when e1000_validate_nvm_checksum() fails; I'm not sure about other cases. Unfortunately the patch will not fix your NIC, it only cures the panic.

I've Cc'ed Jack Vogel, who is the maintainer of the Intel NIC drivers in FreeBSD. Maybe he can help you. Jack, please consider including my patch in the next version of the driver.
The issues fixed:
- igb_detach() may be called with an uninitialized ifp
- igb_stop() may be called with an uninitialized ifp
- igb_detach() already frees the transmit/receive structures
- igb_detach() already frees adapter->mta
- igb_detach() already destroys the core lock

There are probably other edge cases where the kernel panics due to some failure in igb_attach(); not all possible error exits were tested.

--
Totus tuus, Glebius.

gleb made this patch quite a long time ago. The error/shutdown code is still broken.

batch change:

For bugs that match the following
- Status Is In progress
AND
- Untouched since 2018-01-01.
AND
- Affects Base System OR Documentation

DO:

Reset to open status.

Note: I did a quick pass but if you are getting this email it might be worthwhile to double check to see if this bug ought to be closed.
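For readers following the list of issues fixed, here is a minimal user-space sketch of the general pattern involved: an attach routine that can fail part-way, paired with a teardown routine that releases only what was actually set up. It is not the patch from this report and not the real if_igb code. Every name in it (struct sketch_adapter, sketch_attach(), sketch_release(), the nvm_checksum_ok flag) is hypothetical, and malloc()/free() merely stand in for the driver's real resources.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a driver softc; NOT the real struct adapter. */
struct sketch_adapter {
    bool  lock_ready;  /* models the core lock having been initialized */
    void *mta;         /* models the multicast table array */
    void *rings;       /* models the transmit/receive structures */
    void *ifp;         /* models the ifnet, which is set up late in attach */
};

/*
 * Release whatever has been set up so far. Every step checks or tolerates
 * a missing resource, so this is safe on a partially attached adapter and
 * safe to call more than once.
 */
static void
sketch_release(struct sketch_adapter *sc)
{
    assert(sc != NULL);

    if (sc->ifp != NULL) {      /* touch the interface only if it exists */
        free(sc->ifp);
        sc->ifp = NULL;
    }
    free(sc->rings);            /* free(NULL) is a no-op */
    sc->rings = NULL;
    free(sc->mta);
    sc->mta = NULL;
    sc->lock_ready = false;     /* models destroying the lock exactly once */
}

/*
 * Attach in stages. nvm_checksum_ok is a hypothetical flag that models the
 * EEPROM checksum validation succeeding or failing.
 */
static int
sketch_attach(struct sketch_adapter *sc, bool nvm_checksum_ok)
{
    int error;

    *sc = (struct sketch_adapter){0};   /* start from a known-empty state */
    sc->lock_ready = true;

    sc->mta = malloc(64);
    if (sc->mta == NULL) {
        error = ENOMEM;
        goto fail;
    }
    if (!nvm_checksum_ok) {             /* early failure: ifp not set up yet */
        error = EIO;
        goto fail;
    }
    sc->rings = malloc(64);
    if (sc->rings == NULL) {
        error = ENOMEM;
        goto fail;
    }
    sc->ifp = malloc(64);
    if (sc->ifp == NULL) {
        error = ENOMEM;
        goto fail;
    }
    return (0);

fail:
    /* Single teardown path; nothing is freed or destroyed twice. */
    sketch_release(sc);
    return (error);
}
```

Calling sketch_attach(&sc, false) models the checksum failure described in the report: it returns EIO and leaves every field cleared, so a later sketch_release(&sc) has nothing left to free. Whether the real driver is better served by a single guarded teardown like this or by staged error labels is a design choice this sketch does not settle.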