Bug 208339 - Early failure in ixl_attach causes kernel panic
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 10.2-STABLE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: Eric Joyner
URL: https://reviews.freebsd.org/D5205
Keywords: IntelNetworking, crash
Depends on:
Blocks:
 
Reported: 2016-03-27 18:21 UTC by Mike Hibler
Modified: 2018-12-19 02:53 UTC
CC List: 3 users

See Also:
koobs: mfc-stable10-


Description Mike Hibler 2016-03-27 18:21:18 UTC
A FreeBSD 10-STABLE kernel built with the dev/ixl driver from FreeBSD-CURRENT panics during attach:
----

ixl0: <Intel(R) Ethernet Connection XL710 Driver, Version - 1.4.3> mem 0x93000000-0x93ffffff,0x94018000-0x9401ffff irq 42 at device 0.0 on pci6
ixl0: Using MSIX interrupts with 9 vectors
ixl0: PF reset failure fffffff1


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x42
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff805be550
stack pointer           = 0x28:0xffffffff8424c0c0
frame pointer           = 0x28:0xffffffff8424c110
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (swapper)
trap number             = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff809aa470 at kdb_backtrace+0x60
#1 0xffffffff8096d1c6 at vpanic+0x126
#2 0xffffffff8096d093 at panic+0x43
#3 0xffffffff80d71fcb at trap_fatal+0x36b
#4 0xffffffff80d722cd at trap_pfault+0x2ed
#5 0xffffffff80d7194a at trap+0x47a
#6 0xffffffff80d579a2 at calltrap+0x8
#7 0xffffffff805bf850 at ixl_attach+0xf00
#8 0xffffffff809a069d at device_attach+0x43d
#9 0xffffffff809a17dd at bus_generic_attach+0x2d
#10 0xffffffff80381c8c at acpi_pci_attach+0x15c
#11 0xffffffff809a069d at device_attach+0x43d
#12 0xffffffff809a17dd at bus_generic_attach+0x2d
#13 0xffffffff80383dbc at acpi_pcib_attach+0x22c
#14 0xffffffff8038501f at acpi_pcib_pci_attach+0x9f
#15 0xffffffff809a069d at device_attach+0x43d
#16 0xffffffff809a17dd at bus_generic_attach+0x2d
#17 0xffffffff80381c8c at acpi_pci_attach+0x15c
Uptime: 1s

----
The panic is actually in ixl_free_vsi because vsi->queues has not been initialized.
Note that ANY early error prior to:
----

        /* Set up VSI and queues */
        if (ixl_setup_stations(pf) != 0) {
                device_printf(dev, "setup stations failed!\n");
                error = ENOMEM;
                goto err_mac_hmc;
        }

        /* Initialize mac filter list for VSI */
        SLIST_INIT(&vsi->ftl);

----
is going to cause this panic. I am aware of base r295946 but that does not fix the problem.
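
The ordering problem can be modelled outside the kernel. The following is a minimal sketch with hypothetical model_* names (not the real ixl structures or functions), assuming the queue count is already known when the early failure happens. It shows that a failure before the setup step reaches the shared error label with the queue array still NULL:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in; this is not the driver's real VSI type. */
struct model_vsi {
	int *queues;        /* stays NULL until the setup step succeeds */
	int  num_queues;
};

enum model_stage { MODEL_FAIL_NONE, MODEL_FAIL_RESET, MODEL_FAIL_SETUP };

/*
 * Models an attach routine with one shared error label.
 * *saw_null is set to 1 when cleanup runs while queues is still NULL,
 * which is the state an unguarded teardown would dereference.
 */
static int
model_attach(struct model_vsi *vsi, enum model_stage fail, int *saw_null)
{
	int error = 0;

	*saw_null = 0;
	vsi->queues = NULL;
	vsi->num_queues = 8;              /* assumed to be known early */

	if (fail == MODEL_FAIL_RESET) {   /* early failure, before setup */
		error = EIO;
		goto err_out;
	}

	vsi->queues = calloc((size_t)vsi->num_queues, sizeof(*vsi->queues));
	if (vsi->queues == NULL || fail == MODEL_FAIL_SETUP) {
		error = ENOMEM;
		goto err_out;
	}
	return (0);

err_out:
	if (vsi->queues == NULL)
		*saw_null = 1;            /* teardown must not touch queues here */
	free(vsi->queues);                /* free(NULL) is a no-op */
	vsi->queues = NULL;
	return (error);
}
```

Calling model_attach() with MODEL_FAIL_RESET sets *saw_null, while a failure at MODEL_FAIL_SETUP (after allocation) does not.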

The hack fix is:
----

*** if_ixl.c.orig	Sun Mar 27 12:17:28 2016
--- if_ixl.c	Sun Mar 27 11:41:43 2016
***************
*** 731,737 ****
  	i40e_shutdown_adminq(hw);
  err_out:
  	ixl_free_pci_resources(pf);
! 	ixl_free_vsi(vsi);
  	IXL_PF_LOCK_DESTROY(pf);
  	return (error);
  }
--- 731,738 ----
  	i40e_shutdown_adminq(hw);
  err_out:
  	ixl_free_pci_resources(pf);
! 	if (vsi->queues != NULL)
! 		ixl_free_vsi(vsi);
  	IXL_PF_LOCK_DESTROY(pf);
  	return (error);
  	}
----
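
A possible alternative to guarding the call site is to make the teardown routine itself tolerate a VSI whose queues were never allocated. This is only a sketch with hypothetical toy_* types and is not how the driver was actually fixed:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the real ixl structures. */
struct toy_queue {
	int active;
};

struct toy_vsi {
	struct toy_queue *queues;   /* NULL when setup never ran */
	int               num_queues;
};

/* Returns the number of queues torn down; 0 when nothing was allocated. */
static int
toy_free_vsi(struct toy_vsi *vsi)
{
	int n = 0;

	/* Early attach failures arrive here with queues still NULL. */
	if (vsi == NULL || vsi->queues == NULL)
		return (0);

	for (int i = 0; i < vsi->num_queues; i++) {
		vsi->queues[i].active = 0;
		n++;
	}
	free(vsi->queues);
	vsi->queues = NULL;         /* makes a second call harmless */
	return (n);
}
```

With the check inside the callee, every error label can call the teardown unconditionally, at the cost of hiding callers that invoke it too early.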
Comment 1 Jeff Pieper 2016-03-29 13:26:03 UTC
This was fixed in r208339: https://reviews.freebsd.org/D5205.

It still needs to be MFCed.
Comment 2 Jeff Pieper 2016-03-29 13:29:52 UTC
I meant r295946. We do not see this on 11-CURRENT.
Comment 3 Mike Hibler 2016-03-30 19:25:15 UTC
No, as I mentioned in my original report, r295946 does NOT fix the problem. 11-CURRENT crashes too.

You can simulate the failure by changing the driver:

Index: if_ixl.c
===================================================================
--- if_ixl.c    (revision 297415)
+++ if_ixl.c    (working copy)
@@ -550,7 +550,7 @@
        /* Establish a clean starting point */
        i40e_clear_hw(hw);
        error = i40e_pf_reset(hw);
-       if (error) {
+       if (1 || error) {
                device_printf(dev,"PF reset failure %x\n", error);
                error = EIO;
                goto err_out;

and then rebuilding and booting the kernel on a machine with an ixl interface.
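
Instead of editing the condition by hand, the same forced failure could be expressed as a compile-time switch. This is only a sketch: SIM_FORCE_RESET_FAIL and the sim_* functions are hypothetical names invented for illustration and do not exist in the driver.

```c
#include <errno.h>

/* Hypothetical knob: build with -DSIM_FORCE_RESET_FAIL=1 to force the error path. */
#ifndef SIM_FORCE_RESET_FAIL
#define SIM_FORCE_RESET_FAIL 0
#endif

/* Stand-in for the hardware reset; always succeeds in this sketch. */
static int
sim_pf_reset(void)
{
	return (0);
}

/* Returns 0 on success, EIO when the reset fails or a failure is forced. */
static int
sim_attach_reset_step(int force)
{
	int error = sim_pf_reset();

	if (force || SIM_FORCE_RESET_FAIL || error)
		return (EIO);
	return (0);
}
```

The force argument plays the role of the "1 ||" in the diff above without requiring a source edit for each test run.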
Comment 4 Eric Joyner 2018-12-19 00:20:43 UTC
This was fixed in 11.
Comment 5 Kubilay Kocak 2018-12-19 02:51:40 UTC
Assign to committer that resolved
Update resolution (Resolution by commit -> FIXED)