Bug 240610 - iflib: Panic with INVARIANTS: general protection fault when kldunload'ing (12.1-pre-QA)
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.0-STABLE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Eric Joyner
URL:
Keywords: crash
Depends on:
Blocks: 240700
Reported: 2019-09-16 08:40 UTC by Harald Schmalzbauer
Modified: 2019-10-01 02:23 UTC (History)

See Also:
koobs: mfc-stable11-
erj: mfc-stable12+


Description Harald Schmalzbauer 2019-09-16 08:40:38 UTC
Hello,

This panic happens when I kldunload if_igb(4) on 12.1-PRERELEASE with a debug kernel:

Fatal trap 9: general protection fault while in kernel mode                                                                             
cpuid = 1; apic id = 01                                                                                                                
instruction pointer     = 0x20:0xffffffff80613313
stack pointer           = 0x28:0xfffffe00005e1710                                                                                      
frame pointer           = 0x28:0xfffffe00005e1740                                                                                      
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1                                                                       
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1941 (kldunload)
trap number             = 9                                                                                                            
panic: general protection fault
cpuid = 1
time = 1568622439                                                                                                                      
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00005e1420                                                         
vpanic() at vpanic+0x19d/frame 0xfffffe00005e1470                                                                                      
panic() at panic+0x43/frame 0xfffffe00005e14d0
trap_fatal() at trap_fatal+0x39c/frame 0xfffffe00005e1530
trap() at trap+0x6c/frame 0xfffffe00005e1640                                                                                           
calltrap() at calltrap+0x8/frame 0xfffffe00005e1640
--- trap 0x9, rip = 0xffffffff80613313, rsp = 0xfffffe00005e1710, rbp = 0xfffffe00005e1740 ---                                         
_eventhandler_deregister() at _eventhandler_deregister+0x133/frame 0xfffffe00005e1740                                                  
iflib_deregister() at iflib_deregister+0x44/frame 0xfffffe00005e1760
iflib_device_deregister() at iflib_device_deregister+0x347/frame 0xfffffe00005e17b0                                                    
device_detach() at device_detach+0x185/frame 0xfffffe00005e17f0                                                                        
devclass_driver_deleted() at devclass_driver_deleted+0x4f/frame 0xfffffe00005e1830                                                     
devclass_delete_driver() at devclass_delete_driver+0x9d/frame 0xfffffe00005e1870                                                       
driver_module_handler() at driver_module_handler+0x10f/frame 0xfffffe00005e18c0                                                        
module_unload() at module_unload+0x32/frame 0xfffffe00005e18e0                                                                         
linker_file_unload() at linker_file_unload+0x21b/frame 0xfffffe00005e1940                                                              
kern_kldunload() at kern_kldunload+0x10d/frame 0xfffffe00005e1980                                                                      
amd64_syscall() at amd64_syscall+0x276/frame 0xfffffe00005e1ab0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe00005e1ab0                                                            
--- syscall (444, FreeBSD ELF64, sys_kldunloadf), rip = 0x8002db98a, rsp = 0x7fffffffe198, rbp = 0x7fffffffe9f0 ---                    
KDB: enter: panic

#9  0xffffffff805cf4ca in vpanic (fmt=<value optimized out>, ap=<value optimized out>)
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/kern/kern_shutdown.c:866
#10 0xffffffff805cf273 in panic (fmt=<value optimized out>) at /usr/local/share/deploy-tools/RELENG_12/src/sys/kern/kern_shutdown.c:804
#11 0xffffffff8093a0bc in trap_fatal (frame=<value optimized out>, eva=<value optimized out>)
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/amd64/amd64/trap.c:943
#12 0xffffffff809394bc in trap (frame=0xfffffe00005e1650) at RELENG_12/src/sys/amd64/include/counter.h:87
#13 0xffffffff80911c2c in calltrap () at /usr/local/share/deploy-tools/RELENG_12/src/sys/amd64/amd64/exception.S:289
#14 0xffffffff80613313 in _eventhandler_deregister (list=0xfffff8000295eb80, tag=0xfffff80002517600)
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/kern/subr_eventhandler.c:198
#15 0xffffffff806fc524 in iflib_deregister (ctx=0xfffff800023e3800) at /usr/local/share/deploy-tools/RELENG_12/src/sys/net/iflib.c:5331
#16 0xffffffff806fd427 in iflib_device_deregister (ctx=<value optimized out>)
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/net/iflib.c:5069
#17 0xffffffff80605ac5 in device_detach (dev=0xfffff80002951300) at device_if.h:234
#18 0xffffffff8060502f in devclass_driver_deleted (busclass=0xfffff800023eca80, dc=0xfffff80002884000, driver=0xffffffff81f58418)
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/kern/subr_bus.c:1227
#19 0xffffffff80604f3d in devclass_delete_driver (busclass=0xfffff800023eca80, driver=0xffffffff81f58418)
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/kern/subr_bus.c:1302
#20 0xffffffff8060addf in driver_module_handler (mod=0xfffff800024dd900, what=1, arg=0xffffffff81f583e8)
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/kern/subr_bus.c:5172
#21 0xffffffff805b3ee2 in module_unload (mod=0xfffff800024dd900)
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/kern/kern_module.c:261
#22 0xffffffff805a65bb in linker_file_unload (file=0xfffff8000380b400, flags=-1)
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/kern/kern_linker.c:697
#23 0xffffffff805a790d in kern_kldunload (td=<value optimized out>, fileid=13, flags=0)
    at /usr/local/share/deploy-tools/RELENG_12/src/sys/kern/kern_linker.c:1132
#24 0xffffffff8093abe6 in amd64_syscall (td=0xfffff80003cc7000, traced=0)
    at RELENG_12/src/sys/amd64/amd64/../../kern/subr_syscall.c:135
#25 0xffffffff80912550 in fast_syscall_common () at /usr/local/share/deploy-tools/RELENG_12/src/sys/amd64/amd64/exception.S:581
#26 0x00000008002db98a in ?? ()
Previous frame inner to this frame (corrupt stack?)

In one of my setups, I depend on unloading if_igb(4) at runtime, so I hope this can be fixed without too much hassle.  As mentioned, at least in my case this isn't merely cosmetic.

Thanks,

-Harry
Comment 1 Harald Schmalzbauer 2019-09-24 17:30:48 UTC
Is https://reviews.freebsd.org/D21711 supposed to address this issue? Guess so, will test.
Comment 2 Eric Joyner freebsd_committer 2019-09-24 18:20:12 UTC
Yeah, it looks like it.

Sorry, the original review didn't have the PR number, but it did fix a problem we were encountering internally at Intel.
Comment 3 Harald Schmalzbauer 2019-09-24 18:30:50 UTC
(In reply to Eric Joyner from comment #2)

Thanks. Meanwhile, I can confirm that https://svnweb.freebsd.org/changeset/base/352655 fixes the issue for me on 12.1-BETA1.

I'd like to take the chance to point to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240609
This seems to be a major issue, although it does not affect non-debug kernels, so I did not add bug 240700 as a blocker.  Should I?

Will you take over this report and close it after the MFC and MFS?

Thanks,

-harry
Comment 4 Eric Joyner freebsd_committer 2019-09-24 18:39:23 UTC
Yeah. I'll MFC the fix after a bit and then close it.
Comment 5 Eric Joyner freebsd_committer 2019-09-30 18:24:36 UTC
This fix has been merged into stable/12 (r352910) and releng/12.1 (r352912).