I see very rare kernel panics on my home router, which runs FreeBSD 11.3-STABLE/amd64 r356315: about once every several months. It panicked again today, and this time I got a nice crash dump.

The router uses a custom kernel with the following config file:

include GENERIC
ident GW
options DDB
options DDB_NUMSYM
options ALT_BREAK_TO_DEBUGGER
#EOF

The router processes several IPsec tunnels and some volume of fragmented ESP packets. It uses ipfw with the following rule:

reass ip from any to any in { recv ng0 or recv em0 or recv wlan* }

The kgdb session follows:

Unread portion of the kernel message buffer:
__curthread () at ./machine/pcpu.h:234
234 __asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) bt
#0 __curthread () at ./machine/pcpu.h:234
#1 doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:320
#2 0xffffffff80b2212d in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:388
#3 0xffffffff80b22578 in vpanic (fmt=<optimized out>, ap=0xfffffe022b5ed470) at /usr/src/sys/kern/kern_shutdown.c:784
#4 0xffffffff80b223b3 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:715
#5 0xffffffff80fb8d00 in trap_fatal (frame=0xfffffe022b5ed660, eva=952) at /usr/src/sys/amd64/amd64/trap.c:899
#6 0xffffffff80fb8d49 in trap_pfault (frame=0xfffffe022b5ed660, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:744
#7 0xffffffff80fb83dd in trap (frame=0xfffffe022b5ed660) at /usr/src/sys/amd64/amd64/trap.c:438
#8 <signal handler called>
#9 __mtx_lock_sleep (c=0xffffffff81e57188 <ipq+45624>, v=<optimized out>) at /usr/src/sys/kern/kern_mutex.c:563
#10 0xffffffff80ca1078 in ipreass_slowtimo () at /usr/src/sys/netinet/ip_reass.c:573
#11 0xffffffff80baa504 in pfslowtimo (arg=0xffffffff81e57188 <ipq+45624>) at /usr/src/sys/kern/uipc_domain.c:506
#12 0xffffffff80b3acbf in softclock_call_cc (c=0xffffffff81e46200 <pfslow_callout>, cc=0xffffffff81efe000 <cc_cpu>, direct=0) at /usr/src/sys/kern/kern_timeout.c:729
#13 0xffffffff80b3b1b9 in softclock (arg=0xffffffff81efe000 <cc_cpu>) at /usr/src/sys/kern/kern_timeout.c:867
#14 0xffffffff80ae7119 in intr_event_execute_handlers (p=<optimized out>, ie=0xfffff80005240200) at /usr/src/sys/kern/kern_intr.c:1346
#15 0xffffffff80ae7807 in ithread_execute_handlers (p=<optimized out>, ie=<optimized out>) at /usr/src/sys/kern/kern_intr.c:1359
#16 ithread_loop (arg=0xfffff80005226680) at /usr/src/sys/kern/kern_intr.c:1440
#17 0xffffffff80ae44c3 in fork_exit (callout=0xffffffff80ae7720 <ithread_loop>, arg=0xfffff80005226680, frame=0xfffffe022b5ed9c0) at /usr/src/sys/kern/kern_fork.c:1086
#18 <signal handler called>
(kgdb) frame 10
#10 0xffffffff80ca1078 in ipreass_slowtimo () at /usr/src/sys/netinet/ip_reass.c:573
573 IPQ_LOCK(i);
(kgdb) l
568 ipreass_slowtimo(void)
569 {
570     struct ipq *fp, *tmp;
571
572     for (int i = 0; i < IPREASS_NHASH; i++) {
573         IPQ_LOCK(i);
574         TAILQ_FOREACH_SAFE(fp, &V_ipq[i].head, ipq_list, tmp)
575             if (--fp->ipq_ttl == 0)
576                 ipq_timeout(&V_ipq[i], fp);
577         IPQ_UNLOCK(i);
(kgdb) p i
$1 = 814
(kgdb) frame 9
#9 __mtx_lock_sleep (c=0xffffffff81e57188 <ipq+45624>, v=<optimized out>) at /usr/src/sys/kern/kern_mutex.c:563
563 if (TD_IS_RUNNING(owner)) {
(kgdb) l
558     /*
559      * If the owner is running on another CPU, spin until the
560      * owner stops running or the state of the lock changes.
561      */
562     owner = lv_mtx_owner(v);
563     if (TD_IS_RUNNING(owner)) {
564         if (LOCK_LOG_TEST(&m->lock_object, 0))
565             CTR3(KTR_LOCK,
566                 "%s: spinning on %p held by %p",
567                 __func__, m, owner);
(kgdb) p ipq[45624]
$2 = {head = {tqh_first = 0xffffffff80b1d0c0 <_rm_wlock>, tqh_last = 0x2fb}, lock = {lock_object = {
      lo_name = 0x5001200125dd1 <error: Cannot access memory at address 0x5001200125dd1>,
      lo_flags = 2155235392, lo_data = 4294967295, lo_witness = 0x5eb}, mtx_lock = 1407452194168295},
  count = -2139146832}
(kgdb) p owner
$3 = (struct thread *) 0x0
(kgdb) p *m
$4 = {lock_object = {lo_name = 0xffffffff81567e78 "IP reassembly", lo_flags = 21168128, lo_data = 0,
    lo_witness = 0x0}, mtx_lock = 2}
(kgdb) p v
$5 = <optimized out>
(kgdb) quit
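A note on reading the `<ipq+45624>` annotation in the session above: kgdb prints a symbol plus a byte offset, so 45624 is a byte offset into the ipq array rather than a bucket index. That would explain why `p ipq[45624]` shows garbage while `p *m` shows a valid "IP reassembly" mutex. The sketch below checks the arithmetic with mock structures that copy only the field names visible in the kgdb output; the field types, and therefore the sizes, are assumptions for an LP64 platform such as amd64 and are not taken from the kernel headers. The helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Mock layouts: field names come from the kgdb output, types are assumed. */
struct mock_lock_object {
	const char	*lo_name;
	unsigned int	 lo_flags;
	unsigned int	 lo_data;
	void		*lo_witness;
};

struct mock_mtx {
	struct mock_lock_object	lock_object;
	volatile uintptr_t	mtx_lock;
};

struct mock_ipqbucket {
	struct {
		void	 *tqh_first;
		void	**tqh_last;
	} head;				/* stand-in for the TAILQ head */
	struct mock_mtx	lock;
	int		count;
};

/* Bucket index that a byte offset from the start of the array falls into. */
static size_t
bucket_index(size_t byte_offset)
{
	return (byte_offset / sizeof(struct mock_ipqbucket));
}

/* Remaining byte offset inside that bucket. */
static size_t
offset_in_bucket(size_t byte_offset)
{
	return (byte_offset % sizeof(struct mock_ipqbucket));
}

/* Offset of the mtx_lock word from the start of a bucket. */
static size_t
mtx_lock_offset(void)
{
	return (offsetof(struct mock_ipqbucket, lock) +
	    offsetof(struct mock_mtx, mtx_lock));
}
```

Under these assumptions, `bucket_index(45624)` evaluates to 814, which matches `p i`, and `offset_in_bucket(45624)` equals `mtx_lock_offset()`. If the real layout matches the mock one, `c` points at `ipq[814].lock.mtx_lock`, and `p ipq[814]` would be the bucket to inspect.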
The CPU is a two-core Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz, if it matters.
Also, this system does not clear RAM between reboots, so the dmesg buffer survives a panic and is saved to the log after reboot:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address = 0x3b8
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80b024ad
stack pointer = 0x28:0xfffffe022b5ed720
frame pointer = 0x28:0xfffffe022b5ed7a0
code segment = base rx0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 12 (swi4: clock (0))
trap number = 12
panic: page fault
cpuid = 1
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff803b4bbb = db_trace_self_wrapper+0x2b/frame 0xfffffe022b5ed3d0
vpanic() at 0xffffffff80b2253e = vpanic+0x17e/frame 0xfffffe022b5ed430
panic() at 0xffffffff80b223b3 = panic+0x43/frame 0xfffffe022b5ed490
trap_pfault() at 0xffffffff80fb8d00 = trap_pfault/frame 0xfffffe022b5ed4e0
trap_pfault() at 0xffffffff80fb8d49 = trap_pfault+0x49/frame 0xfffffe022b5ed540
trap() at 0xffffffff80fb83dd = trap+0x29d/frame 0xfffffe022b5ed650
calltrap() at 0xffffffff80f979e3 = calltrap+0x8/frame 0xfffffe022b5ed650
--- trap 0xc, rip = 0xffffffff80b024ad, rsp = 0xfffffe022b5ed720, rbp = 0xfffffe022b5ed7a0 ---
__mtx_lock_sleep() at 0xffffffff80b024ad = __mtx_lock_sleep+0xbd/frame 0xfffffe022b5ed7a0
ipreass_slowtimo() at 0xffffffff80ca1078 = ipreass_slowtimo+0x28/frame 0xfffffe022b5ed7e0
pfslowtimo() at 0xffffffff80baa504 = pfslowtimo+0x54/frame 0xfffffe022b5ed810
softclock_call_cc() at 0xffffffff80b3acbf = softclock_call_cc+0x14f/frame 0xfffffe022b5ed8c0
softclock() at 0xffffffff80b3b1b9 = softclock+0x79/frame 0xfffffe022b5ed8e0
intr_event_execute_handlers() at 0xffffffff80ae7119 = intr_event_execute_handlers+0xe9/frame 0xfffffe022b5ed920
ithread_loop() at 0xffffffff80ae7807 = ithread_loop+0xe7/frame 0xfffffe022b5ed970
fork_exit() at 0xffffffff80ae44c3 = fork_exit+0x83/frame 0xfffffe022b5ed9b0
fork_trampoline() at 0xffffffff80f989fe = fork_trampoline+0xe/frame 0xfffffe022b5ed9b0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
Uptime: 24d16h23m51s
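The fault address in this trap message can be related to the kgdb session: frame 9 stops at `TD_IS_RUNNING(owner)` with `owner` equal to 0x0, and the trap reports fault virtual address 0x3b8, which is the `eva=952` seen in `trap_fatal`. A read through a NULL struct pointer faults at the offset of the field being read. The sketch below illustrates that with a hypothetical `mock_thread`; the field name `td_state` and the padding that places it at 0x3b8 are assumptions chosen to match the reported address, not values taken from the 11.3 headers.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct thread. Only the offset of the field
 * matters here; the padding size is an assumption picked so that the
 * field lands at the reported fault address.
 */
struct mock_thread {
	char	pad[0x3b8];
	int	td_state;
};

/* Address the CPU would touch when reading td_state through `owner`. */
static uintptr_t
field_address(uintptr_t owner)
{
	return (owner + offsetof(struct mock_thread, td_state));
}
```

With `owner` equal to 0, `field_address(0)` is 0x3b8 (952 decimal), so the page fault is consistent with the mutex code dereferencing a NULL owner rather than with a bad mutex address.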
The lock name should always be valid for the mutex. Otherwise, this may be a memory corruption issue.
Do you know if the router was about to reboot when this happened?
(In reply to Hans Petter Selasky from comment #4) Sorry for the late reply, I missed your question. AFAIR, the router was not rebooting. However, it has a USB-based Wi-Fi card (run0) in station mode with an unstable connection, so the corresponding run0/wlanX kernel objects could be destroyed and re-created at any moment.
Believed to be fixed by the MFC r359746 to stable/11, as the problem has not recurred since then. The hardware now runs stable/12, which has not reproduced the problem either.