Created attachment 149513 [details]
crashdump details

This panic occurred in 8.4-STABLE for me. I know stable/8 is approaching its EOL, but the code in question is the same in HEAD, so I believe the bug is not fixed there either. Sometimes I get similar kernel panics on my mpd5/PPPoE access server acting as a traffic shaper using dummynet with io_fast enabled. This time I got a good crashdump and spent some time reading the code, and I believe the culprit is the dummynet_send() function from ipfw/ip_dn_io.c. A kgdb backtrace and additional info are attached.

Here is part of the kernel log:

Nov 17 17:02:21 m-19-pc-2 kernel: dummynet: bad switch -256!
Nov 17 17:02:21 m-19-pc-2 kernel:
Nov 17 17:02:21 m-19-pc-2 kernel:
Nov 17 17:02:21 m-19-pc-2 kernel: Fatal trap 12: page fault while in kernel mode
Nov 17 17:02:21 m-19-pc-2 kernel: cpuid = 0; apic id = 00
Nov 17 17:02:21 m-19-pc-2 kernel: fault virtual address = 0x1
Nov 17 17:02:21 m-19-pc-2 kernel: fault code            = supervisor read instruction, page not present
Nov 17 17:02:21 m-19-pc-2 kernel: instruction pointer   = 0x20:0x1
Nov 17 17:02:21 m-19-pc-2 kernel: stack pointer         = 0x28:0xffffff8122b0ba20

As one can see from the dummynet_send() code, "bad switch" in the log means that tag = m_tag_first(m) was not NULL at the moment of the check. However, kgdb shows (see attachment) that it was NULL at the moment of the kernel panic. It seems to me that we have some kind of race here: the mbuf is processed and freed by another thread between these two moments, and UMA then panics due to a double-free attempt. I see no protection against this kind of race (a condensed sketch of the code path follows at the end of this comment).

The box has 4 CPU cores (hyperthreading disabled) and these tunables enabled:

net.isr.bindthreads=1
net.isr.maxthreads=4
net.inet.ip.fastforwarding=1
net.inet.ip.dummynet.pipe_slot_limit=1000
net.inet.ip.dummynet.io_fast=1

The sysctls net.isr.direct and net.isr.direct_force are 1 by default.
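For context, here is a condensed sketch of the dummynet_send() loop as I read it. This is paraphrased from ip_dn_io.c, not verbatim: the case bodies are elided, and the comments mark where I believe the race window sits.

static void
dummynet_send(struct mbuf *m)
{
        struct mbuf *n;

        for (; m != NULL; m = n) {
                struct ifnet *ifp = NULL;
                struct m_tag *tag;
                int dst;

                n = m->m_nextpkt;
                m->m_nextpkt = NULL;

                /* Moment 1: the tag is checked. The "bad switch" message
                 * later in the log implies the tag was non-NULL here. */
                tag = m_tag_first(m);
                if (tag == NULL) {
                        dst = DIR_DROP;         /* "should not happen" */
                } else {
                        struct dn_pkt_tag *pkt = dn_tag_get(m);

                        /* If another thread frees the mbuf between moment 1
                         * and here, this reads freed memory, which would
                         * explain the nonsense value dn_dir == -256. */
                        dst = pkt->dn_dir;
                        ifp = pkt->ifp;         /* used by the elided cases */
                }

                switch (dst) {
                case DIR_OUT:
                        /* ... ip_output(m, ...) ... */
                        break;
                case DIR_IN:
                        /* ... netisr_dispatch(NETISR_IP, m) ... */
                        break;
                /* ... other cases elided ... */
                default:
                        printf("dummynet: bad switch %d!\n", dst);
                        /* Moment 2: freeing an mbuf that was already freed
                         * elsewhere. A double free here would trip UMA, and
                         * a stale pointer in the tag chain being called
                         * during the free would match the fault at
                         * instruction pointer 0x1. */
                        m_freem(m);
                        break;
                }
        }
}

Nothing in this path holds a reference to the mbuf or otherwise serializes it against the fast-path (io_fast) processing, which is why I suspect a race rather than simple memory corruption.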
The kernel.debug and vmcore files are available for download (189MB total):

http://www.grosbein.net/freebsd/crash/20141117/kernel.debug.xz
http://www.grosbein.net/freebsd/crash/20141117/vmcore.6.xz
Also, the box has sysctl kern.ipc.nmbclusters=400000, and about 17% of the clusters were in use just before the panic, as per the output of:

vmstat -z | awk 'BEGIN { FS=","; OFMT="%.0f"; } /mbuf_cluster:/ { print $3*100/$2 }'
*** This bug has been marked as a duplicate of bug 220078 ***