When multiple instances of pf are used on an "options VIMAGE" kernel, a kernel panic is quickly triggered. It happens when pf tries to remove a state from its state table.

Two indicative backtraces:

#9  0xc0a127a4 in panic (fmt=0xc404ec26 "Bad link elm %p next->prev != elm") at /usr/src/sys/kern/kern_shutdown.c:587
#10 0xc4027099 in pf_free_state (cur=Variable "cur" is not available. ) at /usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:1655
#11 0xc4027110 in pf_purge_expired_states (maxcheck=1, waslocked=0) at /usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:1727
#12 0xc40285d0 in pf_purge_thread (v=0xc35a85a0) at /usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:1370
#13 0xc09e5818 in fork_exit (callout=0xc4028460 <pf_purge_thread>, arg=0xc35a85a0, frame=0xcd69ed28) at /usr/src/sys/kern/kern_fork.c:1025
#14 0xc0d72574 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:275
(kgdb)

and

#8  0xc0d724fc in calltrap () at /usr/src/sys/i386/i386/exception.s:168
#9  0xc40e776a in pf_state_tree_id_RB_REMOVE_COLOR (head=0xc4081148, parent=0x0, elm=0xc4183198) at /usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:474
#10 0xc40e7aa0 in pf_state_tree_id_RB_REMOVE (head=0xc4081148, elm=0xc417ce58) at /usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:474
#11 0xc40ec83e in pf_unlink_state (cur=0xc417ce58) at /usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:1592
#12 0xc40ed15a in pf_purge_expired_states (maxcheck=429496730, waslocked=0) at /usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:1717
#13 0xc40ee5d0 in pf_purge_thread (v=0xc35a8a00) at /usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:1370
#14 0xc09e5818 in fork_exit (callout=0xc40ee460 <pf_purge_thread>, arg=0xc35a8a00, frame=0xcd6ddd28) at /usr/src/sys/kern/kern_fork.c:1025
#15 0xc0d72574 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:275
(kgdb)

Sometimes the state table corruption is observable using "vmstat -z":

lab# vmstat -z | grep pfstate
pfstatepl:      204, 10013,18446744073709551615,  1,  0,  0,  0
pfstatekeypl:   204,     0,18446744073709551615,  1,  0,  0,  0
pfstateitempl:  204,     0,18446744073709551615,  1,  0,  0,  0
pfstatescrub:    28,     0,  0,  0,  0,  0,  0
pfstatepl:      204, 10013,18446744073709551615,  1,  0,  0,  0
pfstatekeypl:   204,     0,18446744073709551615,  1,  0,  0,  0
pfstateitempl:  204,     0,18446744073709551615,  1,  0,  0,  0
pfstatescrub:    28,     0,  0,  0,  0,  0,  0
pfstatepl:      204, 10013,  4, 34,  4,  0,  0
pfstatekeypl:   204,     0,  4, 34,  4,  0,  0
pfstateitempl:  204,     0,  4, 34,  4,  0,  0
pfstatescrub:    28,     0,  0,  0,  0,  0,  0
pfstatepl:      204, 10013,  8, 30,  8,  0,  0
pfstatekeypl:   204,     0,  8, 30,  8,  0,  0
pfstateitempl:  204,     0,  8, 30,  8,  0,  0
pfstatescrub:    28,     0,  0,  0,  0,  0,  0
lab#

Fix:

pf_purge_expired_states() in /sys/contrib/pf/net/pf.c uses a static, non-virtualized variable, and that variable causes the problem. It should be virtualized in an "options VIMAGE" kernel. See the attached patch.

Patch attached with submission follows:

How-To-Repeat:

1. Build a kernel with the VIMAGE option.
2. Create a few vnet jails.
3. kldload pf.
4. Enable pf on all jails.
5. Create a very basic ruleset like "pass out all\npass in all" on all jails.
6. Create some IP traffic that will create pf states.
7. The kernel will soon panic on a state expiration.
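
To illustrate the Fix note above, here is a minimal userland sketch of how a function-level static can break once several instances each have their own state list. It is not the pf.c code and not the attached patch. It assumes the static variable acts as a scan cursor that remembers where the previous purge pass stopped; the report does not name the variable, so that is an assumption. All names below (struct state, struct instance, scan_shared, scan_private) are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these names come from pf.c. */
struct state {
    struct state *next;
    int owner;                  /* id of the instance whose list holds this state */
};

struct instance {
    int id;
    struct state *head;         /* this instance's own state list */
    struct state *cursor;       /* per-instance scan position */
};

/* Problem shape: a single cursor shared by every instance. */
static struct state *shared_cursor;

/*
 * Visit up to maxcheck states on behalf of instance "in" using the shared
 * cursor. Returns how many visited states belong to a different instance.
 */
int scan_shared(struct instance *in, int maxcheck)
{
    int foreign = 0;

    while (maxcheck-- > 0) {
        if (shared_cursor == NULL) {
            shared_cursor = in->head;       /* wrap to the start of the list */
            if (shared_cursor == NULL)
                break;                      /* list is empty */
        }
        if (shared_cursor->owner != in->id)
            foreign++;                      /* touching another instance's state */
        shared_cursor = shared_cursor->next;
    }
    return foreign;
}

/* Same scan, but the cursor lives in the instance, so it never crosses lists. */
int scan_private(struct instance *in, int maxcheck)
{
    int foreign = 0;

    while (maxcheck-- > 0) {
        if (in->cursor == NULL) {
            in->cursor = in->head;
            if (in->cursor == NULL)
                break;
        }
        if (in->cursor->owner != in->id)
            foreign++;
        in->cursor = in->cursor->next;
    }
    return foreign;
}
```

In this model, a partial scan for one instance leaves the shared cursor inside that instance's list, so the next scan for a second instance starts on a state it does not own. In the kernel, unlinking such a state from the wrong list or tree would be consistent with the "Bad link elm" panic and the RB_REMOVE trap shown in the backtraces, though the sketch does not prove that this is the exact failure path.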
Responsible Changed From-To: freebsd-bugs->freebsd-pf Over to maintainer(s).
Responsible Changed From-To: freebsd-pf->freebsd-virtualization Move to virtualization where we aggregate VIMAGE PRs.
*** Bug 165252 has been marked as a duplicate of this bug. ***
*** Bug 182350 has been marked as a duplicate of this bug. ***
*** This bug has been marked as a duplicate of bug 194515 ***