Bug 239982

Summary: IPv6 network stack panics since upgrading to 11.3
Product: Base System
Reporter: j.david.lists
Component: kern
Assignee: freebsd-net (Nobody) <net>
Status: Closed FIXED
Severity: Affects Only Me
CC: kp
Priority: ---
Keywords: crash, needs-qa, regression
Version: 11.3-RELEASE
Hardware: amd64
OS: Any

Description j.david.lists 2019-08-20 02:49:54 UTC
Since upgrading from 11.2 to 11.3 (currently 11.3-RELEASE-p2), we are seeing panics in in6_setscope() on a regular basis.

The machine in question is a PF firewall handling IPv4 & IPv6, and the panic appears to be triggered by an incoming IPv6 fragment.  It's entirely possible that the fragment has been maliciously crafted to produce this effect.

The stack trace (always the same) is:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address	= 0x5c
fault code		= supervisor read data, page not present
instruction pointer	= 0x20:0xffffffff80d6851c
stack pointer	        = 0x28:0xfffffe044ebf80a0
frame pointer	        = 0x28:0xfffffe044ebf80d0
code segment		= base rx0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 12 (irq265: ix0:q1)
trap number		= 12
panic: page fault
cpuid = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe044ebf7d50
vpanic() at vpanic+0x17e/frame 0xfffffe044ebf7db0
panic() at panic+0x43/frame 0xfffffe044ebf7e10
trap_fatal() at trap_fatal+0x369/frame 0xfffffe044ebf7e60
trap_pfault() at trap_pfault+0x49/frame 0xfffffe044ebf7ec0
trap() at trap+0x29d/frame 0xfffffe044ebf7fd0
calltrap() at calltrap+0x8/frame 0xfffffe044ebf7fd0
--- trap 0xc, rip = 0xffffffff80d6851c, rsp = 0xfffffe044ebf80a0, rbp = 0xfffffe044ebf80d0 ---
in6_setscope() at in6_setscope+0x9c/frame 0xfffffe044ebf80d0
ip6_forward() at ip6_forward+0x30b/frame 0xfffffe044ebf8220
pf_refragment6() at pf_refragment6+0x177/frame 0xfffffe044ebf82e0
pf_test6() at pf_test6+0xca2/frame 0xfffffe044ebf8470
pf_check6_out() at pf_check6_out+0x1d/frame 0xfffffe044ebf8490
pfil_run_hooks() at pfil_run_hooks+0x87/frame 0xfffffe044ebf8520
ip6_forward() at ip6_forward+0x405/frame 0xfffffe044ebf8670
ip6_input() at ip6_input+0xc69/frame 0xfffffe044ebf8760
netisr_dispatch_src() at netisr_dispatch_src+0xa2/frame 0xfffffe044ebf87b0
ether_demux() at ether_demux+0x129/frame 0xfffffe044ebf87e0
ether_nh_input() at ether_nh_input+0x337/frame 0xfffffe044ebf8840
netisr_dispatch_src() at netisr_dispatch_src+0xa2/frame 0xfffffe044ebf8890
ether_input() at ether_input+0x26/frame 0xfffffe044ebf88b0
ixgbe_rxeof() at ixgbe_rxeof+0x830/frame 0xfffffe044ebf8980
ixgbe_msix_que() at ixgbe_msix_que+0x99/frame 0xfffffe044ebf89e0
intr_event_execute_handlers() at intr_event_execute_handlers+0xe9/frame 0xfffffe044ebf8a20
ithread_loop() at ithread_loop+0xe7/frame 0xfffffe044ebf8a70
fork_exit() at fork_exit+0x83/frame 0xfffffe044ebf8ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe044ebf8ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
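
For anyone comparing several panics of this kind, the faulting frame is the first frame listed after the non-zero "--- trap" marker (here, in6_setscope+0x9c). The following is a minimal Python sketch for pulling that frame out of a saved backtrace. The helper name `faulting_frame` and the regular expressions are illustrative assumptions based only on the line format shown in this trace, not part of any FreeBSD tool.

```python
import re

# Frame lines in the trace above look like:
#   in6_setscope() at in6_setscope+0x9c/frame 0xfffffe044ebf80d0
# (pattern is an assumption derived from this report's trace format)
FRAME_RE = re.compile(r"^(\w+)\(\) at \1\+(0x[0-9a-f]+)/frame (0x[0-9a-f]+)$")

# Trap markers look like:
#   --- trap 0xc, rip = 0xffffffff80d6851c, rsp = ..., rbp = ... ---
TRAP_RE = re.compile(r"^--- trap (0x[0-9a-f]+|\d+), rip = (0x[0-9a-f]+|0)")


def faulting_frame(lines):
    """Return (function, offset) for the first frame after a non-zero trap marker.

    Returns None if no such frame is found.
    """
    after_trap = False
    for raw in lines:
        line = raw.strip()
        trap = TRAP_RE.match(line)
        if trap:
            # "trap 0" marks the bottom of the stack, not a fault.
            after_trap = int(trap.group(1), 0) != 0
            continue
        if after_trap:
            frame = FRAME_RE.match(line)
            if frame:
                return frame.group(1), int(frame.group(2), 16)
    return None


# Lines copied from the trace in this report.
sample = [
    "calltrap() at calltrap+0x8/frame 0xfffffe044ebf7fd0",
    "--- trap 0xc, rip = 0xffffffff80d6851c, rsp = 0xfffffe044ebf80a0, rbp = 0xfffffe044ebf80d0 ---",
    "in6_setscope() at in6_setscope+0x9c/frame 0xfffffe044ebf80d0",
    "ip6_forward() at ip6_forward+0x30b/frame 0xfffffe044ebf8220",
    "--- trap 0, rip = 0, rsp = 0, rbp = 0 ---",
]

result = faulting_frame(sample)
print(result)
```

Run against the trace in this report, the helper picks out in6_setscope at offset 0x9c, which makes it easy to check that repeated panics really are "always the same".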

This machine is in a CARP setup with an 11.2 machine, so we can confirm the problem appears specific to 11.3. The 11.3 machine is the primary master, so the cycle is: it crashes, the 11.2 machine takes over, the 11.3 machine boots back up (which takes 6-8 minutes) and takes over again, then frequently crashes again within a minute or so.

Comment 1 Kristof Provost freebsd_committer freebsd_triage 2019-08-20 06:48:29 UTC
Please add your pf ruleset to the bug report.

Comment 2 j.david.lists 2019-08-21 14:57:54 UTC
Retesting with the FreeBSD-SA-19:22.mbuf fix released yesterday, which seems highly likely to be related.  Will update if the problem recurs.  (Shouldn't take long.)  Have set a note to close this issue in a week if the problem does not recur.

Comment 3 j.david.lists 2019-08-28 13:32:34 UTC
Closing this as the FreeBSD-SA-19:22.mbuf fix appears to have completely resolved the issue.  (No crashes in 7 days.)