Bug 237655 - Non-deterministic panic when running pf tests in interface ioctl code (NULL passed to strncmp)
Status: Closed Unable to Reproduce
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-net (Nobody)
Keywords: needs-qa
Reported: 2019-04-29 19:42 UTC by Enji Cooper
Modified: 2019-08-14 22:56 UTC

Description Enji Cooper freebsd_committer freebsd_triage 2019-04-29 19:42:25 UTC
The last few test runs have been failing as follows, with panics in strncmp() while managing interfaces (epair or tun, I assume) via the ioctl handler.

The pf code also prints some questionable LoR (lock order reversal) messages about lock holding, but these are likely not the root cause.

From https://ci.freebsd.org/job/FreeBSD-head-amd64-test/11014/consoleText :

sys/netpfil/pf/pfsync:basic  ->  lock order reversal: (sleepable after non-sleepable)
 1st 0xfffff8003a43f070 pfsync (pfsync) @ /usr/src/sys/netpfil/pf/if_pfsync.c:1402
 2nd 0xffffffff820c48c0 in_multi_sx (in_multi_sx) @ /usr/src/sys/netinet/in_mcast.c:1251
stack backtrace:
#0 0xffffffff80c47773 at witness_debugger+0x73
#1 0xffffffff80c474bd at witness_checkorder+0xa7d
#2 0xffffffff80be7ed8 at _sx_xlock+0x68
#3 0xffffffff80d65271 at in_joingroup+0x31
#4 0xffffffff82839086 at pfsyncioctl+0x6e6
#5 0xffffffff80d60116 at in_control+0x376
#6 0xffffffff80ce168b at ifioctl+0x57b
#7 0xffffffff80c4c6ba at kern_ioctl+0x28a
#8 0xffffffff80c4c3bd at sys_ioctl+0x15d
#9 0xffffffff810b2e16 at amd64_syscall+0x276
#10 0xffffffff8108b5fd at fast_syscall_common+0x101
passed  [2.316s]
sys/netpfil/pf/pfsync:defer  ->  passed  [2.248s]
sys/netpfil/pf/rdr:basic  ->  Apr 29 18:28:59  kernel: nd6_dad_timer: cancel DAD on epair3a because of ND6_IFF_IFDISABLED.

passed  [4.153s]
sys/netpfil/pf/route_to:v4  ->  passed  [3.195s]
sys/netpfil/pf/route_to:v6  ->  Apr 29 18:29:07  kernel: nd6_dad_timer: called with non-tentative address fe80:7::39:3ff:fe4c:500a(epair4a)

Apr 29 18:29:07  kernel: nd6_dad_timer: called with non-tentative address fe80:3::35:96ff:fe61:640b(epair3b)

Apr 29 18:29:07  kernel: nd6_dad_timer: called with non-tentative address fe80:4::39:3ff:fe4c:500b(epair4b)

Apr 29 18:29:07  kernel: nd6_dad_timer: called with non-tentative address fe80:5::35:96ff:fe61:640a(epair3a)

passed  [3.181s]
sys/netpfil/pf/set_skip:set_skip_group  ->  passed  [0.097s]
sys/netpfil/pf/set_skip:set_skip_group_lo  ->  passed  [0.113s]
sys/netpfil/pf/set_tos:v4  ->  passed  [8.702s]
sys/netpfil/pf/synproxy:synproxy  ->  passed  [0.161s]


Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer	= 0x20:0xffffffff80ccb525
stack pointer	        = 0x28:0xfffffe0030ec9740
frame pointer	        = 0x28:0xfffffe0030ec9740
code segment		= base rx0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 69669 (ifconfig)
trap number		= 9
panic: general protection fault
cpuid = 0
time = 1556562559
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0030ec9450
vpanic() at vpanic+0x19d/frame 0xfffffe0030ec94a0
panic() at panic+0x43/frame 0xfffffe0030ec9500
trap_fatal() at trap_fatal+0x394/frame 0xfffffe0030ec9560
trap() at trap+0x6c/frame 0xfffffe0030ec9670
calltrap() at calltrap+0x8/frame 0xfffffe0030ec9670
--- trap 0x9, rip = 0xffffffff80ccb525, rsp = 0xfffffe0030ec9740, rbp = 0xfffffe0030ec9740 ---
strncmp() at strncmp+0x15/frame 0xfffffe0030ec9740
ifunit_ref() at ifunit_ref+0x51/frame 0xfffffe0030ec9780
ifioctl() at ifioctl+0x508/frame 0xfffffe0030ec9850
kern_ioctl() at kern_ioctl+0x28a/frame 0xfffffe0030ec98c0
sys_ioctl() at sys_ioctl+0x15d/frame 0xfffffe0030ec9990
amd64_syscall() at amd64_syscall+0x276/frame 0xfffffe0030ec9ab0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0030ec9ab0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x80048531a, rsp = 0x7fffffffe458, rbp = 0x7fffffffe4c0 ---
KDB: enter: panic
[ thread pid 69669 tid 100143 ]
Stopped at      kdb_enter+0x3b: movq    $0,kdb_why
db:0:kdb.enter.panic> show pcpu
cpuid        = 0
dynamic pcpu = 0xb84800
curthread    = 0xfffff80004f305a0: pid 69669 tid 100143 "ifconfig"
curpcb       = 0xfffffe0030ec9b80
fpcurthread  = 0xfffff80004f305a0: pid 69669 "ifconfig"
idlethread   = 0xfffff80003272000: tid 100003 "idle: cpu0"
curpmap      = 0xfffff8003e6fb130
tssp         = 0xffffffff821cd320
commontssp   = 0xffffffff821cd320
rsp0         = 0xfffffe0030ec9b80
gs32p        = 0xffffffff821d3f58
ldt          = 0xffffffff821d3f98
tss          = 0xffffffff821d3f88
tlb gen      = 455364
curvnet      = 0xfffff8000307aec0
spin locks held:
db:0:kdb.enter.panic> alltrace

Tracing command ifconfig pid 69669 tid 100143 td 0xfffff80004f305a0 (CPU 0)
kdb_enter() at kdb_enter+0x3b/frame 0xfffffe0030ec9450
vpanic() at vpanic+0x1ba/frame 0xfffffe0030ec94a0
panic() at panic+0x43/frame 0xfffffe0030ec9500
trap_fatal() at trap_fatal+0x394/frame 0xfffffe0030ec9560
trap() at trap+0x6c/frame 0xfffffe0030ec9670
calltrap() at calltrap+0x8/frame 0xfffffe0030ec9670
--- trap 0x9, rip = 0xffffffff80ccb525, rsp = 0xfffffe0030ec9740, rbp = 0xfffffe0030ec9740 ---
strncmp() at strncmp+0x15/frame 0xfffffe0030ec9740
ifunit_ref() at ifunit_ref+0x51/frame 0xfffffe0030ec9780
ifioctl() at ifioctl+0x508/frame 0xfffffe0030ec9850
kern_ioctl() at kern_ioctl+0x28a/frame 0xfffffe0030ec98c0
sys_ioctl() at sys_ioctl+0x15d/frame 0xfffffe0030ec9990
amd64_syscall() at amd64_syscall+0x276/frame 0xfffffe0030ec9ab0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0030ec9ab0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x80048531a, rsp = 0x7fffffffe458, rbp = 0x7fffffffe4c0 ---
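
For readers unfamiliar with the faulting path: the ifunit_ref() frame in the trace above looks an interface up by name, and the strncmp() frame is the per-entry name comparison. A minimal user-space sketch of that lookup pattern follows, assuming a plain singly linked list. struct fake_ifnet, fake_ifunit() and the dying field are hypothetical stand-ins, not the kernel's definitions, and the kernel's locking and reference counting are left out.

```c
#include <stddef.h>
#include <string.h>

#define IFNAMSIZ 16	/* assumed here; mirrors the usual <net/if.h> value */

/* Hypothetical stand-in for the kernel's struct ifnet. */
struct fake_ifnet {
	char if_xname[IFNAMSIZ];	/* interface name, e.g. "epair3a" */
	int dying;			/* stand-in for an "interface is going away" flag */
	struct fake_ifnet *next;	/* next entry on the interface list */
};

/*
 * Walk the list and return the first live entry whose name matches.
 * Every iteration dereferences ifp, so a stale or corrupted list entry
 * would fault inside the comparison rather than in the caller.
 */
static struct fake_ifnet *
fake_ifunit(struct fake_ifnet *head, const char *name)
{
	struct fake_ifnet *ifp;

	for (ifp = head; ifp != NULL; ifp = ifp->next) {
		if (strncmp(name, ifp->if_xname, IFNAMSIZ) == 0 && !ifp->dying)
			return (ifp);
	}
	return (NULL);
}
```

If an entry on such a list were freed or corrupted while the walk is in progress, the comparison would read through a bad pointer, which would be consistent with a fault inside strncmp(). That is a guess, not a confirmed root cause for this report.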
Comment 1 Kristof Provost freebsd_committer freebsd_triage 2019-06-04 11:17:20 UTC
I can't seem to reproduce this with current versions. It also doesn't appear to have affected any of the recent ci.freebsd.org test runs.

I've reassigned this to net@, because it's not a pf problem, merely exposed by the pf tests.
Comment 2 Li-Wen Hsu freebsd_committer freebsd_triage 2019-08-14 22:56:25 UTC
I also don't see this panic after the last update. Let's close this one and reopen it if we see the panic again in the CI system.