I'm not 100% sure that these panics are from r338306. They are from an ftp mirror and pkg mirror, both induced from unbound doing its ipv6 udp thing. ----------- Fatal trap 12: page fault while in kernel mode cpuid = 4; apic id = 32 fault virtual address = 0x68 fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff80a5380b stack pointer = 0x28:0xfffffe0141068500 frame pointer = 0x28:0xfffffe0141068750 code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 38582 (unbound) [ thread pid 38582 tid 100859 ] Stopped at ip6_output+0x18db: movq ll+0x47(%rcx),%rcx db> bt Tracing pid 38582 tid 100859 td 0xfffff8010988b000 ip6_output() at ip6_output+0x18db/frame 0xfffffe0141068750 udp6_send() at udp6_send+0x7f8/frame 0xfffffe0141068910 sosend_dgram() at sosend_dgram+0x33b/frame 0xfffffe0141068980 kern_sendit() at kern_sendit+0x237/frame 0xfffffe0141068a20 sendit() at sendit+0x19e/frame 0xfffffe0141068a70 sys_sendto() at sys_sendto+0x4d/frame 0xfffffe0141068ac0 amd64_syscall() at amd64_syscall+0x369/frame 0xfffffe0141068bf0 fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0141068bf0 --- syscall (133, FreeBSD ELF64, sys_sendto), rip = 0x800836efa, rsp = 0x7fffdf9fa988, rbp = 0x7fffdf9fa9d0 --- -------- Fatal trap 12: page fault while in kernel mode cpuid = 2; apic id = 02 fault virtual address = 0x7c fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff80a4d344 stack pointer = 0x28:0xfffffe0094a36490 frame pointer = 0x28:0xfffffe0094a364d0 code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 1247 (unbound) [ thread pid 1247 tid 100907 ] Stopped at selectroute+0x194: movl ll+0x5b(%rdi),%eax db> db> bt Tracing pid 1247 tid 100907 td 0xfffff8000d131580 selectroute() at selectroute+0x194/frame 0xfffffe0094a364d0 in6_selectroute_fib() at in6_selectroute_fib+0xf/frame 0xfffffe0094a364f0 ip6_output() at ip6_output+0xfc4/frame 0xfffffe0094a36750 udp6_send() at udp6_send+0x7f8/frame 0xfffffe0094a36910 sosend_dgram() at sosend_dgram+0x33b/frame 0xfffffe0094a36980 kern_sendit() at kern_sendit+0x237/frame 0xfffffe0094a36a20 sendit() at sendit+0x19e/frame 0xfffffe0094a36a70 sys_sendto() at sys_sendto+0x4d/frame 0xfffffe0094a36ac0 amd64_syscall() at amd64_syscall+0x369/frame 0xfffffe0094a36bf0 fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0094a36bf0 --- syscall (133, FreeBSD ELF64, sys_sendto), rip = 0x800836efa, rsp = 0x7fffdfdfcce8, rbp = 0x7fffdfdfcd30 ---
This is from: FreeBSD pkg-mirror0.isc.freebsd.org 12.0-ALPHA3 FreeBSD 12.0-ALPHA3 #0 r338330: Sun Aug 26 19:06:44 UTC 2018 peter@build-12.freebsd.org:/usr/obj/usr/src/sys/CLUSTER12 amd64
I'll have a look.
*** Bug 231067 has been marked as a duplicate of this bug. ***
I am working on patches for this. Sorry life got in the way a bit. Also I had to dig quite a bit and found a few "cosmetical and functional" bits where people have simply ignored netinet6 again the last two years, so trying to close that gap along. I hope to have things for test to all of you within 24 hours.
In case any of you has a core file from a previous panic, could you please go and see if the so_pcb from the so passed to udp6_output() looks valid, and especially if you could tell me the %#x of inp_flags2 that would be appreciated. Also if you want to test this one, I'd be curious if it holds up. In an isolated lab setup, I have so far not been able to reproduce the race/panic myself. diff --git a/sys/netinet/in_pcb.c b/sys/netinet/in_pcb.c index 98838fa041a..f24ad42f9f4 100644 --- a/sys/netinet/in_pcb.c +++ b/sys/netinet/in_pcb.c @@ -1596,6 +1596,7 @@ in_pcbfree_deferred(epoch_context_t ctx) #endif if (inp->inp_options) (void)m_free(inp->inp_options); + RO_INVALIDATE_CACHE(&inp->inp_route); inp->inp_vflag = 0; crfree(inp->inp_cred); #ifdef MAC @@ -1644,7 +1645,6 @@ in_pcbfree(struct inpcb *inp) INP_LIST_WLOCK(pcbinfo); in_pcbremlists(inp); INP_LIST_WUNLOCK(pcbinfo); - RO_INVALIDATE_CACHE(&inp->inp_route); /* mark as destruction in progress */ inp->inp_flags2 |= INP_FREED; INP_WUNLOCK(inp);
(In reply to Bjoern A. Zeeb from comment #5) Ignore the diff; no need to try! This has other implications, with epoch(9).
Ok, 2nd try; I still cannot reproduce the ip6_output() panics. If you've got a few spare minutes try this one; it at least survives the tests I can do at the moment. diff --git a/sys/netinet/in_pcb.c b/sys/netinet/in_pcb.c index 98838fa041a..95ec83dd4d4 100644 --- a/sys/netinet/in_pcb.c +++ b/sys/netinet/in_pcb.c @@ -1596,6 +1596,9 @@ in_pcbfree_deferred(epoch_context_t ctx) #endif if (inp->inp_options) (void)m_free(inp->inp_options); + CURVNET_SET(inp->inp_vnet); + RO_INVALIDATE_CACHE(&inp->inp_route); + CURVNET_RESTORE(); inp->inp_vflag = 0; crfree(inp->inp_cred); #ifdef MAC @@ -1644,7 +1647,6 @@ in_pcbfree(struct inpcb *inp) INP_LIST_WLOCK(pcbinfo); in_pcbremlists(inp); INP_LIST_WUNLOCK(pcbinfo); - RO_INVALIDATE_CACHE(&inp->inp_route); /* mark as destruction in progress */ inp->inp_flags2 |= INP_FREED; INP_WUNLOCK(inp);
Doesn't seem to help. This is r338450M with the patch in Comment 7 applied. Kernel was activated at 01:58 boot time Tue Sep 4 08:46 boot time Tue Sep 4 06:08 boot time Tue Sep 4 05:21 boot time Tue Sep 4 03:04 boot time Tue Sep 4 01:58 kernel: Fatal trap 12: page fault while in kernel mode kernel: cpuid = 7; apic id = 07 kernel: fault virtual address = 0x98 kernel: fault code = supervisor read data, page not present kernel: instruction pointer = 0x20:0xffffffff80e26546 kernel: stack pointer = 0x28:0xfffffe00ce0b1270 kernel: frame pointer = 0x28:0xfffffe00ce0b14d0 kernel: code segment = base rx0, limit 0xfffff, type 0x1b kernel: = DPL 0, pres 1, long 1, def32 0, gran 1 kernel: processor eflags = interrupt enabled, resume, IOPL = 0 kernel: current process = 12303 (isc-worker0006) kernel: trap number = 12 kernel: panic: page fault kernel: cpuid = 7 kernel: time = 1536043483 kernel: KDB: stack backtrace: kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00ce0b0f20 kernel: vpanic() at vpanic+0x1a3/frame 0xfffffe00ce0b0f80 kernel: panic() at panic+0x43/frame 0xfffffe00ce0b0fe0 kernel: trap_fatal() at trap_fatal+0x35f/frame 0xfffffe00ce0b1030 kernel: trap_pfault() at trap_pfault+0x49/frame 0xfffffe00ce0b1090 kernel: trap() at trap+0x29e/frame 0xfffffe00ce0b11a0 kernel: calltrap() at calltrap+0x8/frame 0xfffffe00ce0b11a0 kernel: --- trap 0xc, rip = 0xffffffff80e26546, rsp = 0xfffffe00ce0b1270, rbp = 0xfffffe00ce0b14d0 --- kernel: ip6_output() at ip6_output+0xef6/frame 0xfffffe00ce0b14d0 kernel: udp6_send() at udp6_send+0x86a/frame 0xfffffe00ce0b1690 kernel: sosend_dgram() at sosend_dgram+0x33b/frame 0xfffffe00ce0b1700 kernel: sosend() at sosend+0x50/frame 0xfffffe00ce0b1730 kernel: kern_sendit() at kern_sendit+0x237/frame 0xfffffe00ce0b17d0 kernel: sendit() at sendit+0x19e/frame 0xfffffe00ce0b1820 kernel: sys_sendmsg() at sys_sendmsg+0x61/frame 0xfffffe00ce0b1880 kernel: amd64_syscall() at amd64_syscall+0x369/frame 0xfffffe00ce0b19b0 kernel: fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe00ce0b19b0 kernel: --- syscall (28, FreeBSD ELF64, sys_sendmsg), rip = 0x800bcff0a, rsp = 0x7fffdf3f5528, rbp = 0x7fffdf3f5560 --- kernel: Uptime: 2h36m24s (kgdb) #0 doadump (textdump=1) at pcpu.h:230 #1 0xffffffff80bbf480 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:446 #2 0xffffffff80bbf913 in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:872 #3 0xffffffff80bbf703 in panic (fmt=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:799 #4 0xffffffff810a465f in trap_fatal (frame=0xfffffe00ce0b11b0, eva=152) at /usr/src/sys/amd64/amd64/trap.c:928 #5 0xffffffff810a46b9 in trap_pfault (frame=0xfffffe00ce0b11b0, usermode=0) at pcpu.h:230 #6 0xffffffff810a3cde in trap (frame=0xfffffe00ce0b11b0) at /usr/src/sys/amd64/amd64/trap.c:441 #7 0xffffffff81082905 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:232 #8 0xffffffff80e26546 in ip6_output (m0=<value optimized out>, opt=0xfffffe00ce0b14f0, ro=0x0, flags=0, im6o=<value optimized out>, ifpp=0x0, inp=0xfffff8001c58d000) at /usr/src/sys/netinet6/ip6_output.c:538 #9 0xffffffff80e44dea in udp6_send (so=<value optimized out>, flags=475582464, m=0xfffff80bd422381c, addr=<value optimized out>, control=<value optimized out>, td=0xfffffe00ce0b15e0) at /usr/src/sys/netinet6/udp6_usrreq.c:975 #10 0xffffffff80c5a12b in sosend_dgram (so=0xfffff8001c585000, addr=<value optimized out>, uio=<value optimized out>, top=<value optimized out>, control=<value optimized out>, flags=0, td=0xfffff8013db1a000) at /usr/src/sys/kern/uipc_socket.c:1391 #11 0xffffffff80c5ac60 in sosend (so=<value optimized out>, addr=<value optimized out>, uio=<value optimized out>, top=<value optimized out>, control=0xfffff8013d12d900, flags=<value optimized out>, td=0xfffff8013db1a000) at /usr/src/sys/kern/uipc_socket.c:1623 #12 0xffffffff80c61977 in kern_sendit (td=0xfffff8013db1a000, s=547, mp=<value optimized out>, flags=0, control=0xfffff8013d12d900, segflg=UIO_USERSPACE) at /usr/src/sys/kern/uipc_syscalls.c:796 #13 0xffffffff80c61cee in sendit (td=0xfffff8013db1a000, s=547, mp=0xfffffe00ce0b1830, flags=0) at /usr/src/sys/kern/uipc_syscalls.c:721 #14 0xffffffff80c61dc1 in sys_sendmsg (td=0xfffff8013db1a000, uap=0xfffff8013db1a3c0) at /usr/src/sys/kern/uipc_syscalls.c:897 #15 0xffffffff810a5079 in amd64_syscall (td=0xfffff8013db1a000, traced=0) at subr_syscall.c:135 #16 0xffffffff810831ed in fast_syscall_common () at /usr/src/sys/amd64/amd64/exception.S:504 #17 0x0000000800bcff0a in ?? ()
(In reply to Florian Smeets from comment #8) Do you still have the core dump for this to poke around a bit for me?
(In reply to Bjoern A. Zeeb from comment #9) Everything you need should be at https://tb.smeets.xyz/~flo/pr230950/
Just for reference, this seems to be an epoch related race. The improved UDP6 locking seems to have made it more easy to trigger.
It seems the problem can be triggered by just putting a Windows 10 system into the same network. Several reboots by just having a Win10 started: Sep 14 20:19:16 Andromeda syslogd: kernel boot file is /boot/kernel/kernel Sep 14 20:42:58 Andromeda syslogd: kernel boot file is /boot/kernel/kernel Sep 14 20:52:42 Andromeda syslogd: kernel boot file is /boot/kernel/kernel Sep 14 20:57:34 Andromeda syslogd: kernel boot file is /boot/kernel/kernel Sep 14 21:32:35 Andromeda syslogd: kernel boot file is /boot/kernel/kernel Sep 14 21:52:48 Andromeda syslogd: kernel boot file is /boot/kernel/kernel Sep 14 22:42:35 Andromeda syslogd: kernel boot file is /boot/kernel/kernel
I've got 2 cores as well. Let me know if you'd like access to them.
I took a look at the kernel cores from Larry. The second is more interesting, as it shows two threads concurrently running in udp6_output(): (kgdb) tid 100671 (kgdb) bt #0 __curthread () at ./machine/pcpu.h:230 #1 doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:366 #2 0xffffffff80443110 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:446 #3 0xffffffff80443563 in vpanic (fmt=<optimized out>, ap=0xfffffe00c9d86200) at /usr/src/sys/kern/kern_shutdown.c:872 #4 0xffffffff804432c3 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:799 #5 0xffffffff8072159f in trap_fatal (frame=0xfffffe00c9d863f0, eva=152) at /usr/src/sys/amd64/amd64/trap.c:928 #6 0xffffffff807215f9 in trap_pfault (frame=0xfffffe00c9d863f0, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:760 #7 0xffffffff80720c3a in trap (frame=0xfffffe00c9d863f0) at /usr/src/sys/amd64/amd64/trap.c:441 #8 <signal handler called> #9 0xffffffff80636b2b in ip6_output (m0=<optimized out>, opt=<optimized out>, ro=<optimized out>, flags=0, im6o=0x0, ifpp=0x0, inp=0xfffff800247ac3d0) at /usr/src/sys/netinet6/ip6_output.c:538 #10 0xffffffff80651730 in udp6_output (so=<optimized out>, flags_arg=<optimized out>, m=<optimized out>, addr6=<optimized out>, control=<optimized out>, td=0xffffffff80e067c0 <udbinfo>) at /usr/src/sys/netinet6/udp6_usrreq.c:975 #11 udp6_send (so=0xfffff800247ac3d0, flags=<optimized out>, m=0xfffff80127d09800, addr=<optimized out>, control=0x0, td=0xffffffff80e067c0 <udbinfo>) at /usr/src/sys/netinet6/udp6_usrreq.c:1319 #12 0xffffffff804d5376 in sosend_dgram (so=0xfffff800248116d0, addr=0xfffff80003f39300, uio=0xfffffe00c9d86950, top=0x0, control=<optimized out>, flags=0, td=0xfffff8021e161580) at /usr/src/sys/kern/uipc_socket.c:1391 #13 0xffffffff804dbee1 in kern_sendit (td=0xfffff8021e161580, s=<optimized out>, mp=0xfffffe00c9d86a30, flags=0, control=<optimized out>, segflg=UIO_USERSPACE) at /usr/src/sys/kern/uipc_syscalls.c:796 #14 0xffffffff804dc24e in sendit (td=0xfffff8021e161580, s=540, mp=0xfffffe00c9d86a30, flags=0) at /usr/src/sys/kern/uipc_syscalls.c:721 #15 0xffffffff804dc321 in sys_sendmsg (td=0xfffff8021e161580, uap=0xfffff8021e161940) at /usr/src/sys/kern/uipc_syscalls.c:897 #16 0xffffffff80721ee2 in syscallenter (td=0xfffff8021e161580) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:135 #17 amd64_syscall (td=0xfffff8021e161580, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1043 (kgdb) tid 100668 (kgdb) bt #0 cpustop_handler () at /usr/src/sys/x86/x86/mp_x86.c:1383 #1 0xffffffff80793394 in ipi_nmi_handler () at /usr/src/sys/x86/x86/mp_x86.c:1341 #2 0xffffffff807209c9 in trap (frame=0xfffffe00043a6f30) at /usr/src/sys/amd64/amd64/trap.c:206 #3 <signal handler called> #4 0xffffffff8048b415 in lock_delay (la=0xfffffe00c9d772a0) at /usr/src/sys/kern/subr_lock.c:133 #5 0xffffffff8043e438 in __rw_wlock_hard (c=0xfffff80016d33d10, v=<optimized out>, file=0xffffffff8080ae34 "/usr/src/sys/netinet6/in6.c", line=<optimized out>) at /usr/src/sys/kern/kern_rwlock.c:1009 #6 0xffffffff8043e211 in _rw_wlock_cookie (c=0xfffffe00c9d772a0, file=0xffffffff8080ae34 "/usr/src/sys/netinet6/in6.c", line=2328) at /usr/src/sys/kern/kern_rwlock.c:286 #7 0xffffffff80623f6d in in6_lltable_lookup (llt=<optimized out>, flags=8192, l3addr=<optimized out>) at /usr/src/sys/netinet6/in6.c:2328 #8 0xffffffff80641ad7 in lla_lookup (llt=0xfffffe00c9d772a0, flags=8192, l3addr=0x1c1c) at /usr/src/sys/net/if_llatbl.h:252 #9 nd6_lookup (flags=8192, ifp=<optimized out>, addr6=<optimized out>) at /usr/src/sys/netinet6/nd6.c:1211 #10 nd6_resolve (ifp=0xfffff800026c4000, is_gw=<optimized out>, m=0xfffff80127d0f600, sa_dst=<optimized out>, desten=0xfffffe00c9d77460 "\240t\327", <incomplete sequence \311>, pflags=0xfffffe00c9d77434, plle=0xfffffe00c9d77428) at /usr/src/sys/netinet6/nd6.c:2274 #11 0xffffffff8053fca5 in ether_resolve_addr (ifp=<optimized out>, m=<optimized out>, dst=0xfffff80003c16820, ro=<optimized out>, plle=0xfffff37c, phdr=<optimized out>, pflags=<optimized out>) at /usr/src/sys/net/if_ethersubr.c:239 #12 ether_output (ifp=<optimized out>, m=0xfffff80127d0f600, dst=0xfffff80003c16820, ro=<optimized out>) at /usr/src/sys/net/if_ethersubr.c:345 #13 0xffffffff806379b1 in ip6_output (m0=<optimized out>, opt=<optimized out>, ro=<optimized out>, flags=0, im6o=0x0, ifpp=0x0, inp=0xfffff800247ac3d0) at /usr/src/sys/netinet6/ip6_output.c:942 #14 0xffffffff80651730 in udp6_output (so=<optimized out>, flags_arg=<optimized out>, m=<optimized out>, addr6=<optimized out>, control=<optimized out>, td=0xffffffff80e067c0 <udbinfo>) at /usr/src/sys/netinet6/udp6_usrreq.c:975 #15 udp6_send (so=0xfffff800247ac3d0, flags=<optimized out>, m=0xfffff80127d0f600, addr=<optimized out>, control=0x0, td=0xffffffff80e067c0 <udbinfo>) at /usr/src/sys/netinet6/udp6_usrreq.c:1319 #16 0xffffffff804d5376 in sosend_dgram (so=0xfffff800248116d0, addr=0xfffff80127037480, uio=0xfffffe00c9d77950, top=0x0, control=<optimized out>, flags=0, td=0xfffff80127791580) at /usr/src/sys/kern/uipc_socket.c:1391 #17 0xffffffff804dbee1 in kern_sendit (td=0xfffff80127791580, s=<optimized out>, mp=0xfffffe00c9d77a30, flags=0, control=<optimized out>, segflg=UIO_USERSPACE) at /usr/src/sys/kern/uipc_syscalls.c:796 #18 0xffffffff804dc24e in sendit (td=0xfffff80127791580, s=540, mp=0xfffffe00c9d77a30, flags=0) at /usr/src/sys/kern/uipc_syscalls.c:721 #19 0xffffffff804dc321 in sys_sendmsg (td=0xfffff80127791580, uap=0xfffff80127791940) at /usr/src/sys/kern/uipc_syscalls.c:897 #20 0xffffffff80721ee2 in syscallenter (td=0xfffff80127791580) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:135 Thread 100671 crashed because ro->ro_lle changed to NULL after the check: (kgdb) frame 9 #9 0xffffffff80636b2b in ip6_output (m0=<optimized out>, opt=<optimized out>, ro=<optimized out>, flags=0, im6o=0x0, ifpp=0x0, inp=0xfffff800247ac3d0) at /usr/src/sys/netinet6/ip6_output.c:538 538 LLE_FREE(ro->ro_lle); /* zeros ro_lle */ (kgdb) p $27 = <optimized out> (kgdb) disas Dump of assembler code for function ip6_output: 0xffffffff80635c40 <+0>: push %rbp 0xffffffff80635c41 <+1>: mov %rsp,%rbp 0xffffffff80635c44 <+4>: push %r15 0xffffffff80635c46 <+6>: push %r14 0xffffffff80635c48 <+8>: push %r13 0xffffffff80635c4a <+10>: push %r12 0xffffffff80635c4c <+12>: push %rbx 0xffffffff80635c4d <+13>: sub $0x238,%rsp 0xffffffff80635c54 <+20>: mov %r9,-0x1f0(%rbp) 0xffffffff80635c5b <+27>: mov %r8,-0xe8(%rbp) 0xffffffff80635c62 <+34>: mov %ecx,%r12d 0xffffffff80635c65 <+37>: mov %rdx,-0x40(%rbp) <-- ro is saved at %rbp - 0x40 ... (kgdb) p (char *)$rbp - 0x40 $28 = 0xfffffe00c9d866d0 "X\305z$" (kgdb) p *(struct route **)0xfffffe00c9d866d0 $29 = (struct route *) 0xfffff800247ac558 (kgdb) p *$29 $30 = { ro_rt = 0xfffff800249e7000, ro_lle = 0x0, ro_prepend = 0x0, ro_plen = 0, ro_flags = 256, ro_mtu = 0, spare = 0, ro_dst = { sa_len = 28 '\034', sa_family = 28 '\034', sa_data = "\000\000\000\000\000\000& \000\000\r\340\377\377" } } Meanwhile, the other thread was using the same route and executed the following code in ether_output(): 300 } else if (!(m->m_flags & (M_BCAST | M_MCAST))) { 301 if ((ro->ro_flags & RT_LLE_CACHE) != 0) { 302 lle = ro->ro_lle; 303 if (lle != NULL && 304 (lle->la_flags & LLE_VALID) == 0) { 305 LLE_FREE(lle); 306 lle = NULL; /* redundant */ 307 ro->ro_lle = NULL; 308 } 309 if (lle == NULL) { 310 /* if we lookup, keep cache */ 311 addref = 1; I don't see any locking which prevents the double free.
Before r338257 it looks like we held the inp write lock across the ip6_output() call, which should in principle prevent the double free.
(In reply to Mark Johnston from comment #15) Look at the v4 version; we haven't done this there in over 7 years either: https://svnweb.freebsd.org/base/head/sys/netinet/udp_usrreq.c?r1=222487&r2=222488& https://svnweb.freebsd.org/base/head/sys/netinet/udp_usrreq.c?annotate=337018#l1540 And we are doing the same kind of invalidations in the legacy output path: https://svnweb.freebsd.org/base/head/sys/netinet/ip_output.c?annotate=328271&pathrev=328271#l285 Why are people stumbling over this on v6 more? I kept suspecting that it has to do with RAs, one of the reasons I could not replicate it in a "noise-free" testbed. The real problem is that if there are two consumers of the same lle and the lle gets freed, then the refcount on the lle isn't right, right? And given they are on the same inp, the problem is that we are freeing them off the inp while we are holding a rlock only; in turn this would mean we need wlocks for both ip_output and ip6_output calls all the time to prevent this? I think this is partly why the original route caching was passing and updated route all the way back up the stack to a save place where we could alter the inp and only worked on a local copy compared to doing the in-place changes? I am adding karels to Cc:
I will have some time to look at this later, but one quick comment: the LLE code was doing something tricky with "sharing" a reference to the LLE entry so it didn't have to lock when doing output. That was obviously making some assumptions that might not be true now. I'll look more later.
(In reply to Mark Johnston from comment #14) Yes, clearly the two threads were both manipulating the route cache in the same inpcb. I'm a little surprised to see two UDP6 sends on the same socket at the same time, but something is obviously doing that. Clearly we need a write lock if we're going to modify the cache. However, given this usage, it probably makes more sense not to cache; the destinations may well be different anyway. I'll think about whether it makes sense to cache for connected UDP sockets; they may be able to use a read lock if the cache is valid. In the meantime, I suggest the attached workaround to disable UDP6 route caching for now.
Created attachment 197157 [details] patch to disable UDP6 route caching temporarily
(In reply to Mike Karels from comment #19) Can you please also look at v4, as I pointed out the same should apply there as well.
Just adding Ryan's comment from https://lists.freebsd.org/pipermail/freebsd-net/2018-September/051563.html which ae@ has pointed out on current.
If it is epoch related race, then, may be, I have same problem with IPv4: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=231428
Also, given my previous suggested patch to try, I do have a core where the inp is marked INP_FREED. That means in_pcbfree() was called which in turn means that (likely) udp6_detach() was called. Hence the moving of RO_INVALIDATE_CACHE() to in_pcbfree_deferred(). Now in_pcbfree() does assert the INP_WLOCK_ASSERT() which still is an rw_lock and should not "race" with the output path even if we are holding an rlock only there. However I also don't see exactly how this should happen given we were in the middle of a kern_sendit() syscall. I guess solving one problem after the other might help. Can people try the suggested change to disable the route caching in the udp/udp6_output() functions by passing NULL instead of passing &inp->inp_route / &inp->inp_route6 in and report back (in a day unless it crashes earlier still)?
I have the "#if 0" patch from Mike in use. Still some panic: Fatal trap 12: page fault while in kernel mode cpuid = 5; apic id = 13 fault virtual address = 0x98 fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff8068cbf2 stack pointer = 0x28:0xfffffe0128caa510 frame pointer = 0x28:0xfffffe0128caa760 code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 1658 (isc-worker0003) trap number = 12 panic: page fault cpuid = 5 time = 1535835179 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0128caa1c0 vpanic() at vpanic+0x1a3/frame 0xfffffe0128caa220 panic() at panic+0x43/frame 0xfffffe0128caa280 trap_fatal() at trap_fatal+0x35f/frame 0xfffffe0128caa2d0 trap_pfault() at trap_pfault+0x49/frame 0xfffffe0128caa330 trap() at trap+0x2ba/frame 0xfffffe0128caa440 calltrap() at calltrap+0x8/frame 0xfffffe0128caa440 --- trap 0xc, rip = 0xffffffff8068cbf2, rsp = 0xfffffe0128caa510, rbp = 0xfffffe0128caa760 --- ip6_output() at ip6_output+0xf82/frame 0xfffffe0128caa760 udp6_send() at udp6_send+0x702/frame 0xfffffe0128caa920 sosend_dgram() at sosend_dgram+0x346/frame 0xfffffe0128caa980 kern_sendit() at kern_sendit+0x170/frame 0xfffffe0128caaa10 sendit() at sendit+0x19e/frame 0xfffffe0128caaa60 sys_sendmsg() at sys_sendmsg+0x61/frame 0xfffffe0128caaac0 amd64_syscall() at amd64_syscall+0x254/frame 0xfffffe0128caabf0 fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0128caabf0 --- syscall (28, FreeBSD ELF64, sys_sendmsg), rip = 0x8015adf0a, rsp = 0x7fffdf9f7218, rbp = 0x7fffdf9f7250 --- Uptime: 22h37m4s Dumping 13174 out of 61352 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91% warning: kld_current_sos: Can't read filename: Unknown error: -1 inferior.c:311: internal-error: struct inferior *find_inferior_pid(int): Assertion `pid != 0' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. Quit this debugging session? (y or n) [answered Y; input not from terminal] Can't get more out of gdk 8.11...
I'll post two patches to apply in an hour or two. One to fix one problem, and one to add assertion to catch the other problem in a more defined state.
(In reply to Bjoern A. Zeeb from comment #20) v4 is the same, so if people are getting crashes there, the same workaround should work. The eventual fix should be the same for both.
(In reply to Alexander Leidinger from comment #24) Hmm, not too much to go on here. If kgdb is not erroring out immediately, could you send the output of these (should work without the core file): x/i 0xffffffff8068cbf2 x/i p6_output+0xf82 Thanks.
(In reply to Mike Karels from comment #27) x/i 0xffffffff8068cbf2: ip6_output+3922: pop %rbp x/i ip6_output+0xf82: cmpl $0x1,0x98(%rdi) I wonder what the problem with gdb 8.11 is. Or maybe I have an issue with my cores. If I just say "kgdb /boot/kernel/kernel" and then inside gdb "core vmcore.2" it tells me no core file, it can't recognize the file format.
Created attachment 197208 [details] inp wlocking for route cache in udp6_output Please try this one; sorry it took a bit longer; I paniced my test system in other ways while trying to test this one.
(In reply to Bjoern A. Zeeb from comment #29) I applied the patch to my system that would panic every 1-2 hours without the patch. I will report back tomorrow.
https://reviews.freebsd.org/D17230 for everyone who wants to leave comments there.
(In reply to Bjoern A. Zeeb from comment #29) % uptime 6:18PM up 20:38, 4 users, load averages: 0.80, 0.39, 0.25 The patch seems to prevent/fix the panic.
A commit references this bug: Author: bz Date: Wed Sep 19 18:49:38 UTC 2018 New revision: 338804 URL: https://svnweb.freebsd.org/changeset/base/338804 Log: Update udp6_output() inp locking to avoid concurrency issues with route cache updates. Bring over locking changes applied to udp_output() for the route cache in r297225 and fixed in r306559 which achieve multiple things: (1) acquire an exclusive inp lock earlier depending on the expected conditions; we add a comment explaining this in udp6, (2) having acquired the exclusive lock earlier eliminates a slight possible chance for a race condition which was present in v4 for multiple years as well and is now gone, and (3) only pass the inp_route6 to ip6_output() if we are holding an exclusive inp lock, so that possible route cache updates in case of routing table generation number changes can happen safely. In addition this change (as the legacy IP counterpart) decomposes the tracking of inp and pcbinfo lock and adds extra assertions, that the two together are acquired correctly. PR: 230950 Reviewed by: karels, markj Approved by: re (gjb) Pointyhat to: bz (for completely missing this bit) Differential Revision: https://reviews.freebsd.org/D17230 Changes: head/sys/netinet6/udp6_usrreq.c