Bug 230950 - ip6_output/ro panics after r338306/UDP6 changes
Summary: ip6_output/ro panics after r338306/UDP6 changes
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: amd64 Any
: --- Affects Many People
Assignee: Bjoern A. Zeeb
URL:
Keywords: regression
: 231067 (view as bug list)
Depends on:
Blocks:
 
Reported: 2018-08-27 14:43 UTC by Sean Bruno
Modified: 2019-09-23 16:48 UTC (History)
14 users (show)

See Also:


Attachments
patch to disable UDP6 route caching temporarily (499 bytes, patch)
2018-09-17 02:02 UTC, Mike Karels
no flags Details | Diff
inp wlocking for route cache in udp6_output (3.80 KB, patch)
2018-09-18 16:36 UTC, Bjoern A. Zeeb
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Sean Bruno freebsd_committer freebsd_triage 2018-08-27 14:43:30 UTC
I'm not 100% sure that these panics are from r338306.  They are from an ftp mirror and pkg mirror, both induced from unbound doing its ipv6 udp thing.

-----------
Fatal trap 12: page fault while in kernel mode
cpuid = 4; apic id = 32
fault virtual address   = 0x68
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80a5380b
stack pointer           = 0x28:0xfffffe0141068500
frame pointer           = 0x28:0xfffffe0141068750
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 38582 (unbound)
[ thread pid 38582 tid 100859 ]
Stopped at      ip6_output+0x18db:      movq    ll+0x47(%rcx),%rcx
db> bt
Tracing pid 38582 tid 100859 td 0xfffff8010988b000
ip6_output() at ip6_output+0x18db/frame 0xfffffe0141068750
udp6_send() at udp6_send+0x7f8/frame 0xfffffe0141068910
sosend_dgram() at sosend_dgram+0x33b/frame 0xfffffe0141068980
kern_sendit() at kern_sendit+0x237/frame 0xfffffe0141068a20
sendit() at sendit+0x19e/frame 0xfffffe0141068a70
sys_sendto() at sys_sendto+0x4d/frame 0xfffffe0141068ac0
amd64_syscall() at amd64_syscall+0x369/frame 0xfffffe0141068bf0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0141068bf0
--- syscall (133, FreeBSD ELF64, sys_sendto), rip = 0x800836efa, rsp = 0x7fffdf9fa988, rbp = 0x7fffdf9fa9d0 ---

--------

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address   = 0x7c
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80a4d344
stack pointer           = 0x28:0xfffffe0094a36490
frame pointer           = 0x28:0xfffffe0094a364d0
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1247 (unbound)
[ thread pid 1247 tid 100907 ]
Stopped at      selectroute+0x194:      movl    ll+0x5b(%rdi),%eax
db> 
db> bt
Tracing pid 1247 tid 100907 td 0xfffff8000d131580
selectroute() at selectroute+0x194/frame 0xfffffe0094a364d0
in6_selectroute_fib() at in6_selectroute_fib+0xf/frame 0xfffffe0094a364f0
ip6_output() at ip6_output+0xfc4/frame 0xfffffe0094a36750
udp6_send() at udp6_send+0x7f8/frame 0xfffffe0094a36910
sosend_dgram() at sosend_dgram+0x33b/frame 0xfffffe0094a36980
kern_sendit() at kern_sendit+0x237/frame 0xfffffe0094a36a20
sendit() at sendit+0x19e/frame 0xfffffe0094a36a70
sys_sendto() at sys_sendto+0x4d/frame 0xfffffe0094a36ac0
amd64_syscall() at amd64_syscall+0x369/frame 0xfffffe0094a36bf0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0094a36bf0
--- syscall (133, FreeBSD ELF64, sys_sendto), rip = 0x800836efa, rsp = 0x7fffdfdfcce8, rbp = 0x7fffdfdfcd30 ---
Comment 1 Sean Bruno freebsd_committer freebsd_triage 2018-08-27 14:45:47 UTC
This is from:

FreeBSD pkg-mirror0.isc.freebsd.org 12.0-ALPHA3 FreeBSD 12.0-ALPHA3 #0 r338330: Sun Aug 26 19:06:44 UTC 2018     peter@build-12.freebsd.org:/usr/obj/usr/src/sys/CLUSTER12  amd64
Comment 2 Bjoern A. Zeeb freebsd_committer freebsd_triage 2018-08-27 18:03:39 UTC
I'll have a look.
Comment 3 Bjoern A. Zeeb freebsd_committer freebsd_triage 2018-09-02 16:43:02 UTC
*** Bug 231067 has been marked as a duplicate of this bug. ***
Comment 4 Bjoern A. Zeeb freebsd_committer freebsd_triage 2018-09-02 16:47:59 UTC
I am working on patches for this.
Sorry life got in the way a bit.

Also I had to dig quite a bit and found a few "cosmetical and functional" bits where people have simply ignored netinet6 again the last two years, so trying to close that gap along.

I hope to have things for test to all of you within 24 hours.
Comment 5 Bjoern A. Zeeb freebsd_committer freebsd_triage 2018-09-03 16:44:46 UTC
In case any of you has a core file from a previous panic, could you please go and see if the so_pcb from the so passed to udp6_output() looks valid, and especially if you could tell me the %#x of inp_flags2 that would be appreciated.


Also if you want to test this one, I'd be curious if it holds up.

In an isolated lab setup, I have so far not been able to reproduce the race/panic myself.

diff --git a/sys/netinet/in_pcb.c b/sys/netinet/in_pcb.c
index 98838fa041a..f24ad42f9f4 100644
--- a/sys/netinet/in_pcb.c
+++ b/sys/netinet/in_pcb.c
@@ -1596,6 +1596,7 @@ in_pcbfree_deferred(epoch_context_t ctx)
 #endif
        if (inp->inp_options)
                (void)m_free(inp->inp_options);
+       RO_INVALIDATE_CACHE(&inp->inp_route);
        inp->inp_vflag = 0;
        crfree(inp->inp_cred);
 #ifdef MAC
@@ -1644,7 +1645,6 @@ in_pcbfree(struct inpcb *inp)
        INP_LIST_WLOCK(pcbinfo);
        in_pcbremlists(inp);
        INP_LIST_WUNLOCK(pcbinfo);
-       RO_INVALIDATE_CACHE(&inp->inp_route);
        /* mark as destruction in progress */
        inp->inp_flags2 |= INP_FREED;
        INP_WUNLOCK(inp);
Comment 6 Bjoern A. Zeeb freebsd_committer freebsd_triage 2018-09-03 22:23:24 UTC
(In reply to Bjoern A. Zeeb from comment #5)

Ignore the diff;  no need to try!  This has other implications, with epoch(9).
Comment 7 Bjoern A. Zeeb freebsd_committer freebsd_triage 2018-09-03 23:32:19 UTC
Ok, 2nd try; I still cannot reproduce the ip6_output() panics.  If you've got a few spare minutes try this one; it at least survives the tests I can do at the moment.

diff --git a/sys/netinet/in_pcb.c b/sys/netinet/in_pcb.c
index 98838fa041a..95ec83dd4d4 100644
--- a/sys/netinet/in_pcb.c
+++ b/sys/netinet/in_pcb.c
@@ -1596,6 +1596,9 @@ in_pcbfree_deferred(epoch_context_t ctx)
 #endif
        if (inp->inp_options)
                (void)m_free(inp->inp_options);
+       CURVNET_SET(inp->inp_vnet);
+       RO_INVALIDATE_CACHE(&inp->inp_route);
+       CURVNET_RESTORE();
        inp->inp_vflag = 0;
        crfree(inp->inp_cred);
 #ifdef MAC
@@ -1644,7 +1647,6 @@ in_pcbfree(struct inpcb *inp)
        INP_LIST_WLOCK(pcbinfo);
        in_pcbremlists(inp);
        INP_LIST_WUNLOCK(pcbinfo);
-       RO_INVALIDATE_CACHE(&inp->inp_route);
        /* mark as destruction in progress */
        inp->inp_flags2 |= INP_FREED;
        INP_WUNLOCK(inp);
Comment 8 Florian Smeets freebsd_committer freebsd_triage 2018-09-04 07:40:57 UTC
Doesn't seem to help. This is r338450M with the patch in Comment 7 applied.

Kernel was activated at 01:58

boot time                                  Tue Sep  4 08:46
boot time                                  Tue Sep  4 06:08
boot time                                  Tue Sep  4 05:21
boot time                                  Tue Sep  4 03:04
boot time                                  Tue Sep  4 01:58

kernel: Fatal trap 12: page fault while in kernel mode
kernel: cpuid = 7; apic id = 07
kernel: fault virtual address       = 0x98
kernel: fault code          = supervisor read data, page not present
kernel: instruction pointer = 0x20:0xffffffff80e26546
kernel: stack pointer               = 0x28:0xfffffe00ce0b1270
kernel: frame pointer               = 0x28:0xfffffe00ce0b14d0
kernel: code segment                = base rx0, limit 0xfffff, type 0x1b
kernel:                     = DPL 0, pres 1, long 1, def32 0, gran 1
kernel: processor eflags    = interrupt enabled, resume, IOPL = 0
kernel: current process             = 12303 (isc-worker0006)
kernel: trap number         = 12
kernel: panic: page fault
kernel: cpuid = 7
kernel: time = 1536043483
kernel: KDB: stack backtrace:
kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00ce0b0f20
kernel: vpanic() at vpanic+0x1a3/frame 0xfffffe00ce0b0f80
kernel: panic() at panic+0x43/frame 0xfffffe00ce0b0fe0
kernel: trap_fatal() at trap_fatal+0x35f/frame 0xfffffe00ce0b1030
kernel: trap_pfault() at trap_pfault+0x49/frame 0xfffffe00ce0b1090
kernel: trap() at trap+0x29e/frame 0xfffffe00ce0b11a0
kernel: calltrap() at calltrap+0x8/frame 0xfffffe00ce0b11a0
kernel: --- trap 0xc, rip = 0xffffffff80e26546, rsp = 0xfffffe00ce0b1270, rbp = 0xfffffe00ce0b14d0 ---
kernel: ip6_output() at ip6_output+0xef6/frame 0xfffffe00ce0b14d0
kernel: udp6_send() at udp6_send+0x86a/frame 0xfffffe00ce0b1690
kernel: sosend_dgram() at sosend_dgram+0x33b/frame 0xfffffe00ce0b1700
kernel: sosend() at sosend+0x50/frame 0xfffffe00ce0b1730
kernel: kern_sendit() at kern_sendit+0x237/frame 0xfffffe00ce0b17d0
kernel: sendit() at sendit+0x19e/frame 0xfffffe00ce0b1820
kernel: sys_sendmsg() at sys_sendmsg+0x61/frame 0xfffffe00ce0b1880
kernel: amd64_syscall() at amd64_syscall+0x369/frame 0xfffffe00ce0b19b0
kernel: fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe00ce0b19b0
kernel: --- syscall (28, FreeBSD ELF64, sys_sendmsg), rip = 0x800bcff0a, rsp = 0x7fffdf3f5528, rbp = 0x7fffdf3f5560 ---
kernel: Uptime: 2h36m24s


(kgdb) #0  doadump (textdump=1) at pcpu.h:230
#1  0xffffffff80bbf480 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:446
#2  0xffffffff80bbf913 in vpanic (fmt=<value optimized out>,
    ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:872
#3  0xffffffff80bbf703 in panic (fmt=<value optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:799
#4  0xffffffff810a465f in trap_fatal (frame=0xfffffe00ce0b11b0, eva=152)
    at /usr/src/sys/amd64/amd64/trap.c:928
#5  0xffffffff810a46b9 in trap_pfault (frame=0xfffffe00ce0b11b0, usermode=0)
    at pcpu.h:230
#6  0xffffffff810a3cde in trap (frame=0xfffffe00ce0b11b0)
    at /usr/src/sys/amd64/amd64/trap.c:441
#7  0xffffffff81082905 in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:232
#8  0xffffffff80e26546 in ip6_output (m0=<value optimized out>,
    opt=0xfffffe00ce0b14f0, ro=0x0, flags=0, im6o=<value optimized out>,
    ifpp=0x0, inp=0xfffff8001c58d000) at /usr/src/sys/netinet6/ip6_output.c:538
#9  0xffffffff80e44dea in udp6_send (so=<value optimized out>,
    flags=475582464, m=0xfffff80bd422381c, addr=<value optimized out>,
    control=<value optimized out>, td=0xfffffe00ce0b15e0)
    at /usr/src/sys/netinet6/udp6_usrreq.c:975
#10 0xffffffff80c5a12b in sosend_dgram (so=0xfffff8001c585000,
    addr=<value optimized out>, uio=<value optimized out>,
    top=<value optimized out>, control=<value optimized out>, flags=0,
    td=0xfffff8013db1a000) at /usr/src/sys/kern/uipc_socket.c:1391
#11 0xffffffff80c5ac60 in sosend (so=<value optimized out>,
    addr=<value optimized out>, uio=<value optimized out>,
    top=<value optimized out>, control=0xfffff8013d12d900,
    flags=<value optimized out>, td=0xfffff8013db1a000)
    at /usr/src/sys/kern/uipc_socket.c:1623
#12 0xffffffff80c61977 in kern_sendit (td=0xfffff8013db1a000, s=547,
    mp=<value optimized out>, flags=0, control=0xfffff8013d12d900,
    segflg=UIO_USERSPACE) at /usr/src/sys/kern/uipc_syscalls.c:796
#13 0xffffffff80c61cee in sendit (td=0xfffff8013db1a000, s=547,
    mp=0xfffffe00ce0b1830, flags=0) at /usr/src/sys/kern/uipc_syscalls.c:721
#14 0xffffffff80c61dc1 in sys_sendmsg (td=0xfffff8013db1a000,
    uap=0xfffff8013db1a3c0) at /usr/src/sys/kern/uipc_syscalls.c:897
#15 0xffffffff810a5079 in amd64_syscall (td=0xfffff8013db1a000, traced=0)
    at subr_syscall.c:135
#16 0xffffffff810831ed in fast_syscall_common ()
    at /usr/src/sys/amd64/amd64/exception.S:504
#17 0x0000000800bcff0a in ?? ()
Comment 9 Bjoern A. Zeeb freebsd_committer freebsd_triage 2018-09-04 09:06:40 UTC
(In reply to Florian Smeets from comment #8)

Do you still have the core dump for this to poke around a bit for me?
Comment 10 Florian Smeets freebsd_committer freebsd_triage 2018-09-04 18:28:11 UTC
(In reply to Bjoern A. Zeeb from comment #9)

Everything you need should be at https://tb.smeets.xyz/~flo/pr230950/
Comment 11 Bjoern A. Zeeb freebsd_committer freebsd_triage 2018-09-09 21:03:58 UTC
Just for reference, this seems to be an epoch related race.
The improved UDP6 locking seems to have made it more easy to trigger.
Comment 12 Alexander Leidinger freebsd_committer freebsd_triage 2018-09-15 09:26:36 UTC
It seems the problem can be triggered by just putting a Windows 10 system into the same network. Several reboots by just having a Win10 started:
Sep 14 20:19:16 Andromeda syslogd: kernel boot file is /boot/kernel/kernel
Sep 14 20:42:58 Andromeda syslogd: kernel boot file is /boot/kernel/kernel
Sep 14 20:52:42 Andromeda syslogd: kernel boot file is /boot/kernel/kernel
Sep 14 20:57:34 Andromeda syslogd: kernel boot file is /boot/kernel/kernel
Sep 14 21:32:35 Andromeda syslogd: kernel boot file is /boot/kernel/kernel
Sep 14 21:52:48 Andromeda syslogd: kernel boot file is /boot/kernel/kernel
Sep 14 22:42:35 Andromeda syslogd: kernel boot file is /boot/kernel/kernel
Comment 13 Larry Rosenman freebsd_committer freebsd_triage 2018-09-16 00:28:15 UTC
I've got 2 cores as well.  Let me know if you'd like access to them.
Comment 14 Mark Johnston freebsd_committer freebsd_triage 2018-09-16 19:48:09 UTC
I took a look at the kernel cores from Larry.  The second is more interesting, as it shows two threads concurrently running in udp6_output():

(kgdb) tid 100671
(kgdb) bt
#0  __curthread () at ./machine/pcpu.h:230
#1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff80443110 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:446                                                                                   
#3  0xffffffff80443563 in vpanic (fmt=<optimized out>, ap=0xfffffe00c9d86200) at /usr/src/sys/kern/kern_shutdown.c:872                                                       
#4  0xffffffff804432c3 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:799                                                                                 
#5  0xffffffff8072159f in trap_fatal (frame=0xfffffe00c9d863f0, eva=152) at /usr/src/sys/amd64/amd64/trap.c:928                                                              
#6  0xffffffff807215f9 in trap_pfault (frame=0xfffffe00c9d863f0, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:760                                                          
#7  0xffffffff80720c3a in trap (frame=0xfffffe00c9d863f0) at /usr/src/sys/amd64/amd64/trap.c:441                                                                             
#8  <signal handler called>
#9  0xffffffff80636b2b in ip6_output (m0=<optimized out>, opt=<optimized out>, ro=<optimized out>, flags=0, im6o=0x0, ifpp=0x0, inp=0xfffff800247ac3d0)                      
    at /usr/src/sys/netinet6/ip6_output.c:538
#10 0xffffffff80651730 in udp6_output (so=<optimized out>, flags_arg=<optimized out>, m=<optimized out>, addr6=<optimized out>, control=<optimized out>,                     
    td=0xffffffff80e067c0 <udbinfo>) at /usr/src/sys/netinet6/udp6_usrreq.c:975
#11 udp6_send (so=0xfffff800247ac3d0, flags=<optimized out>, m=0xfffff80127d09800, addr=<optimized out>, control=0x0, td=0xffffffff80e067c0 <udbinfo>)                       
    at /usr/src/sys/netinet6/udp6_usrreq.c:1319
#12 0xffffffff804d5376 in sosend_dgram (so=0xfffff800248116d0, addr=0xfffff80003f39300, uio=0xfffffe00c9d86950, top=0x0, control=<optimized out>, flags=0,                   
    td=0xfffff8021e161580) at /usr/src/sys/kern/uipc_socket.c:1391
#13 0xffffffff804dbee1 in kern_sendit (td=0xfffff8021e161580, s=<optimized out>, mp=0xfffffe00c9d86a30, flags=0, control=<optimized out>, segflg=UIO_USERSPACE)              
    at /usr/src/sys/kern/uipc_syscalls.c:796
#14 0xffffffff804dc24e in sendit (td=0xfffff8021e161580, s=540, mp=0xfffffe00c9d86a30, flags=0) at /usr/src/sys/kern/uipc_syscalls.c:721                                     
#15 0xffffffff804dc321 in sys_sendmsg (td=0xfffff8021e161580, uap=0xfffff8021e161940) at /usr/src/sys/kern/uipc_syscalls.c:897                                               
#16 0xffffffff80721ee2 in syscallenter (td=0xfffff8021e161580) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:135                                                     
#17 amd64_syscall (td=0xfffff8021e161580, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1043                                                                                  

(kgdb) tid 100668
(kgdb) bt
#0  cpustop_handler () at /usr/src/sys/x86/x86/mp_x86.c:1383
#1  0xffffffff80793394 in ipi_nmi_handler () at /usr/src/sys/x86/x86/mp_x86.c:1341
#2  0xffffffff807209c9 in trap (frame=0xfffffe00043a6f30) at /usr/src/sys/amd64/amd64/trap.c:206
#3  <signal handler called>
#4  0xffffffff8048b415 in lock_delay (la=0xfffffe00c9d772a0) at /usr/src/sys/kern/subr_lock.c:133
#5  0xffffffff8043e438 in __rw_wlock_hard (c=0xfffff80016d33d10, v=<optimized out>, file=0xffffffff8080ae34 "/usr/src/sys/netinet6/in6.c", line=<optimized out>)
    at /usr/src/sys/kern/kern_rwlock.c:1009
#6  0xffffffff8043e211 in _rw_wlock_cookie (c=0xfffffe00c9d772a0, file=0xffffffff8080ae34 "/usr/src/sys/netinet6/in6.c", line=2328) at /usr/src/sys/kern/kern_rwlock.c:286
#7  0xffffffff80623f6d in in6_lltable_lookup (llt=<optimized out>, flags=8192, l3addr=<optimized out>) at /usr/src/sys/netinet6/in6.c:2328
#8  0xffffffff80641ad7 in lla_lookup (llt=0xfffffe00c9d772a0, flags=8192, l3addr=0x1c1c) at /usr/src/sys/net/if_llatbl.h:252
#9  nd6_lookup (flags=8192, ifp=<optimized out>, addr6=<optimized out>) at /usr/src/sys/netinet6/nd6.c:1211
#10 nd6_resolve (ifp=0xfffff800026c4000, is_gw=<optimized out>, m=0xfffff80127d0f600, sa_dst=<optimized out>, 
    desten=0xfffffe00c9d77460 "\240t\327", <incomplete sequence \311>, pflags=0xfffffe00c9d77434, plle=0xfffffe00c9d77428) at /usr/src/sys/netinet6/nd6.c:2274
#11 0xffffffff8053fca5 in ether_resolve_addr (ifp=<optimized out>, m=<optimized out>, dst=0xfffff80003c16820, ro=<optimized out>, plle=0xfffff37c, phdr=<optimized out>, 
    pflags=<optimized out>) at /usr/src/sys/net/if_ethersubr.c:239
#12 ether_output (ifp=<optimized out>, m=0xfffff80127d0f600, dst=0xfffff80003c16820, ro=<optimized out>) at /usr/src/sys/net/if_ethersubr.c:345
#13 0xffffffff806379b1 in ip6_output (m0=<optimized out>, opt=<optimized out>, ro=<optimized out>, flags=0, im6o=0x0, ifpp=0x0, inp=0xfffff800247ac3d0)
    at /usr/src/sys/netinet6/ip6_output.c:942
#14 0xffffffff80651730 in udp6_output (so=<optimized out>, flags_arg=<optimized out>, m=<optimized out>, addr6=<optimized out>, control=<optimized out>, 
    td=0xffffffff80e067c0 <udbinfo>) at /usr/src/sys/netinet6/udp6_usrreq.c:975
#15 udp6_send (so=0xfffff800247ac3d0, flags=<optimized out>, m=0xfffff80127d0f600, addr=<optimized out>, control=0x0, td=0xffffffff80e067c0 <udbinfo>)
    at /usr/src/sys/netinet6/udp6_usrreq.c:1319
#16 0xffffffff804d5376 in sosend_dgram (so=0xfffff800248116d0, addr=0xfffff80127037480, uio=0xfffffe00c9d77950, top=0x0, control=<optimized out>, flags=0, 
    td=0xfffff80127791580) at /usr/src/sys/kern/uipc_socket.c:1391
#17 0xffffffff804dbee1 in kern_sendit (td=0xfffff80127791580, s=<optimized out>, mp=0xfffffe00c9d77a30, flags=0, control=<optimized out>, segflg=UIO_USERSPACE)
    at /usr/src/sys/kern/uipc_syscalls.c:796
#18 0xffffffff804dc24e in sendit (td=0xfffff80127791580, s=540, mp=0xfffffe00c9d77a30, flags=0) at /usr/src/sys/kern/uipc_syscalls.c:721
#19 0xffffffff804dc321 in sys_sendmsg (td=0xfffff80127791580, uap=0xfffff80127791940) at /usr/src/sys/kern/uipc_syscalls.c:897
#20 0xffffffff80721ee2 in syscallenter (td=0xfffff80127791580) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:135

Thread 100671 crashed because ro->ro_lle changed to NULL after the check:

(kgdb) frame 9
#9  0xffffffff80636b2b in ip6_output (m0=<optimized out>, opt=<optimized out>, ro=<optimized out>, flags=0, im6o=0x0, ifpp=0x0, inp=0xfffff800247ac3d0)
    at /usr/src/sys/netinet6/ip6_output.c:538
538                             LLE_FREE(ro->ro_lle);   /* zeros ro_lle */
(kgdb) p
$27 = <optimized out>
(kgdb) disas                                                
Dump of assembler code for function ip6_output:                                                                                                        
   0xffffffff80635c40 <+0>:     push   %rbp                                                     
   0xffffffff80635c41 <+1>:     mov    %rsp,%rbp                                                                                                        
   0xffffffff80635c44 <+4>:     push   %r15                                                      
   0xffffffff80635c46 <+6>:     push   %r14                                                                                                                     
   0xffffffff80635c48 <+8>:     push   %r13    
   0xffffffff80635c4a <+10>:    push   %r12                                                                                                                               
   0xffffffff80635c4c <+12>:    push   %rbx                                                                                               
   0xffffffff80635c4d <+13>:    sub    $0x238,%rsp                                                                                                             
   0xffffffff80635c54 <+20>:    mov    %r9,-0x1f0(%rbp)                                                    
   0xffffffff80635c5b <+27>:    mov    %r8,-0xe8(%rbp)                                                                                  
   0xffffffff80635c62 <+34>:    mov    %ecx,%r12d                                                                                                             
   0xffffffff80635c65 <+37>:    mov    %rdx,-0x40(%rbp) <-- ro is saved at %rbp - 0x40
...
(kgdb) p (char *)$rbp - 0x40
$28 = 0xfffffe00c9d866d0 "X\305z$"
(kgdb) p *(struct route **)0xfffffe00c9d866d0
$29 = (struct route *) 0xfffff800247ac558
(kgdb) p *$29
$30 = {
  ro_rt = 0xfffff800249e7000,
  ro_lle = 0x0,
  ro_prepend = 0x0,
  ro_plen = 0,
  ro_flags = 256,
  ro_mtu = 0,
  spare = 0,
  ro_dst = {
    sa_len = 28 '\034',
    sa_family = 28 '\034',
    sa_data = "\000\000\000\000\000\000& \000\000\r\340\377\377"
  }
}

Meanwhile, the other thread was using the same route and
executed the following code in ether_output():

 300                 } else if (!(m->m_flags & (M_BCAST | M_MCAST))) {                                                                                                        
 301                         if ((ro->ro_flags & RT_LLE_CACHE) != 0) {                                                                                                        
 302                                 lle = ro->ro_lle;                                                                                                                        
 303                                 if (lle != NULL &&                                                                                                                       
 304                                     (lle->la_flags & LLE_VALID) == 0) {                                                                                                  
 305                                         LLE_FREE(lle);                                                                                                                   
 306                                         lle = NULL;     /* redundant */                                                                                                  
 307                                         ro->ro_lle = NULL;                                                                                                               
 308                                 }                                                                                                                                        
 309                                 if (lle == NULL) {                                                                                                                       
 310                                         /* if we lookup, keep cache */                                                                                                   
 311                                         addref = 1;

I don't see any locking which prevents the double free.
Comment 15 Mark Johnston freebsd_committer freebsd_triage 2018-09-16 20:37:25 UTC
Before r338257 it looks like we held the inp write lock across the ip6_output() call, which should in principle prevent the double free.
Comment 16 Bjoern A. Zeeb freebsd_committer freebsd_triage 2018-09-16 21:01:29 UTC
(In reply to Mark Johnston from comment #15)

Look at the v4 version;  we haven't done this there in over 7 years either:
https://svnweb.freebsd.org/base/head/sys/netinet/udp_usrreq.c?r1=222487&r2=222488&

https://svnweb.freebsd.org/base/head/sys/netinet/udp_usrreq.c?annotate=337018#l1540


And we are doing the same kind of invalidations in the legacy output path:
https://svnweb.freebsd.org/base/head/sys/netinet/ip_output.c?annotate=328271&pathrev=328271#l285


Why are people stumbling over this on v6 more?  I kept suspecting that it has to do with RAs, one of the reasons I could not replicate it in a "noise-free" testbed.

The real problem is that if there are two consumers of the same lle and the lle gets freed, then the refcount on the lle isn't right, right?  And given they are on the same inp, the problem is that we are freeing them off the inp while we are holding a rlock only;  in turn this would mean we need wlocks for both ip_output and ip6_output calls all the time to prevent this?

I think this is partly why the original route caching was passing and updated route all the way back up the stack to a save place where we could alter the inp and only worked on a local copy compared to doing the in-place changes?

I am adding karels to Cc:
Comment 17 Mike Karels freebsd_committer freebsd_triage 2018-09-16 22:09:54 UTC
I will have some time to look at this later, but one quick comment: the LLE code was doing something tricky with "sharing" a reference to the LLE entry so it didn't have to lock when doing output.  That was obviously making some assumptions that might not be true now.  I'll look more later.
Comment 18 Mike Karels freebsd_committer freebsd_triage 2018-09-17 02:01:49 UTC
(In reply to Mark Johnston from comment #14)
Yes, clearly the two threads were both manipulating the route cache in the same inpcb.  I'm a little surprised to see two UDP6 sends on the same socket at the same time, but something is obviously doing that.  Clearly we need a write lock if we're going to modify the cache.  However, given this usage, it probably makes more sense not to cache; the destinations may well be different anyway.  I'll think about whether it makes sense to cache for connected UDP sockets; they may be able to use a read lock if the cache is valid.  In the meantime, I suggest the attached workaround to disable UDP6 route caching for now.
Comment 19 Mike Karels freebsd_committer freebsd_triage 2018-09-17 02:02:38 UTC
Created attachment 197157 [details]
patch to disable UDP6 route caching temporarily
Comment 20 Bjoern A. Zeeb freebsd_committer freebsd_triage 2018-09-17 08:37:34 UTC
(In reply to Mike Karels from comment #19)

Can you please also look at v4, as I pointed out the same should apply there as well.
Comment 21 Bjoern A. Zeeb freebsd_committer freebsd_triage 2018-09-17 08:58:08 UTC
Just adding Ryan's comment from 

https://lists.freebsd.org/pipermail/freebsd-net/2018-September/051563.html

which ae@ has pointed out on current.
Comment 22 Lev A. Serebryakov freebsd_committer freebsd_triage 2018-09-17 13:27:17 UTC
If it is epoch related race, then, may be, I have same problem with IPv4: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=231428
Comment 23 Bjoern A. Zeeb freebsd_committer freebsd_triage 2018-09-17 17:14:32 UTC
Also, given my previous suggested patch to try, I do have a core where the inp is marked INP_FREED.

That means in_pcbfree() was called which in turn means that (likely) udp6_detach() was called.  Hence the moving of RO_INVALIDATE_CACHE() to in_pcbfree_deferred().  Now in_pcbfree() does assert the INP_WLOCK_ASSERT() which still is an rw_lock and should not "race" with the output path even if we are holding an rlock only there.

However I also don't see exactly how this should happen given we were in the middle of a kern_sendit() syscall.  I guess solving one problem after the other might help.  Can people try the suggested change to disable the route caching in the udp/udp6_output() functions by passing NULL instead of passing &inp->inp_route / &inp->inp_route6 in and report back (in a day unless it crashes earlier still)?
Comment 24 Alexander Leidinger freebsd_committer freebsd_triage 2018-09-17 19:05:57 UTC
I have the "#if 0" patch from Mike in use. Still some panic:

Fatal trap 12: page fault while in kernel mode
cpuid = 5; apic id = 13
fault virtual address   = 0x98
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff8068cbf2
stack pointer           = 0x28:0xfffffe0128caa510
frame pointer           = 0x28:0xfffffe0128caa760
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1658 (isc-worker0003)
trap number             = 12
panic: page fault
cpuid = 5
time = 1535835179
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0128caa1c0
vpanic() at vpanic+0x1a3/frame 0xfffffe0128caa220
panic() at panic+0x43/frame 0xfffffe0128caa280
trap_fatal() at trap_fatal+0x35f/frame 0xfffffe0128caa2d0
trap_pfault() at trap_pfault+0x49/frame 0xfffffe0128caa330
trap() at trap+0x2ba/frame 0xfffffe0128caa440
calltrap() at calltrap+0x8/frame 0xfffffe0128caa440
--- trap 0xc, rip = 0xffffffff8068cbf2, rsp = 0xfffffe0128caa510, rbp = 0xfffffe0128caa760 ---
ip6_output() at ip6_output+0xf82/frame 0xfffffe0128caa760
udp6_send() at udp6_send+0x702/frame 0xfffffe0128caa920
sosend_dgram() at sosend_dgram+0x346/frame 0xfffffe0128caa980
kern_sendit() at kern_sendit+0x170/frame 0xfffffe0128caaa10
sendit() at sendit+0x19e/frame 0xfffffe0128caaa60
sys_sendmsg() at sys_sendmsg+0x61/frame 0xfffffe0128caaac0
amd64_syscall() at amd64_syscall+0x254/frame 0xfffffe0128caabf0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0128caabf0
--- syscall (28, FreeBSD ELF64, sys_sendmsg), rip = 0x8015adf0a, rsp = 0x7fffdf9f7218, rbp = 0x7fffdf9f7250 ---
Uptime: 22h37m4s
Dumping 13174 out of 61352 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

warning: kld_current_sos: Can't read filename: Unknown error: -1

inferior.c:311: internal-error: struct inferior *find_inferior_pid(int): Assertion `pid != 0' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) [answered Y; input not from terminal]

Can't get more out of gdk 8.11...
Comment 25 Bjoern A. Zeeb freebsd_committer freebsd_triage 2018-09-17 19:12:12 UTC
I'll post two patches to apply in an hour or two.

One to fix one problem, and one to add assertion to catch the other problem in a more defined state.
Comment 26 Mike Karels freebsd_committer freebsd_triage 2018-09-17 22:36:20 UTC
(In reply to Bjoern A. Zeeb from comment #20)
v4 is the same, so if people are getting crashes there, the same workaround should work.  The eventual fix should be the same for both.
Comment 27 Mike Karels freebsd_committer freebsd_triage 2018-09-17 22:41:51 UTC
(In reply to Alexander Leidinger from comment #24)
Hmm, not too much to go on here.  If kgdb is not erroring out immediately, could you send the output of these (should work without the core file):

x/i 0xffffffff8068cbf2
x/i p6_output+0xf82

Thanks.
Comment 28 Alexander Leidinger freebsd_committer freebsd_triage 2018-09-18 07:58:33 UTC
(In reply to Mike Karels from comment #27)

x/i 0xffffffff8068cbf2:
ip6_output+3922: pop %rbp

x/i ip6_output+0xf82:
cmpl $0x1,0x98(%rdi)


I wonder what the problem with gdb 8.11 is. Or maybe I have an issue with my cores. If I just say "kgdb /boot/kernel/kernel" and then inside gdb "core vmcore.2" it tells me no core file, it can't recognize the file format.
Comment 29 Bjoern A. Zeeb freebsd_committer freebsd_triage 2018-09-18 16:36:14 UTC
Created attachment 197208 [details]
inp wlocking for route cache in udp6_output

Please try this one;  sorry it took a bit longer; I paniced my test system in other ways while trying to test this one.
Comment 30 Florian Smeets freebsd_committer freebsd_triage 2018-09-18 19:50:48 UTC
(In reply to Bjoern A. Zeeb from comment #29)

I applied the patch to my system that would panic every 1-2 hours without the patch. I will report back tomorrow.
Comment 31 Bjoern A. Zeeb freebsd_committer freebsd_triage 2018-09-18 22:27:55 UTC
https://reviews.freebsd.org/D17230  for everyone who wants to leave comments there.
Comment 32 Florian Smeets freebsd_committer freebsd_triage 2018-09-19 16:20:34 UTC
(In reply to Bjoern A. Zeeb from comment #29)
% uptime
 6:18PM  up 20:38, 4 users, load averages: 0.80, 0.39, 0.25

The patch seems to prevent/fix the panic.
Comment 33 commit-hook freebsd_committer freebsd_triage 2018-09-19 18:50:23 UTC
A commit references this bug:

Author: bz
Date: Wed Sep 19 18:49:38 UTC 2018
New revision: 338804
URL: https://svnweb.freebsd.org/changeset/base/338804

Log:
  Update udp6_output() inp locking to avoid concurrency issues with
  route cache updates.

  Bring over locking changes applied to udp_output() for the route cache
  in r297225 and fixed in r306559 which achieve multiple things:
  (1) acquire an exclusive inp lock earlier depending on the expected
      conditions; we add a comment explaining this in udp6,
  (2) having acquired the exclusive lock earlier eliminates a slight
      possible chance for a race condition which was present in v4 for
      multiple years as well and is now gone, and
  (3) only pass the inp_route6 to ip6_output() if we are holding an
      exclusive inp lock, so that possible route cache updates in case
      of routing table generation number changes can happen safely.
  In addition this change (as the legacy IP counterpart) decomposes the
  tracking of inp and pcbinfo lock and adds extra assertions, that the
  two together are acquired correctly.

  PR:		230950
  Reviewed by:	karels, markj
  Approved by:	re (gjb)
  Pointyhat to:	bz (for completely missing this bit)
  Differential Revision:	https://reviews.freebsd.org/D17230

Changes:
  head/sys/netinet6/udp6_usrreq.c