I've stumbled upon a consistently reproducible panic with a GENERIC kernel on 11.1-RELEASE. For me it happens every time I try connect my Android device to the L2TP/IPSec (racoon+mpd) server running on the system in question. I don't know what's special about the Android VPN client, so I guess this panic might occur under other conditions too (other VPN clients don't trigger this). kern_kqueue() in sys/kern/kern_event.c:827 gets passed a struct thread pointer, and member td_ucred->cr_ruidinfo (null pointer) gets passed to chgkqcnt() in sys/kern/kern_resource.c:1430, which eventually gets dereferenced by atomic_fetchadd_long in sys/amd64/include/atomic.h. Fatal trap 12: page fault while in kernel mode cpuid = 1; apic id = 01 fault virtual address = 0x50 fault code = supervisor write data, page not present instruction pointer = 0x20:0xffffffff80a65f95 stack pointer = 0x28:0xfffffe004f1258c0 frame pointer = 0x28:0xfffffe004f1258d0 code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 816 (sshd) trap number = 12 panic: page fault cpuid = 1 KDB: stack backtrace: #0 0xffffffff80aada97 at kdb_backtrace+0x67 #1 0xffffffff80a6bb76 at vpanic+0x186 #2 0xffffffff80a6b9e3 at panic+0x43 #3 0xffffffff80edf832 at trap_fatal+0x322 #4 0xffffffff80edf889 at trap_pfault+0x49 #5 0xffffffff80edf0c6 at trap+0x286 #6 0xffffffff80ec3641 at calltrap+0x8 #7 0xffffffff80a21fbc at kern_kqueue+0x4c #8 0xffffffff80ee0394 at amd64_syscall+0x6c4 #9 0xffffffff80ec392b at Xfast_syscall+0xfb Uptime: 6m56s Dumping 122 out of 991 MB:..14%..27%..40%..53%..66%..79%..92% Reading symbols from /boot/kernel/ng_socket.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_socket.ko.debug...done. done. Loaded symbols for /boot/kernel/ng_socket.ko Reading symbols from /boot/kernel/netgraph.ko...Reading symbols from /usr/lib/debug//boot/kernel/netgraph.ko.debug...done. done. Loaded symbols for /boot/kernel/netgraph.ko Reading symbols from /boot/kernel/ng_mppc.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_mppc.ko.debug...done. done. Loaded symbols for /boot/kernel/ng_mppc.ko Reading symbols from /boot/kernel/rc4.ko...Reading symbols from /usr/lib/debug//boot/kernel/rc4.ko.debug...done. done. Loaded symbols for /boot/kernel/rc4.ko #0 doadump (textdump=<value optimized out>) at pcpu.h:222 222 __asm("movq %%gs:%1,%0" : "=r" (td) (kgdb) list *0xffffffff80a65f95 0xffffffff80a65f95 is in chgkqcnt (atomic.h:263). 258 */ 259 static __inline u_long 260 atomic_fetchadd_long(volatile u_long *p, u_long v) 261 { 262 263 __asm __volatile( 264 " " MPLOCKED " " 265 " xaddq %0,%1 ; " 266 "# atomic_fetchadd_long" 267 : "+r" (v), /* 0 */ Current language: auto; currently minimal (kgdb) backtrace #0 doadump (textdump=<value optimized out>) at pcpu.h:222 #1 0xffffffff80a6b6f1 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:366 #2 0xffffffff80a6bbb0 in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:759 #3 0xffffffff80a6b9e3 in panic (fmt=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:690 #4 0xffffffff80edf832 in trap_fatal (frame=0xfffffe004f125800, eva=80) at /usr/src/sys/amd64/amd64/trap.c:801 #5 0xffffffff80edf889 in trap_pfault (frame=0xfffffe004f125800, usermode=0) at pcpu.h:222 #6 0xffffffff80edf0c6 in trap (frame=0xfffffe004f125800) at /usr/src/sys/amd64/amd64/trap.c:421 #7 0xffffffff80ec3641 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236 #8 0xffffffff80a65f95 in chgkqcnt (uip=0x0, diff=1, max=9223372036854775807) at atomic.h:263 #9 0xffffffff80a21fbc in kern_kqueue (td=0xfffff80003e6a000, flags=0, fcaps=0x0) at /usr/src/sys/kern/kern_event.c:837 #10 0xffffffff80ee0394 in amd64_syscall (td=0xfffff80003e6a000, traced=0) at subr_syscall.c:135 #11 0xffffffff80ec392b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396 #12 0x00000008021aae9a in ?? () Previous frame inner to this frame (corrupt stack?) (kgdb) frame 9 #9 0xffffffff80a21fbc in kern_kqueue (td=0xfffff80003e6a000, flags=0, fcaps=0x0) at /usr/src/sys/kern/kern_event.c:837 837 if (!chgkqcnt(cred->cr_ruidinfo, 1, lim_cur(td, RLIMIT_KQUEUES))) (kgdb) list 832 struct ucred *cred; 833 int fd, error; 834 835 fdp = td->td_proc->p_fd; 836 cred = td->td_ucred; 837 if (!chgkqcnt(cred->cr_ruidinfo, 1, lim_cur(td, RLIMIT_KQUEUES))) 838 return (ENOMEM); 839 840 error = falloc_caps(td, &fp, &fd, flags, fcaps); 841 if (error != 0) { (kgdb) print *cred $1 = {cr_ref = 2168608751, cr_uid = 4294967295, cr_ruid = 21168128, cr_svuid = 0, cr_ngroups = 0, cr_rgid = 0, cr_svgid = 4, cr_uidinfo = 0x0, cr_ruidinfo = 0x0, cr_prison = 0xfffff8002676b4b0, cr_loginclass = 0xfffff80003945280, cr_flags = 60052120, cr_pspare2 = 0xfffff80003640448, cr_label = 0x0, cr_audit = {ai_auid = 0, ai_mask = {am_success = 0, am_failure = 2158116784}, ai_termid = {at_port = 4294967295, at_type = 2158116848, at_addr = 0xfffff80003640474}, ai_asid = -1, ai_flags = 18446735277673415680}, cr_groups = 0x0, cr_agroups = 0, cr_smallgroups = 0xfffff8000364049c}
After more tests, I think the kqueue related backtrace from earlier might just be a symptom of something much stranger. Is it possible the IPSec stack is overwriting kernel memory? The kernel panic consistently happens shortly after the Android VPN client and racoon finish establishing ISAKMP and IPSec SAs, but before MPD sees any L2TP requests. What is inconsistent is the contents of the backtrace. I have rebuilt a GENERIC kernel with -O0 to try make debugging easier, and below are a few kgdb sessions of separate panics that were triggered under the same condition of an Android VPN client trying to connect. Fatal trap 12: page fault while in kernel mode cpuid = 1; apic id = 01 fault virtual address = 0x1100000094 fault code = supervisor write data, page not present instruction pointer = 0x20:0xffffffff814c43b6 stack pointer = 0x28:0xfffffe00003b3af0 frame pointer = 0x28:0xfffffe00003b3b00 code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 814 (sshd) trap number = 12 panic: page fault cpuid = 1 KDB: stack backtrace: #0 0xffffffff810e9be7 at kdb_backtrace+0xa7 #1 0xffffffff8107a129 at vpanic+0x249 #2 0xffffffff81079ee0 at vpanic+0 #3 0xffffffff817cb38a at trap_fatal+0x60a #4 0xffffffff817cb538 at trap_pfault+0x188 #5 0xffffffff817ca6e1 at trap+0x751 #6 0xffffffff817cb9ba at trap_check+0x4a #7 0xffffffff817a07e1 at calltrap+0x8 #8 0xffffffff814bfc42 at refcount_release+0x22 #9 0xffffffff814bfbae at key_freesp+0x2e #10 0xffffffff814b7744 at ipsec_invalidate_cache+0xc4 #11 0xffffffff814b622a at ipsec_getpcbpolicy+0x16a #12 0xffffffff814b6005 at ipsec_hdrsiz_inpcb+0x25 #13 0xffffffff8141e57d at tcp_output+0x9dd #14 0xffffffff81439c80 at tcp_usr_send+0x350 #15 0xffffffff8116a051 at sosend_generic+0xeb1 #16 0xffffffff8116a31d at sosend+0x5d #17 0xffffffff8112e7c7 at soo_write+0x87 Uptime: 6m9s Dumping 123 out of 981 MB:..13%..26%..39%..52%..65%..78%..91% Reading symbols from /boot/kernel/ng_socket.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_socket.ko.debug...done. done. Loaded symbols for /boot/kernel/ng_socket.ko Reading symbols from /boot/kernel/netgraph.ko...Reading symbols from /usr/lib/debug//boot/kernel/netgraph.ko.debug...done. done. Loaded symbols for /boot/kernel/netgraph.ko Reading symbols from /boot/kernel/ng_mppc.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_mppc.ko.debug...done. done. Loaded symbols for /boot/kernel/ng_mppc.ko Reading symbols from /boot/kernel/rc4.ko...Reading symbols from /usr/lib/debug//boot/kernel/rc4.ko.debug...done. done. Loaded symbols for /boot/kernel/rc4.ko #0 doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:298 298 dumptid = curthread->td_tid; (kgdb) list *0xffffffff814c43b6 0xffffffff814c43b6 is in atomic_fetchadd_int (atomic.h:245). 240 */ 241 static __inline u_int 242 atomic_fetchadd_int(volatile u_int *p, u_int v) 243 { 244 245 __asm __volatile( 246 " " MPLOCKED " " 247 " xaddl %0,%1 ; " 248 "# atomic_fetchadd_int" 249 : "+r" (v), /* 0 */ Current language: auto; currently minimal (kgdb) backtrace #0 doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:298 #1 0xffffffff81079668 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:366 #2 0xffffffff8107a17f in vpanic (fmt=0xffffffff81b40308 "%s", ap=0xfffffe00003b34a0) at /usr/src/sys/kern/kern_shutdown.c:759 #3 0xffffffff81079ee0 in panic (fmt=0xffffffff81b40308 "%s") at /usr/src/sys/kern/kern_shutdown.c:690 #4 0xffffffff817cb38a in trap_fatal (frame=0xfffffe00003b3a30, eva=73014444180) at /usr/src/sys/amd64/amd64/trap.c:801 #5 0xffffffff817cb538 in trap_pfault (frame=0xfffffe00003b3a30, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:683 #6 0xffffffff817ca6e1 in trap (frame=0xfffffe00003b3a30) at /usr/src/sys/amd64/amd64/trap.c:421 #7 0xffffffff817cb9ba in trap_check (frame=0xfffffe00003b3a30) at /usr/src/sys/amd64/amd64/trap.c:602 #8 0xffffffff817a07e1 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236 #9 0xffffffff814c43b6 in atomic_fetchadd_int (p=0x1100000094, v=4294967295) at atomic.h:250 #10 0xffffffff814bfc42 in refcount_release (count=0x1100000094) at refcount.h:62 #11 0xffffffff814bfbae in key_freesp (spp=0xfffffe00003b3b58) at /usr/src/sys/netipsec/key.c:1076 #12 0xffffffff814b7744 in ipsec_invalidate_cache (inp=0xfffff80003f2fae0, dir=2) at /usr/src/sys/netipsec/ipsec.c:317 #13 0xffffffff814b622a in ipsec_getpcbpolicy (inp=0xfffff80003f2fae0, dir=2) at /usr/src/sys/netipsec/ipsec.c:463 #14 0xffffffff814b6005 in ipsec_hdrsiz_inpcb (inp=0xfffff80003f2fae0) at /usr/src/sys/netipsec/ipsec.c:1151 #15 0xffffffff8141e57d in tcp_output (tp=0xfffff80003e58820) at /usr/src/sys/netinet/tcp_output.c:560 #16 0xffffffff81439c80 in tcp_usr_send (so=0xfffff80013035000, flags=0, m=0xfffff80013538400, nam=0x0, control=0x0, td=0xfffff80003cbd000) at /usr/src/sys/netinet/tcp_usrreq.c:967 #17 0xffffffff8116a051 in sosend_generic (so=0xfffff80013035000, addr=0x0, uio=0xfffffe00003b47a8, top=0xfffff80013538400, control=0x0, flags=0, td=0xfffff80003cbd000) at /usr/src/sys/kern/uipc_socket.c:1360 #18 0xffffffff8116a31d in sosend (so=0xfffff80013035000, addr=0x0, uio=0xfffffe00003b47a8, top=0x0, control=0x0, flags=0, td=0xfffff80003cbd000) at /usr/src/sys/kern/uipc_socket.c:1405 #19 0xffffffff8112e7c7 in soo_write (fp=0xfffff80003869b90, uio=0xfffffe00003b47a8, active_cred=0xfffff80003d5a600, flags=0, td=0xfffff80003cbd000) at /usr/src/sys/kern/sys_socket.c:146 #20 0xffffffff81121e1a in fo_write (fp=0xfffff80003869b90, uio=0xfffffe00003b47a8, active_cred=0xfffff80003d5a600, flags=0, td=0xfffff80003cbd000) at file.h:307 #21 0xffffffff8111dc36 in dofilewrite (td=0xfffff80003cbd000, fd=3, fp=0xfffff80003869b90, auio=0xfffffe00003b47a8, offset=-1, flags=0) at /usr/src/sys/kern/sys_generic.c:592 #22 0xffffffff8111d786 in kern_writev (td=0xfffff80003cbd000, fd=3, auio=0xfffffe00003b47a8) at /usr/src/sys/kern/sys_generic.c:506 #23 0xffffffff8111d65f in sys_write (td=0xfffff80003cbd000, uap=0xfffffe00003b4a58) at /usr/src/sys/kern/sys_generic.c:420 #24 0xffffffff817cc7b1 in syscallenter (td=0xfffff80003cbd000, sa=0xfffffe00003b4a48) at subr_syscall.c:135 #25 0xffffffff817cbd0a in amd64_syscall (td=0xfffff80003cbd000, traced=0) at /usr/src/sys/amd64/amd64/trap.c:902 #26 0xffffffff817a0acb in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396 #27 0x00000008021c34aa in ?? () Previous frame inner to this frame (corrupt stack?) (kgdb) frame 11 #11 0xffffffff814bfbae in key_freesp (spp=0xfffffe00003b3b58) at /usr/src/sys/netipsec/key.c:1076 1076 if (SP_DELREF(sp) == 0) (kgdb) list 1071 key_freesp(struct secpolicy **spp) 1072 { 1073 struct secpolicy *sp = *spp; 1074 1075 IPSEC_ASSERT(sp != NULL, ("null sp")); 1076 if (SP_DELREF(sp) == 0) 1077 return; 1078 1079 KEYDBG(IPSEC_STAMP, 1080 printf("%s: last reference to SP(%p)\n", __func__, sp)); (kgdb) print sp $1 = (struct secpolicy *) 0x1100000000 (kgdb) print *sp Cannot access memory at address 0x1100000000 Below panic seemed to occur just as a tried to perform a "racoonctl show-sa ipsec", while the VPN client was busy trying to connect. Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0x4c fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff814c43fe stack pointer = 0x28:0xfffffe0000336b70 frame pointer = 0x28:0xfffffe0000336bd0 code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 688 (racoon) trap number = 12 panic: page fault cpuid = 0 KDB: stack backtrace: #0 0xffffffff810e9be7 at kdb_backtrace+0xa7 #1 0xffffffff8107a129 at vpanic+0x249 #2 0xffffffff81079ee0 at vpanic+0 #3 0xffffffff817cb38a at trap_fatal+0x60a #4 0xffffffff817cb538 at trap_pfault+0x188 #5 0xffffffff817ca6e1 at trap+0x751 #6 0xffffffff817cb9ba at trap_check+0x4a #7 0xffffffff817a07e1 at calltrap+0x8 #8 0xffffffff814d297a at key_setdumpsa+0x40a #9 0xffffffff814cb182 at key_dump+0x412 #10 0xffffffff814c31e4 at key_parse+0xce4 #11 0xffffffff814d86ac at key_output+0x1ac #12 0xffffffff8125af8c at raw_usend+0x8c #13 0xffffffff814d9bb1 at key_send+0x51 #14 0xffffffff8116a051 at sosend_generic+0xeb1 #15 0xffffffff8116a31d at sosend+0x5d #16 0xffffffff811769bc at kern_sendit+0x42c #17 0xffffffff81176e86 at sendit+0x146 Uptime: 5m22s Dumping 124 out of 981 MB:..13%..26%..39%..52%..65%..78%..91% Reading symbols from /boot/kernel/ng_socket.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_socket.ko.debug...done. done. Loaded symbols for /boot/kernel/ng_socket.ko Reading symbols from /boot/kernel/netgraph.ko...Reading symbols from /usr/lib/debug//boot/kernel/netgraph.ko.debug...done. done. Loaded symbols for /boot/kernel/netgraph.ko Reading symbols from /boot/kernel/ng_mppc.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_mppc.ko.debug...done. done. Loaded symbols for /boot/kernel/ng_mppc.ko Reading symbols from /boot/kernel/rc4.ko...Reading symbols from /usr/lib/debug//boot/kernel/rc4.ko.debug...done. done. Loaded symbols for /boot/kernel/rc4.ko #0 doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:298 298 dumptid = curthread->td_tid; (kgdb) list *0xffffffff814c43fe 0xffffffff814c43fe is in key_setsadbaddr (/usr/src/sys/netipsec/key.c:3693). 3688 struct mbuf *m; 3689 struct sadb_address *p; 3690 size_t len; 3691 3692 len = PFKEY_ALIGN8(sizeof(struct sadb_address)) + 3693 PFKEY_ALIGN8(saddr->sa_len); 3694 m = m_get2(len, M_NOWAIT, MT_DATA, 0); 3695 if (m == NULL) 3696 return (NULL); 3697 m_align(m, len); Current language: auto; currently minimal (kgdb) backtrace #0 doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:298 #1 0xffffffff81079668 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:366 #2 0xffffffff8107a17f in vpanic (fmt=0xffffffff81b40308 "%s", ap=0xfffffe0000336520) at /usr/src/sys/kern/kern_shutdown.c:759 #3 0xffffffff81079ee0 in panic (fmt=0xffffffff81b40308 "%s") at /usr/src/sys/kern/kern_shutdown.c:690 #4 0xffffffff817cb38a in trap_fatal (frame=0xfffffe0000336ab0, eva=76) at /usr/src/sys/amd64/amd64/trap.c:801 #5 0xffffffff817cb538 in trap_pfault (frame=0xfffffe0000336ab0, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:683 #6 0xffffffff817ca6e1 in trap (frame=0xfffffe0000336ab0) at /usr/src/sys/amd64/amd64/trap.c:421 #7 0xffffffff817cb9ba in trap_check (frame=0xfffffe0000336ab0) at /usr/src/sys/amd64/amd64/trap.c:602 #8 0xffffffff817a07e1 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236 #9 0xffffffff814c43fe in key_setsadbaddr (exttype=6, saddr=0x4c, prefixlen=255 '?', ul_proto=255) at /usr/src/sys/netipsec/key.c:3693 #10 0xffffffff814d297a in key_setdumpsa (sav=0xfffff8000381b700, type=10 '\n', satype=3 '\003', seq=0, pid=688) at /usr/src/sys/netipsec/key.c:3469 #11 0xffffffff814cb182 in key_dump (so=0xfffff800039a8360, m=0xfffff8003d4fab00, mhp=0xfffffe0000336fd8) at /usr/src/sys/netipsec/key.c:7509 #12 0xffffffff814c31e4 in key_parse (m=0xfffff8003d4fab00, so=0xfffff800039a8360) at /usr/src/sys/netipsec/key.c:7861 #13 0xffffffff814d86ac in key_output (m=0xfffff8003d4fab00, so=0xfffff800039a8360) at /usr/src/sys/netipsec/keysock.c:128 #14 0xffffffff8125af8c in raw_usend (so=0xfffff800039a8360, flags=0, m=0xfffff8003d4fab00, nam=0x0, control=0x0, td=0xfffff800039f5560) at /usr/src/sys/net/raw_usrreq.c:238 #15 0xffffffff814d9bb1 in key_send (so=0xfffff800039a8360, flags=0, m=0xfffff8003d4fab00, nam=0x0, control=0x0, td=0xfffff800039f5560) at /usr/src/sys/netipsec/keysock.c:492 #16 0xffffffff8116a051 in sosend_generic (so=0xfffff800039a8360, addr=0x0, uio=0xfffffe00003376a0, top=0xfffff8003d4fab00, control=0x0, flags=0, td=0xfffff800039f5560) at /usr/src/sys/kern/uipc_socket.c:1360 #17 0xffffffff8116a31d in sosend (so=0xfffff800039a8360, addr=0x0, uio=0xfffffe00003376a0, top=0x0, control=0x0, flags=0, td=0xfffff800039f5560) at /usr/src/sys/kern/uipc_socket.c:1405 #18 0xffffffff811769bc in kern_sendit (td=0xfffff800039f5560, s=12, mp=0xfffffe00003377b0, flags=0, control=0x0, segflg=UIO_USERSPACE) at /usr/src/sys/kern/uipc_syscalls.c:873 #19 0xffffffff81176e86 in sendit (td=0xfffff800039f5560, s=12, mp=0xfffffe00003377b0, flags=0) at /usr/src/sys/kern/uipc_syscalls.c:793 #20 0xffffffff81176d37 in sys_sendto (td=0xfffff800039f5560, uap=0xfffffe0000337a58) at /usr/src/sys/kern/uipc_syscalls.c:924 #21 0xffffffff817cc7b1 in syscallenter (td=0xfffff800039f5560, sa=0xfffffe0000337a48) at subr_syscall.c:135 #22 0xffffffff817cbd0a in amd64_syscall (td=0xfffff800039f5560, traced=0) at /usr/src/sys/amd64/amd64/trap.c:902 #23 0xffffffff817a0acb in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396 #24 0x00000008013c9dfa in ?? () Previous frame inner to this frame (corrupt stack?) (kgdb) frame 10 #10 0xffffffff814d297a in key_setdumpsa (sav=0xfffff8000381b700, type=10 '\n', satype=3 '\003', seq=0, pid=688) at /usr/src/sys/netipsec/key.c:3469 3469 m = key_setsadbaddr(SADB_EXT_ADDRESS_DST, (kgdb) list 3464 if (!m) 3465 goto fail; 3466 break; 3467 3468 case SADB_EXT_ADDRESS_DST: 3469 m = key_setsadbaddr(SADB_EXT_ADDRESS_DST, 3470 &sav->sah->saidx.dst.sa, 3471 FULLMASK, IPSEC_ULPROTO_ANY); 3472 if (!m) 3473 goto fail; (kgdb) print sav $1 = (struct secasvar *) 0xfffff8000381b700 (kgdb) print *sav $2 = {spi = 778989686, flags = 779777128, seq = 1819047270, pid = 1768120678, ivlen = 1663985518, sah = 0x0, key_auth = 0x0, key_enc = 0x0, replay = 0x0, natt = 0x0, lock = 0x0, tdb_xform = 0x0, tdb_encalgxform = 0x0, tdb_authalgxform = 0x0, tdb_compalgxform = 0x0, tdb_cryptoid = 0, alg_auth = 0 '\0', alg_enc = 0 '\0', alg_comp = 0 '\0', state = 0 '\0', lft_c = 0x0, lft_h = 0x0, lft_s = 0x0, created = 0, firstused = 0, chain = {tqe_next = 0x0, tqe_prev = 0x0}, spihash = {le_next = 0x0, le_prev = 0x0}, drainq = {le_next = 0x0, le_prev = 0x0}, cntr = 0, refcnt = 0} (kgdb) print sav->sah $3 = (struct secashead *) 0x0 Fatal trap 12: page fault while in kernel mode cpuid = 1; apic id = 01 fault virtual address = 0x50 fault code = supervisor write data, page not present instruction pointer = 0x20:0xffffffff8106e168 stack pointer = 0x28:0xfffffe00002bf620 frame pointer = 0x28:0xfffffe00002bf630 code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 948 (sshd) trap number = 12 panic: page fault cpuid = 1 KDB: stack backtrace: #0 0xffffffff810e9be7 at kdb_backtrace+0xa7 #1 0xffffffff8107a129 at vpanic+0x249 #2 0xffffffff81079ee0 at vpanic+0 #3 0xffffffff817cb38a at trap_fatal+0x60a #4 0xffffffff817cb538 at trap_pfault+0x188 #5 0xffffffff817ca6e1 at trap+0x751 #6 0xffffffff817cb9ba at trap_check+0x4a #7 0xffffffff817a07e1 at calltrap+0x8 #8 0xffffffff8106defd at chglimit+0x3d #9 0xffffffff8106e09b at chgkqcnt+0x3b #10 0xffffffff80fefe55 at kern_kqueue+0x75 #11 0xffffffff80fefdd7 at sys_kqueue+0x37 #12 0xffffffff817cc7b1 at syscallenter+0x961 #13 0xffffffff817cbd0a at amd64_syscall+0x2a #14 0xffffffff817a0acb at Xfast_syscall+0xfb Uptime: 45m34s Dumping 123 out of 981 MB:..13%..26%..39%..52%..65%..78%..91% Reading symbols from /boot/kernel/ng_socket.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_socket.ko.debug...done. done. Loaded symbols for /boot/kernel/ng_socket.ko Reading symbols from /boot/kernel/netgraph.ko...Reading symbols from /usr/lib/debug//boot/kernel/netgraph.ko.debug...done. done. Loaded symbols for /boot/kernel/netgraph.ko Reading symbols from /boot/kernel/ng_mppc.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_mppc.ko.debug...done. done. Loaded symbols for /boot/kernel/ng_mppc.ko Reading symbols from /boot/kernel/rc4.ko...Reading symbols from /usr/lib/debug//boot/kernel/rc4.ko.debug...done. done. Loaded symbols for /boot/kernel/rc4.ko #0 doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:298 298 dumptid = curthread->td_tid; (kgdb) list *0xffffffff8106e168 0xffffffff8106e168 is in atomic_fetchadd_long (atomic.h:263). 258 */ 259 static __inline u_long 260 atomic_fetchadd_long(volatile u_long *p, u_long v) 261 { 262 263 __asm __volatile( 264 " " MPLOCKED " " 265 " xaddq %0,%1 ; " 266 "# atomic_fetchadd_long" 267 : "+r" (v), /* 0 */ Current language: auto; currently minimal (kgdb) backtrace #0 doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:298 #1 0xffffffff81079668 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:366 #2 0xffffffff8107a17f in vpanic (fmt=0xffffffff81b40308 "%s", ap=0xfffffe00002befd0) at /usr/src/sys/kern/kern_shutdown.c:759 #3 0xffffffff81079ee0 in panic (fmt=0xffffffff81b40308 "%s") at /usr/src/sys/kern/kern_shutdown.c:690 #4 0xffffffff817cb38a in trap_fatal (frame=0xfffffe00002bf560, eva=80) at /usr/src/sys/amd64/amd64/trap.c:801 #5 0xffffffff817cb538 in trap_pfault (frame=0xfffffe00002bf560, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:683 #6 0xffffffff817ca6e1 in trap (frame=0xfffffe00002bf560) at /usr/src/sys/amd64/amd64/trap.c:421 #7 0xffffffff817cb9ba in trap_check (frame=0xfffffe00002bf560) at /usr/src/sys/amd64/amd64/trap.c:602 #8 0xffffffff817a07e1 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236 #9 0xffffffff8106e168 in atomic_fetchadd_long (p=0x50, v=1) at atomic.h:268 #10 0xffffffff8106defd in chglimit (uip=0x0, limit=0x50, diff=1, max=9223372036854775807, name=0xffffffff81e06486 "kqcnt") at /usr/src/sys/kern/kern_resource.c:1376 #11 0xffffffff8106e09b in chgkqcnt (uip=0x0, diff=1, max=9223372036854775807) at /usr/src/sys/kern/kern_resource.c:1433 #12 0xffffffff80fefe55 in kern_kqueue (td=0xfffff8000364e000, flags=0, fcaps=0x0) at /usr/src/sys/kern/kern_event.c:837 #13 0xffffffff80fefdd7 in sys_kqueue (td=0xfffff8000364e000, uap=0xfffffe00002bfa58) at /usr/src/sys/kern/kern_event.c:813 #14 0xffffffff817cc7b1 in syscallenter (td=0xfffff8000364e000, sa=0xfffffe00002bfa48) at subr_syscall.c:135 #15 0xffffffff817cbd0a in amd64_syscall (td=0xfffff8000364e000, traced=0) at /usr/src/sys/amd64/amd64/trap.c:902 #16 0xffffffff817a0acb in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396 #17 0x00000008021aae9a in ?? () Previous frame inner to this frame (corrupt stack?) (kgdb) frame 12 #12 0xffffffff80fefe55 in kern_kqueue (td=0xfffff8000364e000, flags=0, fcaps=0x0) at /usr/src/sys/kern/kern_event.c:837 837 if (!chgkqcnt(cred->cr_ruidinfo, 1, lim_cur(td, RLIMIT_KQUEUES))) (kgdb) list 832 struct ucred *cred; 833 int fd, error; 834 835 fdp = td->td_proc->p_fd; 836 cred = td->td_ucred; 837 if (!chgkqcnt(cred->cr_ruidinfo, 1, lim_cur(td, RLIMIT_KQUEUES))) 838 return (ENOMEM); 839 840 error = falloc_caps(td, &fp, &fd, flags, fcaps); 841 if (error != 0) { (kgdb) print *td $1 = {td_lock = 0xffffffff825e8d00, td_proc = 0xfffff8000333f000, td_plist = {tqe_next = 0x0, tqe_prev = 0xfffff8000333f010}, td_runq = {tqe_next = 0xfffff80003e80560, tqe_prev = 0xffffffff825e8f88}, td_slpq = {tqe_next = 0x0, tqe_prev = 0xfffff80003323480}, td_lockq = {tqe_next = 0x0, tqe_prev = 0x0}, td_hash = {le_next = 0x0, le_prev = 0xfffffe0000d4c7b8}, td_cpuset = 0xfffff800032ce000, td_sel = 0xfffff800032d8680, td_sleepqueue = 0xfffff80003323480, td_turnstile = 0xfffff8000320f540, td_rlqe = 0xfffff80003874820, td_umtxq = 0xfffff80003642c80, td_vm_dom_policy = {seq = 0, p = { policy = VM_POLICY_NONE, domain = -1}}, td_tid = 100087, padding1 = 0xfffff8000364e0a0, padding2 = 0xfffff8000364e0c0, td_lend_user_pri = 255 '?', td_flags = 67174406, td_inhibitors = 0, td_pflags = 0, td_dupfd = 0, td_sqqueue = 0, td_wchan = 0x0, td_wmesg = 0x0, td_owepreempt = 0 '\0', td_tsqueue = 0 '\0', td_locks = 0, td_rw_rlocks = 0, td_lk_slocks = 0, td_stopsched = 1, td_blocked = 0x0, td_lockname = 0x0, td_contested = {lh_first = 0x0}, td_sleeplocks = 0x0, td_intr_nesting_level = 0, td_pinned = 0, td_ucred = 0xfffff8000390f700, td_limit = 0xfffff8000381b400, td_slptick = 0, td_blktick = 0, td_swvoltick = -2145350148, td_swinvoltick = -2145350138, td_cow = 127, td_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = { tv_sec = 0, tv_usec = 0}, ru_maxrss = 7268, ru_ixrss = 592, ru_idrss = 80, ru_isrss = 256, ru_minflt = 274, ru_majflt = 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 0, ru_msgrcv = 2, ru_nsignals = 0, ru_nvcsw = 1, ru_nivcsw = 1}, td_rux = {rux_runtime = 0, rux_uticks = 0, rux_sticks = 0, rux_iticks = 0, rux_uu = 0, rux_su = 0, rux_tu = 0}, td_incruntime = 44205853, td_runtime = 44205853, td_pticks = 1, td_sticks = 1, td_iticks = 0, td_uticks = 1, td_intrval = 0, td_oldsigmask = {__bits = 0xfffff8000364e254}, td_generation = 2, td_sigstk = {ss_sp = 0x0, ss_size = 0, ss_flags = 4}, td_xsig = 0, td_profil_addr = 0, td_profil_ticks = 0, td_name = 0xfffff8000364e294 "sshd", td_fpop = 0x0, td_dbgflags = 0, td_dbgksi = {ksi_link = {tqe_next = 0x0, tqe_prev = 0x0}, ksi_info = {si_signo = 0, si_errno = 0, si_code = 0, si_pid = 0, si_uid = 0, si_status = 0, si_addr = 0x0, si_value = {sival_int = 0, sival_ptr = 0x0, sigval_int = 0, sigval_ptr = 0x0}, _reason = {_fault = {_trapno = 0}, _timer = {_timerid = 0, _overrun = 0}, _mesgq = {_mqd = 0}, _poll = {_band = 0}, __spare__ = {__spare1__ = 0, __spare2__ = 0xfffff8000364e2f8}}}, ksi_flags = 0, ksi_sigq = 0x0}, td_ng_outbound = 0, td_osd = {osd_nslots = 0, osd_slots = 0x0, osd_next = {le_next = 0x0, le_prev = 0x0}}, td_map_def_user = 0x0, td_dbg_forked = 0, td_vp_reserv = 0, td_no_sleeping = 0, td_dom_rr_idx = 0, td_su = 0x0, td_rtcgen = 0, td_sigmask = {__bits = 0xfffff8000364e374}, td_rqindex = 30 '\036', td_base_pri = 120 'x', td_priority = 120 'x', td_pri_class = 3 '\003', td_user_pri = 121 'y', td_base_user_pri = 121 'y', td_dbg_sc_code = 0, td_dbg_sc_narg = 0, td_rb_list = 0, td_rbp_list = 0, td_rb_inact = 0, td_pcb = 0xfffffe00002bfb80, td_state = TDS_RUNNING, td_uretoff = {tdu_retval = 0xfffff8000364e3c0, tdu_off = 0}, td_cowgen = 1, td_slpcallout = {c_links = {le = {le_next = 0x0, le_prev = 0xfffff8001317d3d8}, sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0xfffff8001317d3d8}}, c_time = 8284352236906, c_precision = 16106127360, c_arg = 0xfffff8000364e000, c_func = 0xffffffff811018e0 <sleepq_timeout>, c_lock = 0x0, c_flags = 0, c_iflags = 272, c_cpu = 0}, td_frame = 0xfffffe00002bfac0, td_kstack_obj = 0xfffff800037b8a50, td_kstack = 18446741874689163264, td_kstack_pages = 4, td_critnest = 1, td_md = {md_spinlock_count = 1, md_saved_flags = 646, md_spurflt_addr = 34427155648, md_invl_gen = {gen = 0, link = {le_next = 0x0, le_prev = 0xffffffff826df868}}}, td_ar = 0x0, td_lprof = 0xfffff8000364e470, td_dtrace = 0xfffff80003e35600, td_errno = 0, td_vnet = 0x0, td_vnet_lpush = 0x0, td_intr_frame = 0x0, td_rfppwait_p = 0xfffff8000396d588, td_ma = 0x0, td_ma_cnt = 0, td_emuldata = 0x0, td_lastcpu = 1, td_oncpu = 1, td_sleeptimo = 0, ---Type <return> to continue, or q <return> to quit--- td_sigqueue = {sq_signals = {__bits = 0xfffff8000364e4d8}, sq_kill = {__bits = 0xfffff8000364e4e8}, sq_ptrace = { __bits = 0xfffff8000364e4f8}, sq_list = {tqh_first = 0x0, tqh_last = 0xfffff8000364e508}, sq_proc = 0xfffff8000333f000, sq_flags = 1}} (kgdb) print *cred $2 = {cr_ref = 2178945375, cr_uid = 4294967295, cr_ruid = 21168128, cr_svuid = 0, cr_ngroups = 0, cr_rgid = 0, cr_svgid = 4, cr_uidinfo = 0x0, cr_ruidinfo = 0x0, cr_prison = 0xfffff800130f3060, cr_loginclass = 0xfffff80013032d80, cr_flags = 318975384, cr_pspare2 = 0xfffff8000390f748, cr_label = 0x0, cr_audit = {ai_auid = 0, ai_mask = {am_success = 0, am_failure = 2164206432}, ai_termid = {at_port = 4294967295, at_type = 2164206608, at_addr = 0xfffff8000390f774}, ai_asid = -1, ai_flags = 18446735277676361472}, cr_groups = 0x0, cr_agroups = 0, cr_smallgroups = 0xfffff8000390f79c} Thank you for looking at this!!
I've rebuilt with IPSEC_DEBUG enabled. The only error unique to Android clients is this: Aug 27 18:33:08 <kern.crit> vpn kernel: esp_input: payload of 76 octets not a multiple of 16 octets, SA 1.2.3.4/04bc89ee Just before the panic occurs. Any help appreciated.
Ok, finally I have a half-solution. Just disable sha256 from racoon's phase2 (sainfo) configuration, and Android won't crash FreeBSD with its broken IKEv1 implementation. More info on Android's broken sha256: https://wiki.strongswan.org/issues/1472 Would be nice if FreeBSD kernel handled this gracefully. If it behaves like this with a broken VPN client, what is possible with a malicious VPN client? Hmmm...
Can you provide raconn's logs to see what it did before the panic?
Did you try to add "options INVARIANTS" to your kernel? It enables many checks and probably some message from KASSERT will shed some light on what is going on here? Also, if it is not easy to do further debugging, I would be glad to get some instructions on how to configure simple test lab using mpd+l2tp+racoon to be able reproduce this problem locally.
Sure, happy to help more! I'm just struggling with building in INVARIANTS support: cc -target x86_64-unknown-freebsd11.1 --sysroot=/usr/obj/usr/src/tmp -B/usr/obj/usr/src/tmp/usr/bin -c -O0 -g -nostdinc -I. -I/usr/sr c/sys -I/usr/src/sys/contrib/libfdt -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-omit-frame-pointer -mno-omit-lea f-frame-pointer -MD -MF.depend.cam.o -MTcam.o -mcmodel=kernel -mno-red-zone -mno-mmx -mno-sse -msoft-float -fno-asynchronous-unwind-t ables -ffreestanding -fwrapv -fstack-protector -gdwarf-2 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-protot ypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -D__printf__=__freebsd_kprintf__ -Wmissing-include-dirs -fdiagnos tics-show-option -Wno-unknown-pragmas -Wno-error-tautological-compare -Wno-error-empty-body -Wno-error-parentheses-equality -Wno-err or-unused-function -Wno-error-pointer-sign -Wno-error-shift-negative-value -Wno-error-address-of-packed-member -mno-aes -mno-avx -st d=iso9899:1999 -Werror /usr/src/sys/cam/cam.c In file included from /usr/src/sys/cam/cam.c:45: In file included from /usr/src/sys/cam/cam_ccb.h:44: In file included from /usr/src/sys/cam/nvme/nvme_all.h:32: /usr/src/sys/dev/nvme/nvme.h:922:16: warning: taking address of packed member 'cdw10' of class or structure 'nvme_command' may result in an unaligned pointer value [-Waddress-of-packed-member] *(uint64_t *)&cmd->cdw10 = lba; ^~~~~~~~~~ In file included from /usr/src/sys/cam/cam.c:45: /usr/src/sys/cam/cam_ccb.h:1392:2: error: implicit declaration of function 'kassert_panic' is invalid in C99 [-Werror,-Wimplicit-function-declaration] KASSERT((flags & CAM_DIR_MASK) == CAM_DIR_BOTH, ^ /usr/src/sys/sys/systm.h:88:3: note: expanded from macro 'KASSERT' kassert_panic msg; \ ^ In file included from /usr/src/sys/cam/cam.c:45: /usr/src/sys/cam/cam_ccb.h:1392:2: error: this function declaration is not a prototype [-Werror,-Wstrict-prototypes] /usr/src/sys/sys/systm.h:88:3: note: expanded from macro 'KASSERT' kassert_panic msg; \ ^ 1 warning and 2 errors generated. *** Error code 1 My kernel config is: include GENERIC ident IPSECDBG options IPSEC_DEBUG options INVARIANTS In case it's easier for you to setup a test lab, the following racoon.conf should allow you to reproduce this with an Android L2TP/IPSec VPN session: path include "/usr/local/etc/racoon"; path pre_shared_key "/usr/local/etc/racoon/psk.txt"; listen { adminsock "/var/db/racoon/racoon.sock"; } timer { natt_keepalive 55 seconds; } remote anonymous { exchange_mode main; passive on; doi ipsec_doi; my_identifier address 1.2.3.4; verify_cert off; lifetime time 24 hours; nat_traversal on; generate_policy on; dpd_delay 55; dpd_retry 10; nonce_size 64; proposal_check claim; proposal { encryption_algorithm aes 256; hash_algorithm sha256; authentication_method pre_shared_key; dh_group modp4096; lifetime time 24 hours; } proposal { encryption_algorithm aes 256; hash_algorithm sha256; authentication_method pre_shared_key; dh_group modp2048; lifetime time 24 hours; } proposal { encryption_algorithm aes 256; hash_algorithm sha256; authentication_method pre_shared_key; dh_group modp1024; lifetime time 24 hours; } proposal { encryption_algorithm aes; hash_algorithm sha1; authentication_method pre_shared_key; dh_group modp2048; lifetime time 24 hours; } proposal { encryption_algorithm aes; hash_algorithm sha1; authentication_method pre_shared_key; dh_group modp1024; lifetime time 24 hours; } } sainfo anonymous { encryption_algorithm aes 256, aes; authentication_algorithm hmac_sha256, hmac_sha1; compression_algorithm deflate; lifetime time 8 hours; } You don't need to bother setting up MPD. FreeBSD should crash before any L2TP packets reach userland. Mine crashes without MPD running.
(In reply to Aragon Gouveia from comment #6) What version of Android do you use? Mine inists on Aggressive mode and racoon cannnot run Aggressive mode with your configuration due to different dh_group settings.
(In reply to Eugene Grosbein from comment #7) Android 6.0.1. Maybe try enable aggressive mode in racoon. The problem occurs after phase 2 negotiation, so maybe aggressive mode will still allow the bug to trigger. The main thing to look out for in racoon logs is that both peers agree on sha256 for phase 2.
(In reply to Aragon Gouveia from comment #8) For unknown reason, I cannot get Android 6 to establish a connection with FreeBSD 11.1@r321371 ipsec-tools-0.8.2_1 using your configuration nor to reproduce a crash, racoon complains: "X.X.X.X give up to get IPsec-SA due to time up to wait"
(In reply to Aragon Gouveia from comment #6) You also need INVARIANTS_SUPPORT. See https://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
Thanks Konstantin! Invariant kernel displayed an additional error at panic. Here's some kgdb info: Unread portion of the kernel message buffer: esp_input: payload of 76 octets not a multiple of 16 octets, SA 197.189.200.16/019052a1 panic: attempt to free non DEAD SA 0xfffff8000398db00 cpuid = 1 KDB: stack backtrace: #0 0xffffffff81007cf7 at kdb_backtrace+0xa7 #1 0xffffffff80fa22b9 at vpanic+0x249 #2 0xffffffff80fa1e44 at kassert_panic+0xc4 #3 0xffffffff81362a0e at key_delsav+0x7e #4 0xffffffff81362981 at key_freesav+0xb1 #5 0xffffffff8137f934 at udp_ipsec_input+0x2d4 #6 0xffffffff812f5a9b at udp_append+0x1ab #7 0xffffffff812f5739 at udp_input+0x12a9 #8 0xffffffff811ec014 at ip_input+0xd44 #9 0xffffffff81142587 at netisr_dispatch_src+0x167 #10 0xffffffff81142b87 at netisr_dispatch+0x27 #11 0xffffffff811196a0 at ether_demux+0x2d0 #12 0xffffffff8111b369 at ether_input_internal+0x7e9 #13 0xffffffff8111ab6f at ether_nh_input+0xbf #14 0xffffffff81142587 at netisr_dispatch_src+0x167 #15 0xffffffff81142b87 at netisr_dispatch+0x27 #16 0xffffffff81119c8f at ether_input+0x9f #17 0xffffffff80cfad69 at vtnet_rxq_input+0x169 Uptime: 3m59s Dumping 121 out of 982 MB:..14%..27%..40%..53%..66%..80%..93% Reading symbols from /boot/kernel/ng_socket.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_socket.ko.debug...done. done. Loaded symbols for /boot/kernel/ng_socket.ko Reading symbols from /boot/kernel/netgraph.ko...Reading symbols from /usr/lib/debug//boot/kernel/netgraph.ko.debug...done. done. Loaded symbols for /boot/kernel/netgraph.ko Reading symbols from /boot/kernel/ng_mppc.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_mppc.ko.debug...done. done. Loaded symbols for /boot/kernel/ng_mppc.ko Reading symbols from /boot/kernel/rc4.ko...Reading symbols from /usr/lib/debug//boot/kernel/rc4.ko.debug...done. done. Loaded symbols for /boot/kernel/rc4.ko #0 doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:298 298 dumptid = curthread->td_tid; (kgdb) backtrace #0 doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:298 #1 0xffffffff80fa1908 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:366 #2 0xffffffff80fa230f in vpanic (fmt=0xffffffff81ca7401 "attempt to free non DEAD SA %p", ap=0xfffffe003c1f9d40) at /usr/src/sys/kern/kern_shutdown.c:759 #3 0xffffffff80fa1e44 in kassert_panic (fmt=0xffffffff81ca7401 "attempt to free non DEAD SA %p") at /usr/src/sys/kern/kern_shutdown.c:649 #4 0xffffffff81362a0e in key_delsav (sav=0xfffff8000398db00) at /usr/src/sys/netipsec/key.c:2969 #5 0xffffffff81362981 in key_freesav (psav=0xfffffe003c1f9e40) at /usr/src/sys/netipsec/key.c:1237 #6 0xffffffff8137f934 in udp_ipsec_input (m=0xfffff80003c04600, off=9, af=2) at /usr/src/sys/netipsec/udpencap.c:210 #7 0xffffffff812f5a9b in udp_append (inp=0xfffff800039bd3a0, ip=0xfffff80003bf401a, n=0xfffff80003c04600, off=28, udp_in=0xfffffe003c1fa268) at /usr/src/sys/netinet/udp_usrreq.c:333 #8 0xffffffff812f5739 in udp_input (mp=0xfffffe003c1fa3c8, offp=0xfffffe003c1fa3a0, proto=17) at /usr/src/sys/netinet/udp_usrreq.c:705 #9 0xffffffff811ec014 in ip_input (m=0x0) at /usr/src/sys/netinet/ip_input.c:823 #10 0xffffffff81142587 in netisr_dispatch_src (proto=1, source=0, m=0xfffff80003c04600) at /usr/src/sys/net/netisr.c:1120 #11 0xffffffff81142b87 in netisr_dispatch (proto=1, m=0xfffff80003c04600) at /usr/src/sys/net/netisr.c:1211 #12 0xffffffff811196a0 in ether_demux (ifp=0xfffff800034f3800, m=0xfffff80003c04600) at /usr/src/sys/net/if_ethersubr.c:850 #13 0xffffffff8111b369 in ether_input_internal (ifp=0xfffff800034f3800, m=0xfffff80003c04600) at /usr/src/sys/net/if_ethersubr.c:639 #14 0xffffffff8111ab6f in ether_nh_input (m=0xfffff80003c04600) at /usr/src/sys/net/if_ethersubr.c:669 #15 0xffffffff81142587 in netisr_dispatch_src (proto=5, source=0, m=0xfffff80003c04600) at /usr/src/sys/net/netisr.c:1120 #16 0xffffffff81142b87 in netisr_dispatch (proto=5, m=0xfffff80003c04600) at /usr/src/sys/net/netisr.c:1211 #17 0xffffffff81119c8f in ether_input (ifp=0xfffff800034f3800, m=0xfffff80003c04600) at /usr/src/sys/net/if_ethersubr.c:759 #18 0xffffffff80cfad69 in vtnet_rxq_input (rxq=0xfffff8000331fd00, m=0xfffff80003c04600, hdr=0xfffffe003c1fa890) at /usr/src/sys/dev/virtio/network/if_vtnet.c:1745 #19 0xffffffff80cfa7ba in vtnet_rxq_eof (rxq=0xfffff8000331fd00) at /usr/src/sys/dev/virtio/network/if_vtnet.c:1831 #20 0xffffffff80cfdb18 in vtnet_rx_vq_intr (xrxq=0xfffff8000331fd00) at /usr/src/sys/dev/virtio/network/if_vtnet.c:1876 #21 0xffffffff80cf2dd9 in virtqueue_intr (vq=0xfffffe0000dde000) at /usr/src/sys/dev/virtio/virtqueue.c:439 #22 0xffffffff80cf72cd in vtpci_vq_intr (xvq=0xfffffe0000dde000) at /usr/src/sys/dev/virtio/pci/virtio_pci.c:1316 #23 0xffffffff80f440a8 in intr_event_execute_handlers (p=0xfffff800032e2588, ie=0xfffff8000331f700) at /usr/src/sys/kern/kern_intr.c:1262 #24 0xffffffff80f44637 in ithread_execute_handlers (p=0xfffff800032e2588, ie=0xfffff8000331f700) at /usr/src/sys/kern/kern_intr.c:1275 #25 0xffffffff80f444bc in ithread_loop (arg=0xfffff800035690e0) at /usr/src/sys/kern/kern_intr.c:1356 #26 0xffffffff80f3fcbb in fork_exit (callout=0xffffffff80f443c0 <ithread_loop>, arg=0xfffff800035690e0, frame=0xfffffe003c1faac0) at /usr/src/sys/kern/kern_fork.c:1042 #27 0xffffffff816129ae in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:611 ---Type <return> to continue, or q <return> to quit--- #28 0x0000000000000000 in ?? () (kgdb) frame 4 #4 0xffffffff81362a0e in key_delsav (sav=0xfffff8000398db00) at /usr/src/sys/netipsec/key.c:2969 2969 IPSEC_ASSERT(sav->state == SADB_SASTATE_DEAD, (kgdb) list 2964 */ 2965 static void 2966 key_delsav(struct secasvar *sav) 2967 { 2968 IPSEC_ASSERT(sav != NULL, ("null sav")); 2969 IPSEC_ASSERT(sav->state == SADB_SASTATE_DEAD, 2970 ("attempt to free non DEAD SA %p", sav)); 2971 IPSEC_ASSERT(sav->refcnt == 0, ("reference count %u > 0", 2972 sav->refcnt)); 2973 (kgdb) print *sav $1 = {spi = 2706542593, flags = 0, seq = 2437428875, pid = 831, ivlen = 16, sah = 0xfffff800032d1800, key_auth = 0xfffff8000368b8b0, key_enc = 0xfffff8000368b8c0, replay = 0xfffff800032de5a0, natt = 0xfffff800037ffd80, lock = 0xfffff8000365ed60, tdb_xform = 0xffffffff821b8078, tdb_encalgxform = 0xffffffff821b9440, tdb_authalgxform = 0xffffffff821b97c0, tdb_compalgxform = 0x0, tdb_cryptoid = 432345564227567617, alg_auth = 5 '\005', alg_enc = 12 '\f', alg_comp = 0 '\0', state = 1 '\001', lft_c = 0xfffff800322013c0, lft_h = 0xfffff8000365ea20, lft_s = 0xfffff800034ef1a0, created = 1504104802, firstused = 0, chain = {tqe_next = 0x0, tqe_prev = 0xfffff800032d1888}, spihash = {le_next = 0x0, le_prev = 0xfffff80003693c78}, drainq = {le_next = 0x0, le_prev = 0x0}, cntr = 0, refcnt = 0} (kgdb) frame 6 #6 0xffffffff8137f934 in udp_ipsec_input (m=0xfffff80003c04600, off=9, af=2) at /usr/src/sys/netipsec/udpencap.c:210 210 key_freesav(&sav); (kgdb) list 205 * will do this anyway, so don't touch them here. 206 */ 207 ESPSTAT_INC(esps_input); 208 error = (*sav->tdb_xform->xf_input)(m, sav, hlen, off); 209 if (error != 0) 210 key_freesav(&sav); 211 212 return (EINPROGRESS); /* Consumed by IPsec. */ 213 } 214 The associated racoon debug log (I've just edited peer IP addresses): https://paste.ubuntu.com/25432438/
(In reply to Aragon Gouveia from comment #11) > Thanks Konstantin! Invariant kernel displayed an additional error at panic. > Here's some kgdb info: > > Unread portion of the kernel message buffer: > > esp_input: payload of 76 octets not a multiple of 16 octets, SA > 197.189.200.16/019052a1 > panic: attempt to free non DEAD SA 0xfffff8000398db00 > cpuid = 1 > KDB: stack backtrace: > #0 0xffffffff81007cf7 at kdb_backtrace+0xa7 > #1 0xffffffff80fa22b9 at vpanic+0x249 > #2 0xffffffff80fa1e44 at kassert_panic+0xc4 > #3 0xffffffff81362a0e at key_delsav+0x7e > #4 0xffffffff81362981 at key_freesav+0xb1 > #5 0xffffffff8137f934 at udp_ipsec_input+0x2d4 > #6 0xffffffff812f5a9b at udp_append+0x1ab > #7 0xffffffff812f5739 at udp_input+0x12a9 > #8 0xffffffff811ec014 at ip_input+0xd44 > #9 0xffffffff81142587 at netisr_dispatch_src+0x167 > #10 0xffffffff81142b87 at netisr_dispatch+0x27 > #11 0xffffffff811196a0 at ether_demux+0x2d0 > #12 0xffffffff8111b369 at ether_input_internal+0x7e9 > #13 0xffffffff8111ab6f at ether_nh_input+0xbf > #14 0xffffffff81142587 at netisr_dispatch_src+0x167 > #15 0xffffffff81142b87 at netisr_dispatch+0x27 > #16 0xffffffff81119c8f at ether_input+0x9f > #17 0xffffffff80cfad69 at vtnet_rxq_input+0x169 > (kgdb) list > 205 * will do this anyway, so don't touch them here. > 206 */ > 207 ESPSTAT_INC(esps_input); > 208 error = (*sav->tdb_xform->xf_input)(m, sav, hlen, off); > 209 if (error != 0) > 210 key_freesav(&sav); > 211 > 212 return (EINPROGRESS); /* Consumed by IPsec. */ > 213 } > 214 Thanks. The NAT-T part of code was missed in the r318734.
Created attachment 185935 [details] Proposed patch
Can you confirm that this patch fixed your problem?
A commit references this bug: Author: ae Date: Fri Sep 1 11:51:08 UTC 2017 New revision: 323086 URL: https://svnweb.freebsd.org/changeset/base/323086 Log: Fix possible double releasing for SA reference. This is missing part of r318734. When crypto subsystem returns error the xform code handles an error independently. PR: 221849 MFC after: 5 days Changes: head/sys/netipsec/udpencap.c
(In reply to Andrey V. Elsukov from comment #14) Sorry for the wait. I just tested it now, and it works! Android's broken ESP packets just get dropped now, as expected. Thank you, Andrey, everyone.
A commit references this bug: Author: ae Date: Wed Sep 6 10:21:28 UTC 2017 New revision: 323216 URL: https://svnweb.freebsd.org/changeset/base/323216 Log: MFC r323086: Fix possible double releasing for SA reference. This is missing part of r318734. When crypto subsystem returns error the xform code handles an error independently. PR: 221849 Changes: _U stable/11/ stable/11/sys/netipsec/udpencap.c
batch change: For bugs that match the following - Status Is In progress AND - Untouched since 2018-01-01. AND - Affects Base System OR Documentation DO: Reset to open status. Note: I did a quick pass but if you are getting this email it might be worthwhile to double check to see if this bug ought to be closed.
Fixed 7 months ago.