Unread portion of the kernel message buffer:
Kernel page fault with the following non-sleepable locks held:
exclusive rw tcpinp (tcpinp) r = 0 (0xfffff80007b1fe18) locked @ /usr/local/share/deploy-tools/RELENG_11/src/sys/netinet6/in6_pcb.c:1172
shared rw tcp (tcp) r = 0 (0xffffffff82ad2bd8) locked @ /usr/local/share/deploy-tools/RELENG_11/src/sys/netinet/tcp_input.c:802
stack backtrace:
#0 0xffffffff80ab4d30 at witness_debugger+0x70
#1 0xffffffff80ab6017 at witness_warn+0x3d7
#2 0xffffffff80ec63d7 at trap_pfault+0x57
#3 0xffffffff80ec5a64 at trap+0x284
#4 0xffffffff80ea6161 at calltrap+0x8
#5 0xffffffff80c43c51 at tcp_twrespond+0x231
#6 0xffffffff80c436f5 at tcp_twstart+0x1f5
#7 0xffffffff80c34078 at tcp_do_segment+0x23c8
#8 0xffffffff80c310b4 at tcp_input+0xe44
#9 0xffffffff80c30221 at tcp6_input+0xf1
#10 0xffffffff80c82799 at ipsec6_common_input_cb+0x4c9
#11 0xffffffff80c97101 at esp_input_cb+0x671
#12 0xffffffff80ca9e69 at swcr_process+0xd69
#13 0xffffffff80ca6c2f at crypto_dispatch+0x7f
#14 0xffffffff80c9605a at esp_input+0x4fa
#15 0xffffffff80c8179b at ipsec_common_input+0x40b
#16 0xffffffff80c8222d at ipsec6_common_input+0xcd
#17 0xffffffff80c64070 at ip6_input+0xc70
Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address = 0x1a
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80c65afc
stack pointer = 0x28:0xfffffe0091f1e5f0
frame pointer = 0x28:0xfffffe0091f1e850
code segment = base rx0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (em0 que)
I have static keys and a policy (via ipsec.conf) that have been in use for several years. Updating from stable/10 to stable/11 crashes the machine as soon as there's traffic matching the IPSec policy. A core dump is available; just tell me how I can help. I'm not able to diagnose this any further :-(
-Harry
(In reply to Harald Schmalzbauer from comment #0)
Missed helpful info I guess:
list *0xffffffff80c65afc
0xffffffff80c65afc is in ip6_output (/usr/local/share/deploy-tools/RELENG_11/src/sys/netinet6/ip6_output.c:1060).
1055 done:
1056     /*
1057      * Release the route if using our private route, or if
1058      * (with flowtable) we don't have our own reference.
1059      */
1060     if (ro == &ip6route || ro->ro_flags & RT_NORTREF)
1061         RO_RTFREE(ro);
1062     return (error);
1063
1064 freehdrs:
#0 doadump (textdump=-1846419440) at pcpu.h:221
#1 0xffffffff80393346 in db_fncall (dummy1=<value optimized out>, dummy2=<value optimized out>, dummy3=<value optimized out>, dummy4=<value optimized out>) at /usr/local/share/deploy-tools/RELENG_11/src/sys/ddb/db_command.c:568
#2 0xffffffff80392de9 in db_command (cmd_table=<value optimized out>) at /usr/local/share/deploy-tools/RELENG_11/src/sys/ddb/db_command.c:440
#3 0xffffffff80392b44 in db_command_loop () at /usr/local/share/deploy-tools/RELENG_11/src/sys/ddb/db_command.c:493
#4 0xffffffff80395a7b in db_trap (type=<value optimized out>, code=<value optimized out>) at /usr/local/share/deploy-tools/RELENG_11/src/sys/ddb/db_main.c:251
#5 0xffffffff80a96133 in kdb_trap (type=<value optimized out>, code=<value optimized out>, tf=<value optimized out>) at /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/subr_kdb.c:654
#6 0xffffffff80ec6331 in trap_fatal (frame=0xfffffe0091f1e540, eva=26) at /usr/local/share/deploy-tools/RELENG_11/src/sys/amd64/amd64/trap.c:836
#7 0xffffffff80ec657d in trap_pfault (frame=0xfffffe0091f1e540, usermode=0) at /usr/local/share/deploy-tools/RELENG_11/src/sys/amd64/amd64/trap.c:691
#8 0xffffffff80ec5a64 in trap (frame=0xfffffe0091f1e540) at /usr/local/share/deploy-tools/RELENG_11/src/sys/amd64/amd64/trap.c:442
#9 0xffffffff80ea6161 in calltrap () at /usr/local/share/deploy-tools/RELENG_11/src/sys/amd64/amd64/exception.S:236
#10 0xffffffff80c65afc in ip6_output (m0=<value optimized out>, opt=<value optimized out>, ro=<value optimized out>, flags=<value optimized out>, im6o=0x0, ifpp=0x0, inp=<value optimized out>) at /usr/local/share/deploy-tools/RELENG_11/src/sys/netinet6/ip6_output.c:1060
#11 0xffffffff80c43c51 in tcp_twrespond () at /usr/local/share/deploy-tools/RELENG_11/src/sys/netinet/tcp_timewait.c:594
#12 0xffffffff80c436f5 in tcp_twstart (tp=<value optimized out>) at /usr/local/share/deploy-tools/RELENG_11/src/sys/netinet/tcp_timewait.c:336
#13 0xffffffff80c34078 in tcp_do_segment (m=0xfffff8000732b400, th=<value optimized out>, so=<value optimized out>, tp=0xfffff80007b22000, drop_hdrlen=72, tlen=<value optimized out>, iptos=<value optimized out>, ti_locked=Cannot access memory at address 0x1 ) at /usr/local/share/deploy-tools/RELENG_11/src/sys/netinet/tcp_input.c:3141
#14 0xffffffff80c310b4 in tcp_input (mp=<value optimized out>, offp=<value optimized out>, proto=<value optimized out>) at /usr/local/share/deploy-tools/RELENG_11/src/sys/netinet/tcp_input.c:1442
#15 0xffffffff80c30221 in tcp6_input (mp=0xfffffe0091f1ebf8, offp=0xfffffe0091f1ebf4, proto=203) at /usr/local/share/deploy-tools/RELENG_11/src/sys/netinet/tcp_input.c:578
#16 0xffffffff80c82799 in ipsec6_common_input_cb (m=<value optimized out>, sav=<value optimized out>, skip=40, protoff=6) at /usr/local/share/deploy-tools/RELENG_11/src/sys/netipsec/ipsec_input.c:827
#17 0xffffffff80c97101 in esp_input_cb (crp=<value optimized out>) at /usr/local/share/deploy-tools/RELENG_11/src/sys/netipsec/xform_esp.c:626
#18 0xffffffff80ca9e69 in swcr_process (dev=<value optimized out>, crp=<value optimized out>, hint=<value optimized out>) at /usr/local/share/deploy-tools/RELENG_11/src/sys/opencrypto/cryptosoft.c:1185
#19 0xffffffff80ca6c2f in crypto_dispatch (crp=0xfffff80028f93840) at /usr/local/share/deploy-tools/RELENG_11/src/sys/opencrypto/crypto.c:807
#20 0xffffffff80c9605a in esp_input (m=<value optimized out>, sav=0xfffff80003ebb300, skip=<value optimized out>, protoff=<value optimized out>) at /usr/local/share/deploy-tools/RELENG_11/src/sys/netipsec/xform_esp.c:459
#21 0xffffffff80c8179b in ipsec_common_input (m=0xfffff8000732b400, skip=40, protoff=6, af=28, sproto=50) at /usr/local/share/deploy-tools/RELENG_11/src/sys/netipsec/ipsec_input.c:236
#22 0xffffffff80c8222d in ipsec6_common_input (mp=<value optimized out>, offp=<value optimized out>, proto=<value optimized out>) at /usr/local/share/deploy-tools/RELENG_11/src/sys/netipsec/ipsec_input.c:581
#23 0xffffffff80c64070 in ip6_input (m=0x3b003b00000001) at /usr/local/share/deploy-tools/RELENG_11/src/sys/netinet6/ip6_input.c:921
#24 0xffffffff80b5a7e0 in netisr_dispatch_src (proto=6, source=0, m=0xfffff8000732b400) at /usr/local/share/deploy-tools/RELENG_11/src/sys/net/netisr.c:1121
#25 0xffffffff80b4540a in ether_demux (ifp=<value optimized out>, m=0xffffffff81428eff) at /usr/local/share/deploy-tools/RELENG_11/src/sys/net/if_ethersubr.c:850
#26 0xffffffff80b46200 in ether_nh_input (m=<value optimized out>) at /usr/local/share/deploy-tools/RELENG_11/src/sys/net/if_ethersubr.c:639
#27 0xffffffff80b5a7e0 in netisr_dispatch_src (proto=5, source=0, m=0xfffff8000732b400) at /usr/local/share/deploy-tools/RELENG_11/src/sys/net/netisr.c:1121
#28 0xffffffff80b45772 in ether_input (ifp=<value optimized out>, m=0x0) at /usr/local/share/deploy-tools/RELENG_11/src/sys/net/if_ethersubr.c:759
#29 0xffffffff80b421fa in if_input (ifp=0xfffffe0091f1e5c8, sendmp=0xffffffff81428eff) at /usr/local/share/deploy-tools/RELENG_11/src/sys/net/if.c:3956
#30 0xffffffff80524acc in em_rxeof (count=98) at /usr/local/share/deploy-tools/RELENG_11/src/sys/dev/e1000/if_em.c:4873
#31 0xffffffff80526110 in em_handle_que (context=0xfffffe0000eb6000, pending=<value optimized out>) at /usr/local/share/deploy-tools/RELENG_11/src/sys/dev/e1000/if_em.c:1599
#32 0xffffffff80aa7a6c in taskqueue_run_locked (queue=<value optimized out>) at /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/subr_taskqueue.c:465
#33 0xffffffff80aa85b8 in taskqueue_thread_loop (arg=<value optimized out>) at /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/subr_taskqueue.c:719
#34 0xffffffff80a18904 in fork_exit (callout=0xffffffff80aa8530 <taskqueue_thread_loop>, arg=0xfffffe0000eb8730, frame=0xfffffe0091f1fac0) at /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_fork.c:103
#35 0xffffffff80ea669e in fork_trampoline () at /usr/local/share/deploy-tools/RELENG_11/src/sys/amd64/amd64/exception.S:611
#36 0x0000000000000000 in ?? ()
Can you show your ipsec.conf (with masked keys/addresses)?
(In reply to Andrey V. Elsukov from comment #2)
Thanks for your attention!
ipsec.conf of the affected machine:
############
# policies #
############
#----------------------------------------------------#
# Encrypt any IPv6 LDAP traffic to/from own networks #
#----------------------------------------------------#
# No local IP, -> site1
spdadd -6 ::/0 2001:db8:abcd::/48[389] any -P out ipsec esp/transport//require;
spdadd -6 2001:db8:abcd::/48[389] ::/0 any -P in ipsec esp/transport//require;
# No local IP, -> site2
spdadd -6 ::/0 2001:db8:ef00::/48[389] any -P out ipsec esp/transport//require;
spdadd -6 2001:db8:ef00::/48[389] ::/0 any -P in ipsec esp/transport//require;
#-----------------------------------------------#
# keys                                          #
#-----------------------------------------------#
# key for host<->client
add -6 1stf.q.d.n 2ndf.q.d.n esp 54320 -E rijndael-cbc 0x00000000000000000000000000000000
add -6 2ndf.q.d.n 1st.f.q.d.n esp 54321 -E rijndael-cbc 0x000000000000000000000000000000000000
netstat -f inet6 -nr:
Destination                    Gateway             Flags  Netif  Expire
::/96                          ::1                 UGRS   lo0
default                        2001:db8:abcd:2::1  UGS    myif
::1                            link#2              UH     lo0
::ffff:0.0.0.0/96              ::1                 UGRS   lo0
2001:db8:abcd:2::/64           link#1              U      myif
2001:db8:abcd:2::3:1           link#1              UHS    lo0
fe80::/10                      ::1                 UGRS   lo0
fe80::%myif/64                 link#1              U      myif
fe80::20c:29ff:feac:e09a%myif  link#1              UHS    lo0
fe80::%lo0/64                  link#2              U      lo0
fe80::1%lo0                    link#2              UHS    lo0
ff02::/16                      ::1                 UGRS   lo0
Additional notes:
1st.f.q.d.n has the AAAA record 2001:db8:abcd::efgh:10, so it is reached via the default gateway.
The interface "myif" is the renamed (and masked) em0|vmx0. The panic is the same with both interfaces.
The MTU settings (normally 9000 on the interface and 1500 on the default route) don't influence the panic either.
No pf|ipfw involved.
As soon as I fire up 'ldapsearch', I get the result followed by an immediate crash.
Since I'd like to help test that this will work in 11-RELEASE, I'll keep 11 installed on this host, but that means a moderately severe outage, since no LDAP users can log in anymore. Hope the fix isn't too hard to find! -Harry
Created attachment 173191: Proposed patch
Can you test this patch? It looks like it should help. As I understand from your backtrace, your host is about to send a TCP ACK via ip6_output(), and the packet is handled by ip6_ipsec_output() because a corresponding IPSec policy is present. But since the *ro* pointer is NULL and hasn't been initialized yet, a NULL pointer dereference occurs in the check you listed. Also, ro_flags is at offset 0x1a in struct route_in6 (the fault address is 0x1a).
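The reasoning above can be illustrated with a small userland sketch. It is not the attached patch or the kernel source: struct mock_route, MOCK_RT_NORTREF, should_release_route() and mock_flags_offset() are hypothetical names, and the field layout is only an assumption chosen so that the flags field lands at offset 0x1a on a 64-bit (LP64) platform. It shows why reading ro_flags through a NULL pointer touches address 0x1a, and how a NULL guard in front of the check avoids the dereference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value, for illustration only. */
#define MOCK_RT_NORTREF 0x2

/*
 * Mock route structure. The layout is an assumption: three pointers
 * (24 bytes on LP64) followed by two 16-bit fields, which puts
 * ro_flags at offset 0x18 + 2 = 0x1a on a 64-bit platform.
 */
struct mock_route {
	void     *ro_rt;
	void     *ro_lle;
	char     *ro_prepend;
	uint16_t  ro_plen;
	uint16_t  ro_flags;
};

/*
 * Offset of ro_flags inside the mock structure. Dereferencing
 * ro->ro_flags with ro == NULL reads from exactly this address.
 */
size_t
mock_flags_offset(void)
{
	return (offsetof(struct mock_route, ro_flags));
}

/*
 * Decide whether the route should be released. Unlike the unguarded
 * check "ro == &ip6route || ro->ro_flags & RT_NORTREF", this version
 * tests for NULL before touching ro_flags.
 */
int
should_release_route(const struct mock_route *ro,
    const struct mock_route *private_route)
{
	/* The private route is a local object, so it is never NULL. */
	assert(private_route != NULL);

	if (ro == NULL)
		return (0);
	return (ro == private_route ||
	    (ro->ro_flags & MOCK_RT_NORTREF) != 0);
}
```

With this guard, a caller that passes a NULL route simply skips the release step instead of faulting; whether the committed change uses this exact form of the condition is not stated in this thread.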
A commit references this bug:
Author: ae
Date: Tue Aug 2 12:18:06 UTC 2016
New revision: 303657
URL: https://svnweb.freebsd.org/changeset/base/303657
Log:
Fix NULL pointer dereference. ro pointer can be NULL when IPSec consumes mbuf.
PR: 211486
MFC after: 3 days
Changes:
head/sys/netinet6/ip6_output.c
(In reply to Andrey V. Elsukov from comment #4) Thank you very much for your quick solution, which seems to solve the problem. No more immediate crashes so far, though I have only tested very briefly. Looking forward to seeing the MFC happen. Thanks, -Harry
A commit references this bug:
Author: ae
Date: Fri Aug 5 15:12:29 UTC 2016
New revision: 303768
URL: https://svnweb.freebsd.org/changeset/base/303768
Log:
MFC r303657: Fix NULL pointer dereference. ro pointer can be NULL when IPSec consumes mbuf.
PR: 211486
Approved by: re (gjb)
Changes:
_U stable/11/
stable/11/sys/netinet6/ip6_output.c
Fixed in head/ and stable/11. Thanks!