Bug 205592 - TCP processing in IPSec causes kernel panic
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 10.2-RELEASE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: freebsd-net (Nobody)
URL:
Keywords: crash, needs-qa
Depends on:
Blocks:
 
Reported: 2015-12-25 04:18 UTC by andrew
Modified: 2018-11-04 22:24 UTC

See Also:
koobs: mfc-stable10?
koobs: mfc-stable9?


Description andrew 2015-12-25 04:18:08 UTC
I'm trying to set up an IPSec tunnel between a D-Link hardware router and FreeBSD 10.2-p6 running as a Xen PVHVM guest.
The tunnel itself establishes correctly, and traffic passes back and forth as long as it is not TCP traffic: DNS over the tunnel works perfectly, but an attempt to use, for example, ssh causes a kernel panic.

Post-mortem dump backtrace shows:

===
(kgdb) bt
#0  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:263
#1  0xffffffff809b14aa in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:451
#2  0xffffffff809b1e8c in vpanic (fmt=0xffffffff8116770a "m_copydata, offset > size of mbuf chain", ap=0xfffffe0230631070) at /usr/src/sys/kern/kern_shutdown.c:758
#3  0xffffffff809b1969 in kassert_panic (fmt=0xffffffff8116770a "m_copydata, offset > size of mbuf chain") at /usr/src/sys/kern/kern_shutdown.c:646
#4  0xffffffff80a70e1c in m_copydata (m=0x0, off=21, len=3, cp=0xfffffe02306311f5 "") at /usr/src/sys/kern/uipc_mbuf.c:878
#5  0xffffffff80ceb488 in esp_input_cb (crp=0x0) at /usr/src/sys/netipsec/xform_esp.c:600
#6  0xffffffff80d0659b in crypto_done (crp=0xfffff8017a4b6bb0) at /usr/src/sys/opencrypto/crypto.c:1156
#7  0xffffffff80d094a8 in swcr_process (dev=0xfffff80005313b00, crp=0xfffff8017a4b6bb0, hint=0) at /usr/src/sys/opencrypto/cryptosoft.c:1054
#8  0xffffffff80d082b5 in CRYPTODEV_PROCESS (dev=0xfffff80005313b00, op=0xfffff8017a4b6bb0, flags=0) at cryptodev_if.h:53
#9  0xffffffff80d05e89 in crypto_invoke (cap=0xfffff80003fbd200, crp=0xfffff8017a4b6bb0, hint=0) at /usr/src/sys/opencrypto/crypto.c:1045
#10 0xffffffff80d05b9c in crypto_dispatch (crp=0xfffff8017a4b6bb0) at /usr/src/sys/opencrypto/crypto.c:806
#11 0xffffffff80ce9a18 in esp_input (m=0xfffff80135558200, sav=0xfffff801ed67d500, skip=20, protoff=9) at /usr/src/sys/netipsec/xform_esp.c:447
#12 0xffffffff80cc6d04 in ipsec_common_input (m=0xfffff80135558200, skip=20, protoff=9, af=2, sproto=50) at /usr/src/sys/netipsec/ipsec_input.c:231
#13 0xffffffff80cc63b6 in ipsec4_common_input (m=0xfffff80135558200) at /usr/src/sys/netipsec/ipsec_input.c:251
#14 0xffffffff80cc6e06 in esp4_input (m=0xfffff80135558200, off=20) at /usr/src/sys/netipsec/ipsec_input.c:271
#15 0xffffffff80b63630 in ip_input (m=0xfffff80135558200) at /usr/src/sys/netinet/ip_input.c:734
#16 0xffffffff80b18e2f in netisr_dispatch_src (proto=1, source=0, m=0xfffff80135558200) at /usr/src/sys/net/netisr.c:976
#17 0xffffffff80b193ff in netisr_dispatch (proto=1, m=0xfffff80135558200) at /usr/src/sys/net/netisr.c:1067
#18 0xffffffff80b06fe7 in ether_demux (ifp=0xfffff80005ec1800, m=0xfffff80135558200) at /usr/src/sys/net/if_ethersubr.c:851
#19 0xffffffff80b09168 in ether_input_internal (ifp=0xfffff80005ec1800, m=0xfffff80135558200) at /usr/src/sys/net/if_ethersubr.c:646
#20 0xffffffff80b0859d in ether_nh_input (m=0xfffff80135558200) at /usr/src/sys/net/if_ethersubr.c:658
#21 0xffffffff80b18e2f in netisr_dispatch_src (proto=9, source=0, m=0xfffff80135558200) at /usr/src/sys/net/netisr.c:976
#22 0xffffffff80b193ff in netisr_dispatch (proto=9, m=0xfffff80135558200) at /usr/src/sys/net/netisr.c:1067
#23 0xffffffff80b07583 in ether_input (ifp=0xfffff80005ec1800, m=0xfffff80135558200) at /usr/src/sys/net/if_ethersubr.c:716
#24 0xffffffff807d5fb0 in xn_rxeof (np=0xfffffe00026c0000) at /usr/src/sys/dev/xen/netfront/netfront.c:1077
#25 0xffffffff807d5b46 in xn_intr (xsc=0xfffffe00026c0000) at /usr/src/sys/dev/xen/netfront/netfront.c:1219
#26 0xffffffff80959c83 in intr_event_execute_handlers (p=0xfffff800052019d0, ie=0xfffff80005ed4e00) at /usr/src/sys/kern/kern_intr.c:1264
#27 0xffffffff8095ae7e in ithread_execute_handlers (p=0xfffff800052019d0, ie=0xfffff80005ed4e00) at /usr/src/sys/kern/kern_intr.c:1277
#28 0xffffffff8095ace2 in ithread_loop (arg=0xfffff80005ecb700) at /usr/src/sys/kern/kern_intr.c:1361
#29 0xffffffff80955610 in fork_exit (callout=0xffffffff8095abe0 <ithread_loop>, arg=0xfffff80005ecb700, frame=0xfffffe0230631c00) at /usr/src/sys/kern/kern_fork.c:1018
#30 0xffffffff80f746ee in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:611
#31 0x0000000000000000 in ?? ()
===

    ... that it is a NULL pointer dereference in m_copydata() called with a NULL mbuf pointer from esp_input_cb().
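
For readers following the backtrace: the panic string in frames #2 and #3 ("m_copydata, offset > size of mbuf chain") reads like an assertion that fires while the chain is being walked to the requested offset. Below is a minimal userland sketch of that kind of walk, assuming a simplified buffer chain. The names fake_mbuf and fake_copydata and the return codes are hypothetical, chosen for illustration; this is not the kernel source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, minimal stand-in for a kernel mbuf: one buffer plus a link. */
struct fake_mbuf {
    struct fake_mbuf *next;  /* next buffer in the chain, or NULL */
    int len;                 /* bytes of data held in this buffer */
    const char *data;        /* start of the data */
};

/*
 * Model of copying `len` bytes starting `off` bytes into a chain.
 * Returns 0 on success, -1 where an assertion of the form
 * "offset > size of mbuf chain" would fire, and -2 where the chain
 * ends before `len` bytes have been copied.
 */
int
fake_copydata(const struct fake_mbuf *m, int off, int len, char *cp)
{
    assert(off >= 0 && len >= 0);

    /* Skip whole buffers until reaching the one that contains the offset. */
    while (off > 0) {
        if (m == NULL)
            return -1;      /* walked off the end: chain shorter than off */
        if (off < m->len)
            break;
        off -= m->len;
        m = m->next;
    }

    /* Copy, crossing buffer boundaries as needed. */
    while (len > 0) {
        if (m == NULL)
            return -2;      /* chain ended before len bytes were copied */
        int count = m->len - off;
        if (count > len)
            count = len;
        memcpy(cp, m->data + off, (size_t)count);
        len -= count;
        cp += count;
        off = 0;
        m = m->next;
    }
    return 0;
}
```

In this model, a chain that is shorter than the requested offset ends the walk with the local pointer already NULL and the offset partly consumed. If the kernel routine behaves the same way, the m=0x0 and off=21 that kgdb prints for frame #4 may be the values after the walk rather than the arguments originally passed in.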
Comment 1 Andrey V. Elsukov freebsd_committer 2015-12-25 06:52:22 UTC
Looking at the call trace, this seems impossible. I think something modified the memory where the mbuf is stored before it reached m_copydata. We have seen similarly strange panics (not related to IPSec) with 9.x-STABLE and 40G cxgbe(4) cards. Those traces also showed NULL pointers, or in some cases mbufs with incorrect contents, in places where this looks impossible.
Comment 2 andrew 2015-12-26 19:07:02 UTC
Additional experiments have shown that the panic is not actually tied to a particular protocol, but to packet size.

According to my measurements, 'ping -s 146' still works, but 'ping -s 147' causes a kernel panic.
As these figures do not correspond to the size of any kernel structure (at least none known to me), I doubt that they clarify anything.

My kernel has been built with WITNESS and INVARIANTS, but there are no diagnostic messages at all.
Comment 3 Hiren Panchasara freebsd_committer 2016-12-22 17:43:05 UTC
Andrew,
Can you try to reproduce this on a more recent 11-based release?
Comment 4 andrew 2016-12-24 18:26:12 UTC
(In reply to Hiren Panchasara from comment #3)
Unfortunately, no: I don't see 11-R being ready for production yet.
Comment 5 Eugene Grosbein 2016-12-24 18:29:16 UTC
(In reply to andrew from comment #4)

And lack of testing does not make it more ready.
Comment 6 Eugene Grosbein freebsd_committer 2017-05-28 23:48:45 UTC
Please consider testing 11.1-PRERELEASE. Its network stack has been improved in many areas, including a new IPSEC implementation that has been quite stable for me.
Comment 7 Eugene Grosbein freebsd_committer 2018-11-04 22:24:48 UTC
Andrew,

End of Life for the 10.x release series is approaching. Do you have a chance to test 11.2-RELEASE?