Bug 256610 - Kernel panic with ngtee
Summary: Kernel panic with ngtee
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 13.0-STABLE
Hardware: amd64 Any
: --- Affects Only Me
Assignee: John Baldwin
URL:
Keywords: crash
Depends on:
Blocks:
 
Reported: 2021-06-14 23:33 UTC by Niels Bakker
Modified: 2022-10-12 00:49 UTC (History)
5 users (show)

See Also:
jhb: mfc-stable13+


Attachments
savecore (71.19 KB, text/plain)
2021-06-14 23:33 UTC, Niels Bakker
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Niels Bakker 2021-06-14 23:33:55 UTC
Created attachment 225814 [details]
savecore

With the following in /etc/rc.conf:

ipfw_netflow_enable="YES"
ipfw_netflow_rule=1111
ipfw_netflow_ip="192.168.1.2"
ipfw_netflow_port=9995
ipfw_netflow_version=9

I can reliably get a kernel panic as soon as there is network traffic between a host on the LAN and a jail whose IPv4 address is attached to a VLAN interface which also has a bridge by vm-bhyve. 192.168.1.2 is another jail on the local machine. This is a kernel compiled with INVARIANTS but it also panics without, and did so on 13.0-RELEASE.

#9  memmove_erms () at /usr/src/sys/amd64/amd64/support.S:547
#10 0xffffffff80c80f18 in m_dup (m=0xfffff8021e841200, 
    m@entry=0xfffff801345d9d00, how=how@entry=1)
    at /usr/src/sys/kern/uipc_mbuf.c:722
#11 0xffffffff834ab3c1 in ng_ipfw_input (m0=<optimized out>, 
    fwa=0xfffffe0144090300, tee=false) at /usr/src/sys/netgraph/ng_ipfw.c:324
#12 0xffffffff82936df0 in ipfw_check_packet (m0=0xfffffe01440904e8, 
    ifp=0xfffff80004e5b800, flags=131072, ruleset=<optimized out>, 
    inp=0xfffff8004cff35b8) at /usr/src/sys/netpfil/ipfw/ip_fw_pfil.c:297
#13 0xffffffff80d2a8c7 in pfil_run_hooks (head=<optimized out>, p=..., 
    ifp=ifp@entry=0xfffff80004e5b800, flags=flags@entry=131072, 
    inp=inp@entry=0xfffff8004cff35b8) at /usr/src/sys/net/pfil.c:187

Possibly related to Bug #256439.
Comment 1 Mark Johnston freebsd_committer freebsd_triage 2021-06-15 01:34:08 UTC
Does it happen if you set sysctl kern.ipc.mb_use_ext_pgs=0?
Comment 2 Niels Bakker 2021-06-15 10:08:43 UTC
(In reply to Mark Johnston from comment #1)
Yes, that helped! No immediate crash when starting and accessing the problematic jail after this sysctl change.
Comment 3 Mark Johnston freebsd_committer freebsd_triage 2021-06-21 15:11:13 UTC
John, Drew: should we simply modify m_dup() to handle unmapped mbufs and return a mapped chain?
Comment 4 John Baldwin freebsd_committer freebsd_triage 2021-06-21 16:01:40 UTC
I'm curious how this is using unmapped mbufs?  Does ngtee use sendfile(2) under the hood?  While we could patch m_dup(), I don't know we want to enforce the policy that the dup is always unmapped?  That said, I think fixing m_dup is probably a single line change to replace the 'bcopy' with 'm_copydata' as is done in m_defrag():

diff --git a/sys/kern/uipc_mbuf.c b/sys/kern/uipc_mbuf.c
index b9e716b411be..1a2098c7c536 100644
--- a/sys/kern/uipc_mbuf.c
+++ b/sys/kern/uipc_mbuf.c
@@ -719,7 +719,7 @@ m_dup(const struct mbuf *m, int how)
 		while (n->m_len < nsize && m != NULL) {
 			int chunk = min(nsize - n->m_len, m->m_len - moff);
 
-			bcopy(m->m_data + moff, n->m_data + n->m_len, chunk);
+			m_copydata(m, moff, chunk, n->m_data + n->m_len);
 			moff += chunk;
 			n->m_len += chunk;
 			remain -= chunk;
Comment 5 Mark Johnston freebsd_committer freebsd_triage 2021-06-21 16:11:07 UTC
(In reply to John Baldwin from comment #4)
The full stack looks like this:

#8  <signal handler called>
#9  memmove_erms () at /usr/src/sys/amd64/amd64/support.S:547
#10 0xffffffff80c80f18 in m_dup (m=0xfffff8021e841200, 
    m@entry=0xfffff801345d9d00, how=how@entry=1)
    at /usr/src/sys/kern/uipc_mbuf.c:722
#11 0xffffffff834ab3c1 in ng_ipfw_input (m0=<optimized out>, 
    fwa=0xfffffe0144090300, tee=false) at /usr/src/sys/netgraph/ng_ipfw.c:324
#12 0xffffffff82936df0 in ipfw_check_packet (m0=0xfffffe01440904e8, 
    ifp=0xfffff80004e5b800, flags=131072, ruleset=<optimized out>, 
    inp=0xfffff8004cff35b8) at /usr/src/sys/netpfil/ipfw/ip_fw_pfil.c:297
#13 0xffffffff80d2a8c7 in pfil_run_hooks (head=<optimized out>, p=..., 
    ifp=ifp@entry=0xfffff80004e5b800, flags=flags@entry=131072, 
    inp=inp@entry=0xfffff8004cff35b8) at /usr/src/sys/net/pfil.c:187
#14 0xffffffff80d9f287 in ip_output_pfil (mp=0xfffffe01440904e8, 
    ifp=0xfffff80004e5b800, flags=0, inp=0xfffff8004cff35b8, 
    dst=0xfffff8004cff3760, fibnum=<optimized out>, error=<optimized out>)
    at /usr/src/sys/netinet/ip_output.c:130
#15 ip_output (m=m@entry=0xfffff801345d9d00, opt=<optimized out>, 
    ro=<optimized out>, flags=<optimized out>, imo=imo@entry=0x0, 
    inp=<optimized out>) at /usr/src/sys/netinet/ip_output.c:705
#16 0xffffffff80db8cab in tcp_output (tp=<optimized out>)
    at /usr/src/sys/netinet/tcp_output.c:1544
#17 0xffffffff80dccfff in tcp_usr_ready (so=<optimized out>, 
    m=0xfffff802e78b3d00, count=<optimized out>)
    at /usr/src/sys/netinet/tcp_usrreq.c:1303
#18 0xffffffff80bef395 in sendfile_iodone (arg=arg@entry=0xfffff8035b9cdb00, 
    pa=<optimized out>, pa@entry=0x0, count=<optimized out>, count@entry=0, 
    error=<optimized out>) at /usr/src/sys/kern/kern_sendfile.c:399
#19 0xffffffff80beebc9 in vn_sendfile (fp=<optimized out>, sockfd=34, 
    hdr_uio=0x0, trl_uio=0x0, offset=<optimized out>, nbytes=<optimized out>, 
    sent=0xfffffe0144090a88, flags=0, td=0xfffffe0145810ac0)
    at /usr/src/sys/kern/kern_sendfile.c:1194
#20 0xffffffff80bef7c7 in fo_sendfile (fp=0xfffff8002b911034, sockfd=0, 
    hdr_uio=0x7cc, trl_uio=0x7cc, offset=-8795362095052, 
    nbytes=722828843999576199, sent=0xfffffe0144090a88, flags=730927156, 
    td=0xfffffe0145810ac0) at /usr/src/sys/sys/file.h:411
#21 sendfile (td=0xfffffe0145810ac0, uap=0xfffffe0145810ea8, 
    compat=<optimized out>) at /usr/src/sys/kern/kern_sendfile.c:1324
#22 0xffffffff81083ede in syscallenter (td=<optimized out>)
    at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:189

> While we could patch m_dup(), I don't know we want to enforce the policy that the dup is always unmapped?

Did you mean mapped?  In any case, yeah, I was wondering if it's really a good idea to return a mapped chain.  But if a consumer is m_dup()ing something then it probably intends to write to the dup anyway.  m_copypacket() should be used otherwise.
Comment 6 John Baldwin freebsd_committer freebsd_triage 2021-06-21 17:39:46 UTC
Ok, I think I'm happy with m_dup always generating a mapped chain then.  Presumably even if you did unmapped you'd still need to be allocating pages, etc.  m_dup() already handles jumbo frames suboptimally (perhaps) by degrading them to a chain of 2k clusters, so duplicating an unmapped mbuf into a chain of mapped 2k clusters is consistent with that existing behavior.
Comment 7 Mark Johnston freebsd_committer freebsd_triage 2021-06-21 18:39:12 UTC
Niels, are you able to test the patch from comment 4?  (With kern.ipc.mb_use_ext_pgs set back to 1 of course.)
Comment 8 Niels Bakker 2021-06-21 19:58:13 UTC
(In reply to Mark Johnston from comment #7)
No crash with this patch applied while doing the same thing that used to trigger a panic.
Comment 9 commit-hook freebsd_committer freebsd_triage 2021-07-26 21:11:50 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=be79f30d6c3e353856d4f82227b270abc26be702

commit be79f30d6c3e353856d4f82227b270abc26be702
Author:     John Baldwin <jhb@FreeBSD.org>
AuthorDate: 2021-07-26 21:03:28 +0000
Commit:     John Baldwin <jhb@FreeBSD.org>
CommitDate: 2021-07-26 21:09:16 +0000

    m_dup: Handle unmapped mbufs as an input mbuf.

    Use m_copydata() instead of a direct bcopy() when copying data out of
    a source mbuf into a newly-allocated mbuf.

    PR:           256610
    Reported by:  Niels Bakker <niels=freebsd@bakker.net>
    Reviewed by:  markj
    MFC after:    2 weeks

 sys/kern/uipc_mbuf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Comment 10 commit-hook freebsd_committer freebsd_triage 2021-08-11 21:14:16 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=3e4aeeb95df217fdd9ec21e5a86ec61d1b2abc59

commit 3e4aeeb95df217fdd9ec21e5a86ec61d1b2abc59
Author:     John Baldwin <jhb@FreeBSD.org>
AuthorDate: 2021-07-26 21:03:28 +0000
Commit:     John Baldwin <jhb@FreeBSD.org>
CommitDate: 2021-08-11 19:18:46 +0000

    m_dup: Handle unmapped mbufs as an input mbuf.

    Use m_copydata() instead of a direct bcopy() when copying data out of
    a source mbuf into a newly-allocated mbuf.

    PR:           256610
    Reported by:  Niels Bakker <niels=freebsd@bakker.net>
    Reviewed by:  markj
    MFC after:    2 weeks

    (cherry picked from commit be79f30d6c3e353856d4f82227b270abc26be702)

 sys/kern/uipc_mbuf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Comment 11 Niels Bakker 2021-08-25 00:04:40 UTC
This seems to have been omitted from 13.0-p4.
Comment 12 John Baldwin freebsd_committer freebsd_triage 2021-08-25 17:22:50 UTC
Yes, we don't merge many things back as ENs.  Given that you probably aren't using kernel TLS in your setup, the workaround of setting kern.ipc.mb_use_ext_pgs to 0 should be sufficient for existing boxes running 13.0.