Created attachment 225814 [details] savecore With the following in /etc/rc.conf: ipfw_netflow_enable="YES" ipfw_netflow_rule=1111 ipfw_netflow_ip="192.168.1.2" ipfw_netflow_port=9995 ipfw_netflow_version=9 I can reliably get a kernel panic as soon as there is network traffic between a host on the LAN and a jail whose IPv4 address is attached to a VLAN interface which also has a bridge by vm-bhyve. 192.168.1.2 is another jail on the local machine. This is a kernel compiled with INVARIANTS but it also panics without, and did so on 13.0-RELEASE. #9 memmove_erms () at /usr/src/sys/amd64/amd64/support.S:547 #10 0xffffffff80c80f18 in m_dup (m=0xfffff8021e841200, m@entry=0xfffff801345d9d00, how=how@entry=1) at /usr/src/sys/kern/uipc_mbuf.c:722 #11 0xffffffff834ab3c1 in ng_ipfw_input (m0=<optimized out>, fwa=0xfffffe0144090300, tee=false) at /usr/src/sys/netgraph/ng_ipfw.c:324 #12 0xffffffff82936df0 in ipfw_check_packet (m0=0xfffffe01440904e8, ifp=0xfffff80004e5b800, flags=131072, ruleset=<optimized out>, inp=0xfffff8004cff35b8) at /usr/src/sys/netpfil/ipfw/ip_fw_pfil.c:297 #13 0xffffffff80d2a8c7 in pfil_run_hooks (head=<optimized out>, p=..., ifp=ifp@entry=0xfffff80004e5b800, flags=flags@entry=131072, inp=inp@entry=0xfffff8004cff35b8) at /usr/src/sys/net/pfil.c:187 Possibly related to Bug #256439.
Does it happen if you set sysctl kern.ipc.mb_use_ext_pgs=0?
(In reply to Mark Johnston from comment #1) Yes, that helped! No immediate crash when starting and accessing the problematic jail after this sysctl change.
John, Drew: should we simply modify m_dup() to handle unmapped mbufs and return a mapped chain?
I'm curious how this is using unmapped mbufs? Does ngtee use sendfile(2) under the hood? While we could patch m_dup(), I don't know we want to enforce the policy that the dup is always unmapped? That said, I think fixing m_dup is probably a single line change to replace the 'bcopy' with 'm_copydata' as is done in m_defrag(): diff --git a/sys/kern/uipc_mbuf.c b/sys/kern/uipc_mbuf.c index b9e716b411be..1a2098c7c536 100644 --- a/sys/kern/uipc_mbuf.c +++ b/sys/kern/uipc_mbuf.c @@ -719,7 +719,7 @@ m_dup(const struct mbuf *m, int how) while (n->m_len < nsize && m != NULL) { int chunk = min(nsize - n->m_len, m->m_len - moff); - bcopy(m->m_data + moff, n->m_data + n->m_len, chunk); + m_copydata(m, moff, chunk, n->m_data + n->m_len); moff += chunk; n->m_len += chunk; remain -= chunk;
(In reply to John Baldwin from comment #4) The full stack looks like this: #8 <signal handler called> #9 memmove_erms () at /usr/src/sys/amd64/amd64/support.S:547 #10 0xffffffff80c80f18 in m_dup (m=0xfffff8021e841200, m@entry=0xfffff801345d9d00, how=how@entry=1) at /usr/src/sys/kern/uipc_mbuf.c:722 #11 0xffffffff834ab3c1 in ng_ipfw_input (m0=<optimized out>, fwa=0xfffffe0144090300, tee=false) at /usr/src/sys/netgraph/ng_ipfw.c:324 #12 0xffffffff82936df0 in ipfw_check_packet (m0=0xfffffe01440904e8, ifp=0xfffff80004e5b800, flags=131072, ruleset=<optimized out>, inp=0xfffff8004cff35b8) at /usr/src/sys/netpfil/ipfw/ip_fw_pfil.c:297 #13 0xffffffff80d2a8c7 in pfil_run_hooks (head=<optimized out>, p=..., ifp=ifp@entry=0xfffff80004e5b800, flags=flags@entry=131072, inp=inp@entry=0xfffff8004cff35b8) at /usr/src/sys/net/pfil.c:187 #14 0xffffffff80d9f287 in ip_output_pfil (mp=0xfffffe01440904e8, ifp=0xfffff80004e5b800, flags=0, inp=0xfffff8004cff35b8, dst=0xfffff8004cff3760, fibnum=<optimized out>, error=<optimized out>) at /usr/src/sys/netinet/ip_output.c:130 #15 ip_output (m=m@entry=0xfffff801345d9d00, opt=<optimized out>, ro=<optimized out>, flags=<optimized out>, imo=imo@entry=0x0, inp=<optimized out>) at /usr/src/sys/netinet/ip_output.c:705 #16 0xffffffff80db8cab in tcp_output (tp=<optimized out>) at /usr/src/sys/netinet/tcp_output.c:1544 #17 0xffffffff80dccfff in tcp_usr_ready (so=<optimized out>, m=0xfffff802e78b3d00, count=<optimized out>) at /usr/src/sys/netinet/tcp_usrreq.c:1303 #18 0xffffffff80bef395 in sendfile_iodone (arg=arg@entry=0xfffff8035b9cdb00, pa=<optimized out>, pa@entry=0x0, count=<optimized out>, count@entry=0, error=<optimized out>) at /usr/src/sys/kern/kern_sendfile.c:399 #19 0xffffffff80beebc9 in vn_sendfile (fp=<optimized out>, sockfd=34, hdr_uio=0x0, trl_uio=0x0, offset=<optimized out>, nbytes=<optimized out>, sent=0xfffffe0144090a88, flags=0, td=0xfffffe0145810ac0) at /usr/src/sys/kern/kern_sendfile.c:1194 #20 0xffffffff80bef7c7 in fo_sendfile (fp=0xfffff8002b911034, sockfd=0, hdr_uio=0x7cc, trl_uio=0x7cc, offset=-8795362095052, nbytes=722828843999576199, sent=0xfffffe0144090a88, flags=730927156, td=0xfffffe0145810ac0) at /usr/src/sys/sys/file.h:411 #21 sendfile (td=0xfffffe0145810ac0, uap=0xfffffe0145810ea8, compat=<optimized out>) at /usr/src/sys/kern/kern_sendfile.c:1324 #22 0xffffffff81083ede in syscallenter (td=<optimized out>) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:189 > While we could patch m_dup(), I don't know we want to enforce the policy that the dup is always unmapped? Did you mean mapped? In any case, yeah, I was wondering if it's really a good idea to return a mapped chain. But if a consumer is m_dup()ing something then it probably intends to write to the dup anyway. m_copypacket() should be used otherwise.
Ok, I think I'm happy with m_dup always generating a mapped chain then. Presumably even if you did unmapped you'd still need to be allocating pages, etc. m_dup() already handles jumbo frames suboptimally (perhaps) by degrading them to a chain of 2k clusters, so duplicating an unmapped mbuf into a chain of mapped 2k clusters is consistent with that existing behavior.
Niels, are you able to test the patch from comment 4? (With kern.ipc.mb_use_ext_pgs set back to 1 of course.)
(In reply to Mark Johnston from comment #7) No crash with this patch applied while doing the same thing that used to trigger a panic.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=be79f30d6c3e353856d4f82227b270abc26be702 commit be79f30d6c3e353856d4f82227b270abc26be702 Author: John Baldwin <jhb@FreeBSD.org> AuthorDate: 2021-07-26 21:03:28 +0000 Commit: John Baldwin <jhb@FreeBSD.org> CommitDate: 2021-07-26 21:09:16 +0000 m_dup: Handle unmapped mbufs as an input mbuf. Use m_copydata() instead of a direct bcopy() when copying data out of a source mbuf into a newly-allocated mbuf. PR: 256610 Reported by: Niels Bakker <niels=freebsd@bakker.net> Reviewed by: markj MFC after: 2 weeks sys/kern/uipc_mbuf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=3e4aeeb95df217fdd9ec21e5a86ec61d1b2abc59 commit 3e4aeeb95df217fdd9ec21e5a86ec61d1b2abc59 Author: John Baldwin <jhb@FreeBSD.org> AuthorDate: 2021-07-26 21:03:28 +0000 Commit: John Baldwin <jhb@FreeBSD.org> CommitDate: 2021-08-11 19:18:46 +0000 m_dup: Handle unmapped mbufs as an input mbuf. Use m_copydata() instead of a direct bcopy() when copying data out of a source mbuf into a newly-allocated mbuf. PR: 256610 Reported by: Niels Bakker <niels=freebsd@bakker.net> Reviewed by: markj MFC after: 2 weeks (cherry picked from commit be79f30d6c3e353856d4f82227b270abc26be702) sys/kern/uipc_mbuf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
This seems to have been omitted from 13.0-p4.
Yes, we don't merge many things back as ENs. Given that you probably aren't using kernel TLS in your setup, the workaround of setting kern.ipc.mb_use_ext_pgs to 0 should be sufficient for existing boxes running 13.0.