Bug 253587 - nd6: m_nextpkt pointer not NULL'ed before sending
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.0-STABLE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-net (Nobody)
Reported: 2021-02-17 14:18 UTC by Kamigishi Rei
Modified: 2021-04-18 15:16 UTC
CC List: 5 users

Flags:
koobs: maintainer-feedback? (kbowling)
koobs: maintainer-feedback? (rrs)
koobs: maintainer-feedback? (melifaro)
melifaro: mfc-stable13+
melifaro: mfc-stable12+
melifaro: mfc-stable11+


Description Kamigishi Rei 2021-02-17 14:18:56 UTC
Seems to affect the ip6 flow. Happened twice so far over about 16 hours.

FreeBSD 13.0-BETA2 amd64 on a PCEngines apu4d4; both GENERIC and custom kernel configurations (with pf built in) are affected. The NICs are Intel i211-AT, default hardware offload settings.

Kernel panic message:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x18
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80c9aaf0
stack pointer           = 0x28:0xfffffe0007f8b3b0
frame pointer           = 0x28:0xfffffe0007f8b420
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (if_io_tqg_1)
trap number             = 12
panic: page fault
cpuid = 1
time = 1613563924
KDB: stack backtrace:
#0 0xffffffff80c56695 at kdb_backtrace+0x65
#1 0xffffffff80c09261 at vpanic+0x181
#2 0xffffffff80c090d3 at panic+0x43
#3 0xffffffff810891a7 at trap_fatal+0x387
#4 0xffffffff810891ff at trap_pfault+0x4f
#5 0xffffffff8108885d at trap+0x27d
#6 0xffffffff8105fc38 at calltrap+0x8
#7 0xffffffff82945494 at pf_pull_hdr+0x134
#8 0xffffffff8294f23b at pf_test6+0x36b
#9 0xffffffff8295fc80 at pf_check6_out+0x40
#10 0xffffffff80d40f17 at pfil_run_hooks+0x97
#11 0xffffffff80dfbff7 at ip6_forward+0x3c7
#12 0xffffffff80dfd915 at ip6_input+0xbb5
#13 0xffffffff80d3e26a at netisr_dispatch_src+0xca
#14 0xffffffff80d22a28 at ether_demux+0x148
#15 0xffffffff80d23dac at ether_nh_input+0x34c
#16 0xffffffff80d3e26a at netisr_dispatch_src+0xca
#17 0xffffffff80d22e79 at ether_input+0x69

kgdb:

Backtrace:

(kgdb) bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xffffffff807bb406 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:486
#3  0xffffffff807bb880 in vpanic (fmt=<optimized out>, ap=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:919
#4  0xffffffff807bb683 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:843
#5  0xffffffff80b7c1a7 in trap_fatal (frame=0xfffffe0007f4c2f0, eva=24) at /usr/src/sys/amd64/amd64/trap.c:915
#6  0xffffffff80b7c1ff in trap_pfault (frame=frame@entry=0xfffffe0007f4c2f0, usermode=false, signo=<optimized out>, signo@entry=0x0, ucode=<optimized out>, ucode@entry=0x0) at /usr/src/sys/amd64/amd64/trap.c:732
#7  0xffffffff80b7b85d in trap (frame=0xfffffe0007f4c2f0) at /usr/src/sys/amd64/amd64/trap.c:398
#8  <signal handler called>
#9  0xffffffff8084d0a0 in m_copydata (m=0x0, off=40, len=2, cp=cp@entry=0xfffffe0007f4c540 "") at /usr/src/sys/kern/uipc_mbuf.c:649
#10 0xffffffff809b3a24 in pf_pull_hdr (m=m@entry=0xfffff8005865ec00, off=off@entry=40, p=p@entry=0xfffffe0007f4c540, len=len@entry=2, actionp=actionp@entry=0x0, reasonp=reasonp@entry=0xfffffe0007f4c5b6, af=28 '\034') at /usr/src/sys/netpfil/pf/pf.c:5422
#11 0xffffffff809bd7cb in pf_test6 (dir=dir@entry=2, pflags=393216, ifp=<optimized out>, m0=<optimized out>, m0@entry=0xfffffe0007f4c6b8, inp=0x0) at /usr/src/sys/netpfil/pf/pf.c:6398
#12 0xffffffff809cbf60 in pf_check6_out (m=0xfffffe0007f4c6b8, ifp=0x28, flags=40, ruleset=<optimized out>, inp=0x0) at /usr/src/sys/netpfil/pf/pf_ioctl.c:4535
#13 0xffffffff808fe1b7 in pfil_run_hooks (head=<optimized out>, p=..., ifp=0xfffff800026d3800, flags=flags@entry=393216, inp=inp@entry=0x0) at /usr/src/sys/net/pfil.c:187
#14 0xffffffff80975177 in ip6_forward (m=0xfffff8005865ec00, srcrt=srcrt@entry=0) at /usr/src/sys/netinet6/ip6_forward.c:316
#15 0xffffffff80976a95 in ip6_input (m=0xfffff8005865ec00) at /usr/src/sys/netinet6/ip6_input.c:896
#16 0xffffffff808fb50a in netisr_dispatch_src (proto=6, source=<optimized out>, source@entry=0, m=0xfffffe0007f4c540) at /usr/src/sys/net/netisr.c:1143
#17 0xffffffff808fb7ff in netisr_dispatch (proto=1483074560, m=0x2) at /usr/src/sys/net/netisr.c:1234
#18 0xffffffff808dfcc8 in ether_demux (ifp=ifp@entry=0xfffff80002481800, m=0x28) at /usr/src/sys/net/if_ethersubr.c:923
#19 0xffffffff808e104c in ether_input_internal (ifp=0xfffff80002481800, m=0x28) at /usr/src/sys/net/if_ethersubr.c:709
#20 ether_nh_input (m=<optimized out>) at /usr/src/sys/net/if_ethersubr.c:739
#21 0xffffffff808fb50a in netisr_dispatch_src (proto=proto@entry=5, source=<optimized out>, source@entry=0, m=0xfffffe0007f4c540, m@entry=0xfffff8005865ec00) at /usr/src/sys/net/netisr.c:1143
#22 0xffffffff808fb7ff in netisr_dispatch (proto=1483074560, proto@entry=5, m=0x2, m@entry=0xfffff8005865ec00) at /usr/src/sys/net/netisr.c:1234
#23 0xffffffff808e0119 in ether_input (ifp=<optimized out>, m=0xfffff8005865ec00) at /usr/src/sys/net/if_ethersubr.c:830
#24 0xffffffff808f7c48 in iflib_rxeof (rxq=<optimized out>, rxq@entry=0xfffff80002481000, budget=<optimized out>) at /usr/src/sys/net/iflib.c:3008
#25 0xffffffff808f1fa2 in _task_fn_rx (context=0xfffff80002481000) at /usr/src/sys/net/iflib.c:3951
#26 0xffffffff808076ad in gtaskqueue_run_locked (queue=queue@entry=0xfffff80002424700) at /usr/src/sys/kern/subr_gtaskqueue.c:371
#27 0xffffffff8080734c in gtaskqueue_thread_loop (arg=<optimized out>, arg@entry=0xfffffe0008d54008) at /usr/src/sys/kern/subr_gtaskqueue.c:547
#28 0xffffffff8077990e in fork_exit (callout=0xffffffff808072a0 <gtaskqueue_thread_loop>, arg=0xfffffe0008d54008, frame=0xfffffe0007f4cc00) at /usr/src/sys/kern/kern_fork.c:1069
#29 <signal handler called>

Frames:

(kgdb) f 10
#10 0xffffffff809b3a24 in pf_pull_hdr (m=m@entry=0xfffff8005865ec00, off=off@entry=40, p=p@entry=0xfffffe0007f4c540,
    len=len@entry=2, actionp=actionp@entry=0x0, reasonp=reasonp@entry=0xfffffe0007f4c5b6, af=28 '\034')
    at /usr/src/sys/netpfil/pf/pf.c:5422
5422            m_copydata(m, off, len, p);
(kgdb) print m
$3 = (struct mbuf *) 0xfffff8005865ec00

(kgdb) f 9
#9  0xffffffff8084d0a0 in m_copydata (m=0x0, off=40, len=2, cp=cp@entry=0xfffffe0007f4c540 "")
    at /usr/src/sys/kern/uipc_mbuf.c:649
649                     if (off < m->m_len)
(kgdb) print m
$4 = (const struct mbuf *) 0x0

m in frame 10:

(kgdb) print *m
$1 = {{m_next = 0x0, m_slist = {sle_next = 0x0}, m_stailq = {stqe_next = 0x0}}, {m_nextpkt = 0x0, m_slistpkt = {
      sle_next = 0x0}, m_stailqpkt = {stqe_next = 0x0}}, m_data = 0xfffff8005865ec58 "\001", m_len = 0, m_type = 1,
  m_flags = 2, {{{m_pkthdr = {{snd_tag = 0x0, rcvif = 0x0}, tags = {slh_first = 0x0}, len = 1232, flowid = 0,
          csum_flags = 0, fibnum = 0, numa_domain = 255 '\377', rsstype = 0 '\000', {rcv_tstmp = 0, {
              l2hlen = 0 '\000', l3hlen = 0 '\000', l4hlen = 0 '\000', l5hlen = 0 '\000', inner_l2hlen = 0 '\000',
              inner_l3hlen = 0 '\000', inner_l4hlen = 0 '\000', inner_l5hlen = 0 '\000'}}, PH_per = {
            eight = "\000\000\000\000\000\000\000", sixteen = {0, 0, 0, 0}, thirtytwo = {0, 0}, sixtyfour = {0},
            unintptr = {0}, ptr = 0x0}, PH_loc = {eight = "\000\000\000\000\000\000\000", sixteen = {0, 0, 0, 0},
            thirtytwo = {0, 0}, sixtyfour = {0}, unintptr = {0}, ptr = 0x0}}, {m_epg_npgs = 0 '\000',
          m_epg_nrdy = 0 '\000', m_epg_hdrlen = 0 '\000', m_epg_trllen = 0 '\000', m_epg_1st_off = 0,
          m_epg_last_len = 0, m_epg_flags = 0 '\000', m_epg_record_type = 0 '\000', __spare = "\000",
          m_epg_enc_cnt = 0, m_epg_tls = 0x4d0, m_epg_so = 0xff000000000000, m_epg_seqno = 0, m_epg_stailq = {
            stqe_next = 0x0}}}, {m_ext = {{ext_count = 1, ext_cnt = 0xd00125500000001}, ext_size = 4096, ext_type = 3,
          ext_flags = 1, {{ext_buf = 0xfffff8012b419000 "", ext_arg2 = 0x0}, {extpg_pa = {18446735282637213696, 0,
                372221068050365953, 5427120254332600373, 13475210667545916651},
              extpg_trail = "\303y\262a\265\272\361\362Q\346P\020\000\246\a\325\000\000\060\060\061/default,2018,-1\000MM_CHARSET=UTF-8\000BLOCKSIZE", extpg_hdr = "=K\000SHLVL=1\000\000\000c\354\360\000\000\000\000\002\000"}},
          ext_free = 0x0, ext_arg1 = 0x0}, m_pktdat = 0xfffff8005865ec58 "\001"}}, m_dat = 0xfffff8005865ec20 ""}}
Comment 1 Kristof Provost freebsd_committer freebsd_triage 2021-02-17 20:07:05 UTC
I am very confused by this panic.

I initially thought that pf didn't account for mbufs with external storage, but the network stack should already have fixed that for us.
The mbuf has m_len == 0, which shouldn't be possible. The network stack would have done m_pullup(), and in any event has already done mtod(m, struct ip6_hdr) without issue.
Additionally, the pf_test6() function has also done this without issue.

It's also not clear how pf_pull_hdr() can get a valid m pointer and yet call m_copydata() with a NULL mbuf pointer.
Comment 2 Kristof Provost freebsd_committer freebsd_triage 2021-02-17 20:27:57 UTC
Best theory I have so far is that we're getting an invalid mbuf (chain) from the driver. That is, our initial mbuf contains valid header information, but has m_len set to 0. When we try to m_copydata() we try to find the first mbuf in the chain that contains byte 'off', which makes us run straight off the end of the mbuf chain and panic.

That'd likely make it a driver issue rather than a pf problem.
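
The theory above can be illustrated with a small userspace sketch. This is not the kernel's m_copydata(); struct toy_mbuf and toy_copydata() are hypothetical stand-ins that keep only the fields involved. The sketch returns an error where the kernel code would instead read m->m_len through a NULL pointer. Assuming the amd64 layout places m_len at offset 0x18, that read would match the reported fault address.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct mbuf (not the kernel type). */
struct toy_mbuf {
	struct toy_mbuf *m_next;	/* next buffer of the same packet */
	const char	*m_data;	/* start of this buffer's data */
	int		 m_len;		/* bytes of data in this buffer */
};

/*
 * Copy len bytes starting at byte offset off of the chain into dst.
 * Returns 0 on success, or -1 if the chain ends too early. The -1
 * return is an addition for this sketch; the trace in this report
 * shows the kernel faulting at that point instead.
 */
static int
toy_copydata(const struct toy_mbuf *m, int off, int len, char *dst)
{
	int count;

	assert(off >= 0 && len >= 0);

	/* Skip whole buffers until the one that contains byte 'off'. */
	while (off > 0) {
		if (m == NULL)
			return (-1);	/* walked off the end of the chain */
		if (off < m->m_len)
			break;
		off -= m->m_len;
		m = m->m_next;
	}
	/* Copy out, crossing buffer boundaries as needed. */
	while (len > 0) {
		if (m == NULL)
			return (-1);
		count = m->m_len - off;
		if (count > len)
			count = len;
		memcpy(dst, m->m_data + off, (size_t)count);
		len -= count;
		dst += count;
		off = 0;
		m = m->m_next;
	}
	return (0);
}
```

With a single buffer whose m_len is 0 and no m_next, a request for off=40, len=2 (the arguments seen in frame 9) exhausts the chain in the first loop, which is the situation described in the theory.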
Comment 3 Kamigishi Rei 2021-02-18 08:42:18 UTC
It does not seem like pf specifically is at fault here. Got two more faults over the past 12 hours and both were with mbufs being 0x0 in different code paths:

Important note: 

net.isr.maxthreads: -1
net.isr.bindthreads: 1

The CPU is a quad core AMD GX-412TC SoC.

I will now test with these set to defaults (1 and 0, correspondingly).

net.isr.dispatch is "direct" and was not touched.

#1:

(kgdb) bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xffffffff80c08e56 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:486
#3  0xffffffff80c092d0 in vpanic (fmt=<optimized out>, ap=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:919
#4  0xffffffff80c090d3 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:843
#5  0xffffffff810891a7 in trap_fatal (frame=0xfffffe0007f86710, eva=28) at /usr/src/sys/amd64/amd64/trap.c:915
#6  0xffffffff810891ff in trap_pfault (frame=frame@entry=0xfffffe0007f86710, usermode=false, signo=<optimized out>, signo@entry=0x0, ucode=<optimized out>, ucode@entry=0x0)
    at /usr/src/sys/amd64/amd64/trap.c:732
#7  0xffffffff8108885d in trap (frame=0xfffffe0007f86710) at /usr/src/sys/amd64/amd64/trap.c:398
#8  <signal handler called>
#9  0xffffffff80c9ac8a in m_dup (m=0x0, m@entry=0xfffff80119f1b800, how=<optimized out>, how@entry=1) at /usr/src/sys/kern/uipc_mbuf.c:686
#10 0xffffffff8297e8e8 in bridge_input (ifp=0xfffff800036d1800, m=0xfffff80119f1b800) at /usr/src/sys/net/if_bridge.c:2415
#11 0xffffffff80d23c78 in ether_input_internal (ifp=0xfffff800036d1800, m=0xfffff80104ebf100) at /usr/src/sys/net/if_ethersubr.c:673
#12 ether_nh_input (m=<optimized out>) at /usr/src/sys/net/if_ethersubr.c:739
#13 0xffffffff80d3e26a in netisr_dispatch_src (proto=proto@entry=5, source=<optimized out>, source@entry=0, m=0xfffff80104ebf100, m@entry=0xfffff80119f1b800)
    at /usr/src/sys/net/netisr.c:1143
#14 0xffffffff80d3e55f in netisr_dispatch (proto=83019968, proto@entry=5, m=0x1, m@entry=0xfffff80119f1b800) at /usr/src/sys/net/netisr.c:1234
#15 0xffffffff80d22e79 in ether_input (ifp=<optimized out>, m=0xfffff80119f1b800) at /usr/src/sys/net/if_ethersubr.c:830
#16 0xffffffff80d3a9a8 in iflib_rxeof (rxq=<optimized out>, rxq@entry=0xfffff800036d1000, budget=<optimized out>) at /usr/src/sys/net/iflib.c:3008
#17 0xffffffff80d34d02 in _task_fn_rx (context=0xfffff800036d1000) at /usr/src/sys/net/iflib.c:3951
#18 0xffffffff80c550fd in gtaskqueue_run_locked (queue=queue@entry=0xfffff8000342ea00) at /usr/src/sys/kern/subr_gtaskqueue.c:371
#19 0xffffffff80c54d9c in gtaskqueue_thread_loop (arg=<optimized out>, arg@entry=0xfffffe0008d4f038) at /usr/src/sys/kern/subr_gtaskqueue.c:547
#20 0xffffffff80bc735e in fork_exit (callout=0xffffffff80c54cf0 <gtaskqueue_thread_loop>, arg=0xfffffe0008d4f038, frame=0xfffffe0007f86c00) at /usr/src/sys/kern/kern_fork.c:1069
#21 <signal handler called>


#2:

Here m_nextpkt is 0x0 and len is 1307; m_nextpkt is assigned to next, which then gets dereferenced:

(kgdb) bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xffffffff80c08e56 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:486
#3  0xffffffff80c092d0 in vpanic (fmt=<optimized out>, ap=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:919
#4  0xffffffff80c090d3 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:843
#5  0xffffffff810891a7 in trap_fatal (frame=0xfffffe0062c6e700, eva=8) at /usr/src/sys/amd64/amd64/trap.c:915
#6  0xffffffff810891ff in trap_pfault (frame=frame@entry=0xfffffe0062c6e700, usermode=false, signo=<optimized out>, signo@entry=0x0, ucode=<optimized out>, ucode@entry=0x0)
    at /usr/src/sys/amd64/amd64/trap.c:732
#7  0xffffffff8108885d in trap (frame=0xfffffe0062c6e700) at /usr/src/sys/amd64/amd64/trap.c:398
#8  <signal handler called>
#9  sbcut_internal (sb=0xfffff800a75649c0, len=1307, len@entry=1475) at /usr/src/sys/kern/uipc_sockbuf.c:1491
#10 0xffffffff80ca4eca in sbcut_locked (sb=0xfffff800a75649c0, len=-1390745600, len@entry=1475) at /usr/src/sys/kern/uipc_sockbuf.c:1591
#11 0xffffffff80dbda2e in tcp_do_segment (m=0xfffff80042a5d800, th=<optimized out>, so=<optimized out>, tp=<optimized out>, drop_hdrlen=52, tlen=<optimized out>, iptos=0 '\000')
    at /usr/src/sys/netinet/tcp_input.c:2924
#12 0xffffffff80dbbb9e in tcp_input (mp=<optimized out>, offp=<optimized out>, proto=<optimized out>) at /usr/src/sys/netinet/tcp_input.c:1381
#13 0xffffffff80dae555 in ip_input (m=0x0) at /usr/src/sys/netinet/ip_input.c:833
#14 0xffffffff80d3ea0b in netisr_process_workstream_proto (nwsp=<optimized out>, proto=1) at /usr/src/sys/net/netisr.c:919
#15 swi_net (arg=<optimized out>) at /usr/src/sys/net/netisr.c:966
#16 0xffffffff80bca53d in intr_event_execute_handlers (p=<optimized out>, ie=0xfffff80003418d00) at /usr/src/sys/kern/kern_intr.c:1168
#17 ithread_execute_handlers (p=<optimized out>, ie=0xfffff80003418d00) at /usr/src/sys/kern/kern_intr.c:1181
#18 ithread_loop (arg=arg@entry=0xfffff8000341ed60) at /usr/src/sys/kern/kern_intr.c:1269
#19 0xffffffff80bc735e in fork_exit (callout=0xffffffff80bca2f0 <ithread_loop>, arg=0xfffff8000341ed60, frame=0xfffffe0062c6ec00) at /usr/src/sys/kern/kern_fork.c:1069
#20 <signal handler called>
Comment 4 Kamigishi Rei 2021-02-21 18:03:11 UTC
Update: this happens with maxthreads=1 as well. Does not happen inside a VM. With an INVARIANTS kernel I can reproduce this reliably by initiating a zfs send over SSH through this host acting as a router.
Comment 5 Kamigishi Rei 2021-02-21 18:06:27 UTC
Update: this happens with maxthreads=1 as well. Does not happen inside a VM.

With an INVARIANTS kernel I can reproduce this reliably by initiating a zfs send over SSH through this host acting as a router (4 crashes out of 4 send attempts). Out of these 4 crashes, three were the same KASSERT:

panic: Assertion m->m_nextpkt == NULL failed at /usr/src/sys/net/iflib.c:3638
cpuid = 2
time = 1613930234
KDB: stack backtrace:
#0 0xffffffff807fcfe5 at kdb_backtrace+0x65
#1 0xffffffff807b2cd1 at vpanic+0x181
#2 0xffffffff807b2aa3 at panic+0x43
#3 0xffffffff808ec3a1 at iflib_completed_tx_reclaim+0x2d1
#4 0xffffffff808eb780 at iflib_txq_drain+0x60
#5 0xffffffff808f2dfe at drain_ring_lockless+0x9e
#6 0xffffffff808f2b93 at ifmp_ring_enqueue+0x313
#7 0xffffffff808f1520 at iflib_if_transmit+0xa0
#8 0xffffffff808d0418 at bridge_enqueue+0xc8
#9 0xffffffff808d26c4 at bridge_output+0x134
#10 0xffffffff808d73af at ether_output+0x63f
#11 0xffffffff8097480b at ip6_forward+0x95b
#12 0xffffffff80976084 at ip6_input+0xf04
#13 0xffffffff808f4491 at netisr_dispatch_src+0xb1
#14 0xffffffff808d76be at ether_demux+0x17e
#15 0xffffffff808d8d4c at ether_nh_input+0x40c
#16 0xffffffff808f4491 at netisr_dispatch_src+0xb1
#17 0xffffffff808d7bb1 at ether_input+0xa1
Uptime: 1m36s
Dumping 402 out of 4051 MB:..4%..12%..24%..32%..44%..52%..64%..72%..84%..92%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xffffffff807b28fb in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:486
#3  0xffffffff807b2d40 in vpanic (fmt=<optimized out>, ap=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:919
#4  0xffffffff807b2aa3 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:843
#5  0xffffffff808ec3a1 in iflib_tx_desc_free (txq=<optimized out>, n=<optimized out>) at /usr/src/sys/net/iflib.c:3638
#6  iflib_completed_tx_reclaim (txq=<optimized out>, txq@entry=0xfffffe0063088000, thresh=<optimized out>) at /usr/src/sys/net/iflib.c:3680
#7  0xffffffff808eb780 in iflib_txq_drain (r=0xfffffe0063094000, r@entry=<error reading variable: value is not available>, cidx=718, cidx@entry=<error reading variable: value is not available>, pidx=719,
    pidx@entry=<error reading variable: value is not available>) at /usr/src/sys/net/iflib.c:3744
#8  0xffffffff808f2dfe in drain_ring_lockless (r=<optimized out>, os=..., prev=0, budget=<optimized out>) at /usr/src/sys/net/mp_ring.c:187
#9  0xffffffff808f2b93 in ifmp_ring_enqueue (r=0xfffffe0063094000, items=<optimized out>, items@entry=0xfffffe0007f924e8, n=<optimized out>, n@entry=1, budget=<optimized out>, budget@entry=32, abdicate=<optimized out>,
    abdicate@entry=0) at /usr/src/sys/net/mp_ring.c:470
#10 0xffffffff808f1520 in iflib_if_transmit (ifp=<optimized out>, m=0xfffff80015f48000) at /usr/src/sys/net/iflib.c:4135
#11 0xffffffff808d0418 in bridge_enqueue (sc=sc@entry=0xfffff80015aa0c00, dst_ifp=dst_ifp@entry=0xfffff80002647800, m=<unavailable>, m@entry=0xfffff80015f48000) at /usr/src/sys/net/if_bridge.c:1983
#12 0xffffffff808d26c4 in bridge_output (ifp=<optimized out>, ifp@entry=<error reading variable: value is not available>, m=0xfffff80015f48000, m@entry=<error reading variable: value is not available>, sa=<unavailable>,
    sa@entry=<error reading variable: value is not available>, rt=<unavailable>, rt@entry=<error reading variable: value is not available>) at /usr/src/sys/net/if_bridge.c:2145
#13 0xffffffff808d73af in ether_output (ifp=0xfffff80002647800, m=<unavailable>, dst=0xfffffe0007f92670, ro=<optimized out>) at /usr/src/sys/net/if_ethersubr.c:414
#14 0xffffffff8097480b in ip6_forward (m=<unavailable>, srcrt=srcrt@entry=0) at /usr/src/sys/netinet6/ip6_forward.c:387
#15 0xffffffff80976084 in ip6_input (m=<unavailable>, m@entry=<error reading variable: value is not available>) at /usr/src/sys/netinet6/ip6_input.c:896
#16 0xffffffff808f4491 in netisr_dispatch_src (proto=6, source=source@entry=0, m=0xfffff80023e49900) at /usr/src/sys/net/netisr.c:1143
#17 0xffffffff808f47df in netisr_dispatch (proto=<unavailable>, m=<unavailable>) at /usr/src/sys/net/netisr.c:1234
#18 0xffffffff808d76be in ether_demux (ifp=ifp@entry=0xfffff800026cb800, m=<unavailable>) at /usr/src/sys/net/if_ethersubr.c:923
#19 0xffffffff808d8d4c in ether_input_internal (ifp=0xfffff800026cb800, m=<unavailable>) at /usr/src/sys/net/if_ethersubr.c:709
#20 ether_nh_input (m=<optimized out>, m@entry=<error reading variable: value is not available>) at /usr/src/sys/net/if_ethersubr.c:739
#21 0xffffffff808f4491 in netisr_dispatch_src (proto=proto@entry=5, source=source@entry=0, m=m@entry=0xfffff80023e49900) at /usr/src/sys/net/netisr.c:1143
#22 0xffffffff808f47df in netisr_dispatch (proto=<unavailable>, proto@entry=5, m=<unavailable>, m@entry=0xfffff80023e49900) at /usr/src/sys/net/netisr.c:1234
#23 0xffffffff808d7bb1 in ether_input (ifp=0xfffff800026cb800, m=0xfffff80023e49900) at /usr/src/sys/net/if_ethersubr.c:830
#24 0xffffffff808f0556 in iflib_rxeof (rxq=<optimized out>, rxq@entry=0xfffff800026cb000, budget=<optimized out>) at /usr/src/sys/net/iflib.c:3008
#25 0xffffffff808ea0ca in _task_fn_rx (context=0xfffff800026cb000) at /usr/src/sys/net/iflib.c:3951
#26 0xffffffff807fb977 in gtaskqueue_run_locked (queue=queue@entry=0xfffff80002423300) at /usr/src/sys/kern/subr_gtaskqueue.c:371
#27 0xffffffff807fb774 in gtaskqueue_thread_loop (arg=arg@entry=0xfffffe0008d54038) at /usr/src/sys/kern/subr_gtaskqueue.c:547
#28 0xffffffff8076efb0 in fork_exit (callout=0xffffffff807fb6e0 <gtaskqueue_thread_loop>, arg=0xfffffe0008d54038, frame=0xfffffe0007f92c00) at /usr/src/sys/kern/kern_fork.c:1069
#29 <signal handler called>


4th crash:

panic: m_dup: no mbuf packet header!
cpuid = 1
time = 1613919472
KDB: stack backtrace:
#0 0xffffffff807fcfe5 at kdb_backtrace+0x65
#1 0xffffffff807b2cd1 at vpanic+0x181
#2 0xffffffff807b2aa3 at panic+0x43
#3 0xffffffff80842981 at m_dup+0x351
#4 0xffffffff808ec610 at iflib_encap+0x210
#5 0xffffffff808ebb39 at iflib_txq_drain+0x419
#6 0xffffffff808f2dfe at drain_ring_lockless+0x9e
#7 0xffffffff808f2b93 at ifmp_ring_enqueue+0x313
#8 0xffffffff808f1520 at iflib_if_transmit+0xa0
#9 0xffffffff808d0418 at bridge_enqueue+0xc8
#10 0xffffffff808d26c4 at bridge_output+0x134
#11 0xffffffff808d73af at ether_output+0x63f
#12 0xffffffff8097480b at ip6_forward+0x95b
#13 0xffffffff80976084 at ip6_input+0xf04
#14 0xffffffff808f4491 at netisr_dispatch_src+0xb1
#15 0xffffffff808d76be at ether_demux+0x17e
#16 0xffffffff808d8d4c at ether_nh_input+0x40c
#17 0xffffffff808f4491 at netisr_dispatch_src+0xb1
Uptime: 3m59s
Dumping 409 out of 4051 MB:..4%..12%..24%..32%..43%..51%..63%..71%..83%..94%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xffffffff807b28fb in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:486
#3  0xffffffff807b2d40 in vpanic (fmt=<optimized out>, ap=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:919
#4  0xffffffff807b2aa3 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:843
#5  0xffffffff80842981 in m_dup (m=<optimized out>, how=1) at /usr/src/sys/kern/uipc_mbuf.c:733
#6  0xffffffff808ec610 in iflib_parse_header (txq=0xfffffe006302ea40, pi=0xfffffe0007f47338, mp=0xfffffe006304f7f8) at /usr/src/sys/net/iflib.c:3138
#7  iflib_encap (txq=txq@entry=0xfffffe006302ea40, m_headp=m_headp@entry=0xfffffe006304f7f8) at /usr/src/sys/net/iflib.c:3464
#8  0xffffffff808ebb39 in iflib_txq_drain (r=<optimized out>, r@entry=<error reading variable: value is not available>, cidx=<optimized out>, cidx@entry=<error reading variable: value is not available>, pidx=0,
    pidx@entry=<error reading variable: value is not available>) at /usr/src/sys/net/iflib.c:3801
#9  0xffffffff808f2dfe in drain_ring_lockless (r=<optimized out>, os=..., prev=0, budget=<optimized out>) at /usr/src/sys/net/mp_ring.c:187
#10 0xffffffff808f2b93 in ifmp_ring_enqueue (r=0xfffffe006304c000, items=<optimized out>, items@entry=0xfffffe0007f474e8, n=<optimized out>, n@entry=1, budget=<optimized out>, budget@entry=32, abdicate=<optimized out>,
    abdicate@entry=0) at /usr/src/sys/net/mp_ring.c:470
#11 0xffffffff808f1520 in iflib_if_transmit (ifp=<optimized out>, m=0xfffff800586f9000) at /usr/src/sys/net/iflib.c:4135
#12 0xffffffff808d0418 in bridge_enqueue (sc=sc@entry=0xfffff80016b54c00, dst_ifp=dst_ifp@entry=0xfffff80002456800, m=<unavailable>, m@entry=0xfffff800586f9000) at /usr/src/sys/net/if_bridge.c:1983
#13 0xffffffff808d26c4 in bridge_output (ifp=<optimized out>, ifp@entry=<error reading variable: value is not available>, m=0xfffff800586f9000, m@entry=<error reading variable: value is not available>, sa=<unavailable>,
    sa@entry=<error reading variable: value is not available>, rt=<unavailable>, rt@entry=<error reading variable: value is not available>) at /usr/src/sys/net/if_bridge.c:2145
#14 0xffffffff808d73af in ether_output (ifp=0xfffff80002456800, m=<unavailable>, dst=0xfffffe0007f47670, ro=<optimized out>) at /usr/src/sys/net/if_ethersubr.c:414
#15 0xffffffff8097480b in ip6_forward (m=<unavailable>, srcrt=srcrt@entry=0) at /usr/src/sys/netinet6/ip6_forward.c:387
#16 0xffffffff80976084 in ip6_input (m=<unavailable>, m@entry=<error reading variable: value is not available>) at /usr/src/sys/netinet6/ip6_input.c:896
#17 0xffffffff808f4491 in netisr_dispatch_src (proto=6, source=source@entry=0, m=0xfffff80016ed7600) at /usr/src/sys/net/netisr.c:1143
#18 0xffffffff808f47df in netisr_dispatch (proto=<unavailable>, m=<unavailable>) at /usr/src/sys/net/netisr.c:1234
#19 0xffffffff808d76be in ether_demux (ifp=ifp@entry=0xfffff80002480800, m=<unavailable>) at /usr/src/sys/net/if_ethersubr.c:923
#20 0xffffffff808d8d4c in ether_input_internal (ifp=0xfffff80002480800, m=<unavailable>) at /usr/src/sys/net/if_ethersubr.c:709
#21 ether_nh_input (m=<optimized out>, m@entry=<error reading variable: value is not available>) at /usr/src/sys/net/if_ethersubr.c:739
#22 0xffffffff808f4491 in netisr_dispatch_src (proto=proto@entry=5, source=source@entry=0, m=m@entry=0xfffff80016ed7600) at /usr/src/sys/net/netisr.c:1143
#23 0xffffffff808f47df in netisr_dispatch (proto=<unavailable>, proto@entry=5, m=<unavailable>, m@entry=0xfffff80016ed7600) at /usr/src/sys/net/netisr.c:1234
#24 0xffffffff808d7bb1 in ether_input (ifp=0xfffff80002480800, m=0xfffff80016ed7600) at /usr/src/sys/net/if_ethersubr.c:830
#25 0xffffffff808f0556 in iflib_rxeof (rxq=<optimized out>, rxq@entry=0xfffff80002480300, budget=<optimized out>) at /usr/src/sys/net/iflib.c:3008
#26 0xffffffff808ea0ca in _task_fn_rx (context=0xfffff80002480300) at /usr/src/sys/net/iflib.c:3951
#27 0xffffffff807fb977 in gtaskqueue_run_locked (queue=queue@entry=0xfffff80002422500) at /usr/src/sys/kern/subr_gtaskqueue.c:371
#28 0xffffffff807fb774 in gtaskqueue_thread_loop (arg=arg@entry=0xfffffe0008d54020) at /usr/src/sys/kern/subr_gtaskqueue.c:547
#29 0xffffffff8076efb0 in fork_exit (callout=0xffffffff807fb6e0 <gtaskqueue_thread_loop>, arg=0xfffffe0008d54020, frame=0xfffffe0007f47c00) at /usr/src/sys/kern/kern_fork.c:1069
#30 <signal handler called>
Comment 6 Kamigishi Rei 2021-02-21 19:46:53 UTC
With pf and ipfw inactive and without a bridge present (pf and if_bridge are compiled into the kernel), IPv6 traffic from outside via igb0 to a LAN host via igb1:

Unread portion of the kernel message buffer:
panic: Assertion m->m_nextpkt == NULL failed at /usr/src/sys/net/iflib.c:4089
cpuid = 3
time = 1613936205
KDB: stack backtrace:
#0 0xffffffff807fcfe5 at kdb_backtrace+0x65
#1 0xffffffff807b2cd1 at vpanic+0x181
#2 0xffffffff807b2aa3 at panic+0x43
#3 0xffffffff808f15db at iflib_if_transmit+0x15b
#4 0xffffffff808d751b at ether_output_frame+0xab
#5 0xffffffff808d7421 at ether_output+0x6b1
#6 0xffffffff80984025 at nd6_flush_holdchain+0x35
#7 0xffffffff80987950 at nd6_na_input+0x5a0
#8 0xffffffff8095cc0e at icmp6_input+0xb3e
#9 0xffffffff80976009 at ip6_input+0xe89
#10 0xffffffff808f4491 at netisr_dispatch_src+0xb1
#11 0xffffffff808d76be at ether_demux+0x17e
#12 0xffffffff808d8d4c at ether_nh_input+0x40c
#13 0xffffffff808f4491 at netisr_dispatch_src+0xb1
#14 0xffffffff808d7bb1 at ether_input+0xa1
#15 0xffffffff808f0556 at iflib_rxeof+0xe06
#16 0xffffffff808ea0ca at _task_fn_rx+0x7a
#17 0xffffffff807fb977 at gtaskqueue_run_locked+0xa7
Uptime: 1m8s
Dumping 357 out of 4051 MB:..5%..14%..23%..32%..41%..54%..63%..72%..81%..95%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xffffffff807b28fb in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:486
#3  0xffffffff807b2d40 in vpanic (fmt=<optimized out>, ap=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:919
#4  0xffffffff807b2aa3 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:843
#5  0xffffffff808f15db in iflib_if_transmit (ifp=0xfffff800026ac800, m=0xfffff800237adc00) at /usr/src/sys/net/iflib.c:4089
#6  0xffffffff808d751b in ether_output_frame (ifp=ifp@entry=0xfffff800026ac800, m=<unavailable>) at /usr/src/sys/net/if_ethersubr.c:511
#7  0xffffffff808d7421 in ether_output (ifp=<optimized out>, m=<unavailable>, dst=0xfffffe0007f8d408, ro=<optimized out>) at /usr/src/sys/net/if_ethersubr.c:438
#8  0xffffffff80984025 in nd6_flush_holdchain (ifp=ifp@entry=0xfffff800026ac800, chain=<optimized out>, dst=dst@entry=0xfffffe0007f8d408) at /usr/src/sys/netinet6/nd6.c:2463
#9  0xffffffff80987950 in nd6_na_input (m=m@entry=0xfffff800232ee800, off=<optimized out>, off@entry=40, icmp6len=<optimized out>, icmp6len@entry=32) at /usr/src/sys/netinet6/nd6_nbr.c:909
#10 0xffffffff8095cc0e in icmp6_input (mp=0xfffffe0007f8d778, mp@entry=<error reading variable: value is not available>, offp=0xfffffe0007f8d770,
    offp@entry=<error reading variable: value is not available>, proto=<unavailable>, proto@entry=<error reading variable: value is not available>) at /usr/src/sys/netinet6/icmp6.c:817
#11 0xffffffff80976009 in ip6_input (m=0xfffff800232ee800, m@entry=<error reading variable: value is not available>) at /usr/src/sys/netinet6/ip6_input.c:930
#12 0xffffffff808f4491 in netisr_dispatch_src (proto=6, source=source@entry=0, m=0xfffff800232ee800) at /usr/src/sys/net/netisr.c:1143
#13 0xffffffff808f47df in netisr_dispatch (proto=<unavailable>, m=<unavailable>) at /usr/src/sys/net/netisr.c:1234
#14 0xffffffff808d76be in ether_demux (ifp=ifp@entry=0xfffff800026ac800, m=<unavailable>) at /usr/src/sys/net/if_ethersubr.c:923
#15 0xffffffff808d8d4c in ether_input_internal (ifp=0xfffff800026ac800, m=<unavailable>) at /usr/src/sys/net/if_ethersubr.c:709
#16 ether_nh_input (m=<optimized out>, m@entry=<error reading variable: value is not available>) at /usr/src/sys/net/if_ethersubr.c:739
#17 0xffffffff808f4491 in netisr_dispatch_src (proto=proto@entry=5, source=source@entry=0, m=m@entry=0xfffff800232ee800) at /usr/src/sys/net/netisr.c:1143
#18 0xffffffff808f47df in netisr_dispatch (proto=<unavailable>, proto@entry=5, m=<unavailable>, m@entry=0xfffff800232ee800) at /usr/src/sys/net/netisr.c:1234
#19 0xffffffff808d7bb1 in ether_input (ifp=0xfffff800026ac800, m=0xfffff800232ee800) at /usr/src/sys/net/if_ethersubr.c:830
#20 0xffffffff808f0556 in iflib_rxeof (rxq=<optimized out>, rxq@entry=0xfffff800026ac300, budget=<optimized out>) at /usr/src/sys/net/iflib.c:3008
#21 0xffffffff808ea0ca in _task_fn_rx (context=0xfffff800026ac300) at /usr/src/sys/net/iflib.c:3951
#22 0xffffffff807fb977 in gtaskqueue_run_locked (queue=queue@entry=0xfffff80002423100) at /usr/src/sys/kern/subr_gtaskqueue.c:371
#23 0xffffffff807fb774 in gtaskqueue_thread_loop (arg=arg@entry=0xfffffe0008d54050) at /usr/src/sys/kern/subr_gtaskqueue.c:547
#24 0xffffffff8076efb0 in fork_exit (callout=0xffffffff807fb6e0 <gtaskqueue_thread_loop>, arg=0xfffffe0008d54050, frame=0xfffffe0007f8dc00) at /usr/src/sys/kern/kern_fork.c:1069
#25 <signal handler called>
Comment 7 Alexander V. Chernikov freebsd_committer freebsd_triage 2021-02-21 20:29:06 UTC
(In reply to Kamigishi Rei from comment #6)
The last one may actually have a different cause: https://github.com/freebsd/freebsd-src/commit/b3cfe07d74a9ee4b726e2333ff327d154181572d was committed slightly after -BETA2.

Would it be possible for you to update to BETA3 and see whether the situation improves?
Comment 8 Kamigishi Rei 2021-02-21 22:06:43 UTC
(In reply to Alexander V. Chernikov from comment #7)

Thank you for the suggestion; it does not seem to crash on my test any more. Will see how it goes for a week and close this if everything stays fine. Almost all of the crashes I witnessed were due to mbufs.
Comment 9 Kevin Bowling freebsd_committer freebsd_triage 2021-04-16 20:17:52 UTC
(In reply to Kamigishi Rei from comment #8)
Thank you for the report and confirmation of the fix.  Have the later 13.0 distributions remained stable?
Comment 10 Kevin Bowling freebsd_committer freebsd_triage 2021-04-16 20:18:21 UTC
Did not mean to close this yet, reopening for confirmation from reporter.
Comment 11 Kamigishi Rei 2021-04-16 20:19:09 UTC
(In reply to Kevin Bowling from comment #9)
Have not had a single panic since, thank you.
Comment 12 Kevin Bowling freebsd_committer freebsd_triage 2021-04-16 20:41:44 UTC
(In reply to Kamigishi Rei from comment #11)
Comment 13 Kubilay Kocak freebsd_committer freebsd_triage 2021-04-18 01:18:12 UTC
^Triage:

Do we have specific commit(s) [1] that resolved this issue? If not, please change the resolution to OBE (FIXED is for "resolved by way of a change", usually a commit). If so, please assign to the committer who resolved it and set the mfc-* flags for the branches it was merged to. Thanks!

[1] Potentially src b3cfe07d
Comment 14 commit-hook freebsd_committer freebsd_triage 2021-04-18 14:56:01 UTC
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=be1fb99a727b784781f7d8d95bb6f14707e2d886

commit be1fb99a727b784781f7d8d95bb6f14707e2d886
Author:     Randall Stewart <rrs@FreeBSD.org>
AuthorDate: 2021-01-27 18:32:52 +0000
Commit:     Alexander V. Chernikov <melifaro@FreeBSD.org>
CommitDate: 2021-04-18 14:54:14 +0000

    When we are about to send down to the driver layer
    we need to make sure that the m_nextpkt field is NULL
    else the lower layers may do unwanted things.

    Reviewed By:  gallatin, melifaro
    Differential Revision: https://reviews.freebsd.org/D28377
    PR:     253587

    (cherry picked from commit 24a8f6d369962f189ad808f538029179b1e7dc2f)

 sys/netinet6/nd6.c | 1 +
 1 file changed, 1 insertion(+)
Comment 15 commit-hook freebsd_committer freebsd_triage 2021-04-18 15:00:04 UTC
A commit in branch stable/11 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=f974af4e59ee533906437b22a5b8b41219bde005

commit f974af4e59ee533906437b22a5b8b41219bde005
Author:     Randall Stewart <rrs@FreeBSD.org>
AuthorDate: 2021-01-27 18:32:52 +0000
Commit:     Alexander V. Chernikov <melifaro@FreeBSD.org>
CommitDate: 2021-04-18 14:57:22 +0000

    When we are about to send down to the driver layer
    we need to make sure that the m_nextpkt field is NULL
    else the lower layers may do unwanted things.

    Reviewed By:  gallatin, melifaro
    Differential Revision: https://reviews.freebsd.org/D28377
    PR:     253587

    (cherry picked from commit 24a8f6d369962f189ad808f538029179b1e7dc2f)

 sys/netinet6/nd6.c | 1 +
 1 file changed, 1 insertion(+)
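
The commit message above describes clearing m_nextpkt before a packet goes down to the driver. The thread does not show the actual one-line change to nd6.c, so the following is only a userspace sketch of the idea, with hypothetical names throughout: struct toy_pkt stands in for an mbuf packet header, toy_transmit() for a driver entry point that rejects still-linked packets (much like the KASSERT seen in the iflib traces), and toy_flush_holdchain() for a routine that drains a queue of held packets.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a packet header mbuf (not the kernel type). */
struct toy_pkt {
	struct toy_pkt	*m_nextpkt;	/* next packet in a queue */
	int		 id;		/* illustrative payload */
};

/* Number of packets accepted by the toy driver. */
static int toy_sent;

/*
 * Hypothetical driver entry point. It expects exactly one packet per
 * call and refuses one that is still linked to a queue.
 */
static int
toy_transmit(struct toy_pkt *m)
{
	assert(m != NULL);
	if (m->m_nextpkt != NULL)
		return (-1);
	toy_sent++;
	return (0);
}

/*
 * Drain a chain of held packets. Each packet is unlinked from the
 * chain (m_nextpkt set to NULL) before it is handed to the driver.
 * Returns the number of packets the driver rejected.
 */
static int
toy_flush_holdchain(struct toy_pkt *chain)
{
	struct toy_pkt *m;
	int errors = 0;

	while (chain != NULL) {
		m = chain;
		chain = chain->m_nextpkt;	/* remember the rest first */
		m->m_nextpkt = NULL;		/* then detach this packet */
		if (toy_transmit(m) != 0)
			errors++;
	}
	return (errors);
}
```

The order matters in this sketch: the successor is saved before the link is cleared, so the walk continues while the lower layer only ever sees a standalone packet. Omitting the m->m_nextpkt = NULL line would make toy_transmit() reject every packet except the last one in the chain.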