Bug 255104 - FreeBSD 13.0-RELEASE panic/crash with ipfw/dummynet/divert & wlan
Status: Closed DUPLICATE of bug 254015
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.0-STABLE
Hardware: Any Any
Importance: --- Affects Some People
Assignee: Mark Johnston
URL:
Keywords: crash, regression
Duplicates: 255295
Depends on:
Blocks:
 
Reported: 2021-04-16 00:43 UTC by Joshua Kinard
Modified: 2021-06-29 20:24 UTC
CC List: 5 users

See Also:


Attachments
Config for my CUSTOM-13_0 kernel (autogenerated dump from a crash) (2.63 KB, text/plain)
2021-04-16 00:45 UTC, Joshua Kinard
compressed crashlog (29.04 KB, application/vnd.rar)
2021-04-25 05:13 UTC, Michael Meiszl

Description Joshua Kinard 2021-04-16 00:43:02 UTC
I have upgraded my router appliance to FreeBSD 13.0-RELEASE, and when using IPFW + dummynet(4) + divert(4), I can trigger kernel panics at seemingly random times.

Background on my setup:

  - Hardware is a Protectli FW6C (https://protectli.com/product/fw6c/)
    * 16GB RAM
    * KINGSTON SUV500MS120G on /dev/ada0
    * 6x Intel 82583V GbE network ports supported by em(4) [em0 to em5]
    * Custom-added Qualcomm AR9462 on ath0/wlan0

  - Custom kernel config installed in /boot/kernel.custom
    * Also a /boot/CUSTOM symlink pointing to /boot/kernel.custom
  - em0 is WAN, DHCP via dhclient(8) to my cable modem
  - em1 is LAN, connected to a Netgear switch
  - wlan0 is wireless LAN on a separate RFC1918 subnet from em1
  - Firewall setup is IPFW-based
    * Uses in-kernel NAT for em1 and wlan0 subnets
    * Uses dummynet(4) for fq_codel shaping
    * Uses divert(4) socket to route packets to Snort for inline inspection

Synopsis of what causes the crash:

  - Having Snort up and running in a tmux session
  - wlan0 is active and has a client station connected
  - ipfw divert(4) socket is active, feeding packets to Snort
  - Sending/receiving WLAN traffic will eventually cause a random panic/reboot
  - Traffic on the LAN on em1 does NOT appear to trigger a crash (note, see crash #4)

Here are samples of the crashes.  I do not have the original kernel for some of these, so I cannot generate full backtraces, but I do have several of the core dumps under /var/crash.  Let me know what is needed to help debug this.  Note, I feel that the issue highlighted in PR#255069 may be related somehow.  I also tried patch D29772 posted in PR#255041, and that had no effect.  Crash #6 is using this patched kernel, so I can run kgdb against it if needed.

Crash #1 (Only kgdb backtrace is available):
    #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
    #1  doadump (textdump=<optimized out>) at ../../../kern/kern_shutdown.c:399
    #2  0xffffffff8074e645 in kern_reboot (howto=260) at ../../../kern/kern_shutdown.c:486
    #3  0xffffffff8074eac0 in vpanic (fmt=<optimized out>, ap=<optimized out>) at ../../../kern/kern_shutdown.c:919
    #4  0xffffffff8074e8c3 in panic (fmt=<unavailable>) at ../../../kern/kern_shutdown.c:843
    #5  0xffffffff80ad2037 in trap_fatal (frame=0xfffffe00dc46d8e0, eva=8) at ../../../amd64/amd64/trap.c:915
    #6  0xffffffff80ad2089 in trap_pfault (frame=frame@entry=0xfffffe00dc46d8e0, usermode=false, signo=<optimized out>, signo@entry=0x0, ucode=<optimized out>, ucode@entry=0x0) at ../../../amd64/amd64/trap.c:732
    #7  0xffffffff80ad1709 in trap (frame=0xfffffe00dc46d8e0) at ../../../amd64/amd64/trap.c:398
    #8  <signal handler called>
    #9  0xffffffff814f00a5 in dummynet_task () from /boot/CUSTOM/dummynet.ko
    #10 0xffffffff807aeda1 in taskqueue_run_locked (queue=0x8962c, queue@entry=0xfffff8000b02d300) at ../../../kern/subr_taskqueue.c:476
    #11 0xffffffff807b00bc in taskqueue_thread_loop (arg=<optimized out>, arg@entry=0xffffffff814fa048 <dn_tq>) at ../../../kern/subr_taskqueue.c:793
    #12 0xffffffff8070e05d in fork_exit (callout=0xffffffff807b0010 <taskqueue_thread_loop>, arg=0xffffffff814fa048 <dn_tq>, frame=0xfffffe00dc46db00) at ../../../kern/kern_fork.c:1069
    #13 <signal handler called>


Crash #2 (kgdb backtrace data unavailable):
    Fatal trap 12: page fault while in kernel mode
    cpuid = 0; apic id = 00
    fault virtual address   = 0x8
    fault code              = supervisor read data, page not present
    instruction pointer     = 0x20:0xffffffff814f00a5
    stack pointer           = 0x28:0xfffffe00dc46d9a0
    frame pointer           = 0x28:0xfffffe00dc46da00
    code segment            = base rx0, limit 0xfffff, type 0x1b
                            = DPL 0, pres 1, long 1, def32 0, gran 1
    processor eflags        = interrupt enabled, resume, IOPL = 0
    current process         = 0 (dummynet)
    trap number             = 12
    panic: page fault
    cpuid = 0
    time = 1618402444
    KDB: stack backtrace:
    #0 0xffffffff8079b0b5 at kdb_backtrace+0x65
    #1 0xffffffff8074ea51 at vpanic+0x181
    #2 0xffffffff8074e8c3 at panic+0x43
    #3 0xffffffff80ad2037 at trap_fatal+0x387
    #4 0xffffffff80ad2089 at trap_pfault+0x49
    #5 0xffffffff80ad1709 at trap+0x259
    #6 0xffffffff80aaa4e8 at calltrap+0x8
    #7 0xffffffff807aeda1 at taskqueue_run_locked+0x181
    #8 0xffffffff807b00bc at taskqueue_thread_loop+0xac
    #9 0xffffffff8070e05d at fork_exit+0x7d
    #10 0xffffffff80aab4ee at fork_trampoline+0xe
    Uptime: 9m23s
    Dumping 787 out of 16144 MB: (CTRL-C to abort) ..3%..11%..21%..31%..41%..51%..61%..72%..82%..92%


Crash #3 (this happened when sending Ctrl+C to the Snort process):
    Fatal trap 12: page fault while in kernel mode
    cpuid = 0; apic id = 00
    fault virtual address   = 0x8
    fault code              = supervisor read data, page not present
    instruction pointer     = 0x20:0xffffffff807ec20c
    stack pointer           = 0x28:0xfffffe011d7d07d0
    frame pointer           = 0x28:0xfffffe011d7d0810
    code segment            = base rx0, limit 0xfffff, type 0x1b
                            = DPL 0, pres 1, long 1, def32 0, gran 1
    processor eflags        = interrupt enabled, resume, IOPL = 0
    current process         = 86334 (snort)
    trap number             = 12
    panic: page fault
    cpuid = 0
    time = 1618439898
    KDB: stack backtrace:
    #0 0xffffffff8079e8f5 at kdb_backtrace+0x65
    #1 0xffffffff80752291 at vpanic+0x181
    #2 0xffffffff80752103 at panic+0x43
    #3 0xffffffff80b05a37 at trap_fatal+0x387
    #4 0xffffffff80b05a89 at trap_pfault+0x49
    #5 0xffffffff80b05109 at trap+0x259
    #6 0xffffffff80addee8 at calltrap+0x8
    #7 0xffffffff807eaf68 at sbdestroy+0x18
    #8 0xffffffff807edd39 at sofree+0x309
    #9 0xffffffff807ee824 at soclose+0x2e4
    #10 0xffffffff806f8a91 at _fdrop+0x11
    #11 0xffffffff806fbdcb at closef+0x24b
    #12 0xffffffff806f8d92 at closefp+0x82
    #13 0xffffffff80b0621c at amd64_syscall+0x10c
    #14 0xffffffff80ade80e at fast_syscall_common+0xf8
    Uptime: 21m57s
    Dumping 786 out of 16146 MB:..3%..11%..21%..31%..41%..51%..62%..72%..82%..92%
    
    __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
    55	/usr/src/sys/amd64/include/pcpu_aux.h: No such file or directory.
    (kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
    #1  doadump (textdump=<optimized out>) at ../../../kern/kern_shutdown.c:399
    #2  0xffffffff80751e85 in kern_reboot (howto=260)
        at ../../../kern/kern_shutdown.c:486
    #3  0xffffffff80752300 in vpanic (fmt=<optimized out>, ap=<optimized out>)
        at ../../../kern/kern_shutdown.c:919
    #4  0xffffffff80752103 in panic (fmt=<unavailable>)
        at ../../../kern/kern_shutdown.c:843
    #5  0xffffffff80b05a37 in trap_fatal (frame=0xfffffe011d7d0710, eva=8)
        at ../../../amd64/amd64/trap.c:915
    #6  0xffffffff80b05a89 in trap_pfault (frame=frame@entry=0xfffffe011d7d0710, 
        usermode=false, signo=<optimized out>, signo@entry=0x0, 
        ucode=<optimized out>, ucode@entry=0x0) at ../../../amd64/amd64/trap.c:732
    #7  0xffffffff80b05109 in trap (frame=0xfffffe011d7d0710)
        at ../../../amd64/amd64/trap.c:398
    #8  <signal handler called>
    #9  sbcut_internal (sb=sb@entry=0xfffff802fa2d68a8, len=3404)
        at ../../../kern/uipc_sockbuf.c:1491
    #10 0xffffffff807eaf68 in sbflush_internal (sb=0xfffff802fa2d68a8, 
        sb@entry=0xfffff802fa2d6760) at ../../../kern/uipc_sockbuf.c:1431
    #11 sbrelease_internal (sb=0xfffff802fa2d68a8, sb@entry=0xfffff802fa2d6760, 
        so=0xfffff802fa2d6760, so@entry=0xfffff802fa2d68a8)
        at ../../../kern/uipc_sockbuf.c:721
    #12 sbdestroy (sb=sb@entry=0xfffff802fa2d68a8, so=so@entry=0xfffff802fa2d6760)
        at ../../../kern/uipc_sockbuf.c:749
    #13 0xffffffff807edd39 in sofree (so=so@entry=0xfffff802fa2d6760)
        at ../../../kern/uipc_socket.c:1158
    #14 0xffffffff807ee824 in soclose (so=0xfffff802fa2d6760)
        at ../../../kern/uipc_socket.c:1235
    #15 0xffffffff806f8a91 in fo_close (fp=fp@entry=0xfffff80010895500, td=0xd4c, 
        td@entry=0xfffffe012053a000) at ../../../sys/file.h:377
    #16 _fdrop (fp=fp@entry=0xfffff80010895500, td=0xd4c, 
        td@entry=0xfffffe012053a000) at ../../../kern/kern_descrip.c:3510
    #17 0xffffffff806fbdcb in closef (fp=fp@entry=0xfffff80010895500, 
        td=td@entry=0xfffffe012053a000) at ../../../kern/kern_descrip.c:2828
    #18 0xffffffff806f8d92 in closefp_impl (fdp=<optimized out>, fd=4, 
        fp=0xfffff80010895500, td=0xfffffe012053a000, audit=true)
        at ../../../kern/kern_descrip.c:1271
    #19 closefp (fdp=<optimized out>, fd=4, fp=0xfffff80010895500, 
        td=0xfffffe012053a000, holdleaders=<optimized out>, audit=true)
        at ../../../kern/kern_descrip.c:1328
    #20 0xffffffff80b0621c in syscallenter (td=0xfffffe012053a000)
        at ../../../amd64/amd64/../../kern/subr_syscall.c:189
    #21 amd64_syscall (td=0xfffffe012053a000, traced=0)
        at ../../../amd64/amd64/trap.c:1156
    #22 <signal handler called>
    #23 0x000000080915b40a in ?? ()
    Backtrace stopped: Cannot access memory at address 0x7fffff4b1458


Crash #4 (based on the stacktrace, this may have been caused by emX traffic):
    NOTE: I use an out-of-tree copy of em-7.7.8 from Intel upstream, modified
          to compile under FreeBSD 13.0 (changes are trivial).
    Fatal trap 9: general protection fault while in kernel mode
    cpuid = 1; apic id = 02
    instruction pointer     = 0x20:0xffffffff8086e9dc
    stack pointer           = 0x28:0xfffffe00c5b9f840
    frame pointer           = 0x28:0xfffffe00c5b9f890
    code segment            = base rx0, limit 0xfffff, type 0x1b
                            = DPL 0, pres 1, long 1, def32 0, gran 1
    processor eflags        = interrupt enabled, resume, IOPL = 0
    current process         = 0 (em0 que)
    trap number             = 9
    panic: general protection fault
    cpuid = 1
    time = 1618440500
    KDB: stack backtrace:
    #0 0xffffffff8079e8f5 at kdb_backtrace+0x65
    #1 0xffffffff80752291 at vpanic+0x181
    #2 0xffffffff80752103 at panic+0x43
    #3 0xffffffff80b05a37 at trap_fatal+0x387
    #4 0xffffffff80b055cf at trap+0x71f
    #5 0xffffffff80addee8 at calltrap+0x8
    #6 0xffffffff8088c488 at netisr_dispatch_src+0xc8
    #7 0xffffffff8086ddd9 at ether_input+0x69
    #8 0xffffffff8086a69a at if_input+0xa
    #9 0xffffffff81b1f000 at em_rxeof+0x260
    #10 0xffffffff81b20380 at em_handle_que+0x40
    #11 0xffffffff807b25e1 at taskqueue_run_locked+0x181
    #12 0xffffffff807b38fc at taskqueue_thread_loop+0xac
    #13 0xffffffff8071189d at fork_exit+0x7d
    #14 0xffffffff80adeeee at fork_trampoline+0xe
    Uptime: 9m14s
    Dumping 819 out of 16146 MB:..2%..12%..22%..32%..42%..51%..61%..71%..81%..92%
    
    __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
    55	/usr/src/sys/amd64/include/pcpu_aux.h: No such file or directory.
    (kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
    #1  doadump (textdump=<optimized out>) at ../../../kern/kern_shutdown.c:399
    #2  0xffffffff80751e85 in kern_reboot (howto=260)
        at ../../../kern/kern_shutdown.c:486
    #3  0xffffffff80752300 in vpanic (fmt=<optimized out>, ap=<optimized out>)
        at ../../../kern/kern_shutdown.c:919
    #4  0xffffffff80752103 in panic (fmt=<unavailable>)
        at ../../../kern/kern_shutdown.c:843
    #5  0xffffffff80b05a37 in trap_fatal (frame=0xfffffe00c5b9f780, eva=0)
        at ../../../amd64/amd64/trap.c:915
    #6  0xffffffff80b055cf in trap (frame=0xfffffe00c5b9f780)
        at ../../../amd64/amd64/trap.c:576
    #7  <signal handler called>
    #8  ether_input_internal (ifp=0x5f48844900310210, m=0xfffff8039a9e9d00)
        at ../../../net/if_ethersubr.c:524
    #9  ether_nh_input (m=0xfffff8039a9e9d00) at ../../../net/if_ethersubr.c:739
    #10 0xffffffff8088c488 in netisr_dispatch_src (proto=proto@entry=5, 
        source=<optimized out>, source@entry=0, m=m@entry=0xfffff8039a9e9d00)
        at ../../../net/netisr.c:1143
    #11 0xffffffff8088c76f in netisr_dispatch (proto=2594086144, proto@entry=5, 
        m=0x2d, m@entry=0xfffff8039a9e9d00) at ../../../net/netisr.c:1234
    #12 0xffffffff8086ddd9 in ether_input (ifp=<optimized out>, 
        m=0xfffff8039a9e9d00) at ../../../net/if_ethersubr.c:830
    #13 0xffffffff8086a69a in if_input (ifp=0xfffff8039a9e9d00, sendmp=0x0)
        at ../../../net/if.c:4391
    #14 0xffffffff81b1f000 in em_rxeof () from /boot/modules/if_em_updated.ko
    #15 0xffffffff81b20380 in em_handle_que () from /boot/modules/if_em_updated.ko
    #16 0xffffffff807b25e1 in taskqueue_run_locked (queue=0xfffff80017500200, 
        queue@entry=0xfffff80002bdfa00) at ../../../kern/subr_taskqueue.c:476
    #17 0xffffffff807b38fc in taskqueue_thread_loop (arg=<optimized out>, 
        arg@entry=0xfffffe002014e6a0) at ../../../kern/subr_taskqueue.c:793
    #18 0xffffffff8071189d in fork_exit (
        callout=0xffffffff807b3850 <taskqueue_thread_loop>, 
        arg=0xfffffe002014e6a0, frame=0xfffffe00c5b9fb00)
        at ../../../kern/kern_fork.c:1069
    #19 <signal handler called>


Crash #5:
    Fatal trap 12: page fault while in kernel mode
    cpuid = 1; apic id = 02
    fault virtual address   = 0x0
    fault code              = supervisor read data, page not present
    instruction pointer     = 0x20:0xffffffff8047ae0d
    stack pointer           = 0x28:0xfffffe001d3fc550
    frame pointer           = 0x28:0xfffffe001d3fc590
    code segment            = base rx0, limit 0xfffff, type 0x1b
                            = DPL 0, pres 1, long 1, def32 0, gran 1
    processor eflags        = interrupt enabled, resume, IOPL = 0
    current process         = 12 (swi1: netisr 1)
    trap number             = 12
    panic: page fault
    cpuid = 1
    time = 1618441084
    KDB: stack backtrace:
    #0 0xffffffff8079e8f5 at kdb_backtrace+0x65
    #1 0xffffffff80752291 at vpanic+0x181
    #2 0xffffffff80752103 at panic+0x43
    #3 0xffffffff80b05a37 at trap_fatal+0x387
    #4 0xffffffff80b05a89 at trap_pfault+0x49
    #5 0xffffffff80b05109 at trap+0x259
    #6 0xffffffff80addee8 at calltrap+0x8
    #7 0xffffffff808a73a3 at ieee80211_parent_xmitpkt+0x13
    #8 0xffffffff808b988e at ieee80211_vap_pkt_send_dest+0x25e
    #9 0xffffffff808ba606 at ieee80211_vap_transmit+0x1d6
    #10 0xffffffff8086d82b at ether_output_frame+0xab
    #11 0xffffffff8086d727 at ether_output+0x6b7
    #12 0xffffffff808eb2e9 at ip_output_send+0x109
    #13 0xffffffff808eb062 at ip_output+0x12a2
    #14 0xffffffff808e8164 at ip_forward+0x394
    #15 0xffffffff808e7d89 at ip_input+0x6c9
    #16 0xffffffff8088cc1b at swi_net+0x12b
    #17 0xffffffff80714abd at ithread_loop+0x24d
    Uptime: 3m18s
    Dumping 849 out of 16146 MB:..2%..12%..21%..31%..42%..51%..61%..72%..81%..91%
    
    __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
    55	/usr/src/sys/amd64/include/pcpu_aux.h: No such file or directory.
    (kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
    #1  doadump (textdump=<optimized out>) at ../../../kern/kern_shutdown.c:399
    #2  0xffffffff80751e85 in kern_reboot (howto=260)
        at ../../../kern/kern_shutdown.c:486
    #3  0xffffffff80752300 in vpanic (fmt=<optimized out>, ap=<optimized out>)
        at ../../../kern/kern_shutdown.c:919
    #4  0xffffffff80752103 in panic (fmt=<unavailable>)
        at ../../../kern/kern_shutdown.c:843
    #5  0xffffffff80b05a37 in trap_fatal (frame=0xfffffe001d3fc490, eva=0)
        at ../../../amd64/amd64/trap.c:915
    #6  0xffffffff80b05a89 in trap_pfault (frame=frame@entry=0xfffffe001d3fc490, 
        usermode=false, signo=<optimized out>, signo@entry=0x0, 
        ucode=<optimized out>, ucode@entry=0x0) at ../../../amd64/amd64/trap.c:732
    #7  0xffffffff80b05109 in trap (frame=0xfffffe001d3fc490)
        at ../../../amd64/amd64/trap.c:398
    #8  <signal handler called>
    #9  ath_transmit (ic=<optimized out>, m=0xfffff801ed556200)
        at ../../../dev/ath/if_ath.c:3516
    #10 0xffffffff808a73a3 in ieee80211_parent_xmitpkt (ic=0x0, 
        ic@entry=0xfffffe00d844f000, m=m@entry=0xfffff8001e808300)
        at ../../../net80211/ieee80211_freebsd.c:717
    #11 0xffffffff808b988e in ieee80211_vap_pkt_send_dest (
        vap=vap@entry=0xfffff8001e266000, m=m@entry=0xfffff8001e808300, 
        ni=ni@entry=0xfffffe012c7b1000)
        at ../../../net80211/ieee80211_output.c:317
    #12 0xffffffff808ba606 in ieee80211_start_pkt (vap=0xfffff8001e266000, 
        m=0xfffff8001e808300) at ../../../net80211/ieee80211_output.c:474
    #13 ieee80211_vap_transmit (ifp=<optimized out>, m=<optimized out>)
        at ../../../net80211/ieee80211_output.c:534
    #14 0xffffffff8086d82b in ether_output_frame (
        ifp=ifp@entry=0xfffff8001e188000, m=0xfffffe012c7b1000)
        at ../../../net/if_ethersubr.c:511
    #15 0xffffffff8086d727 in ether_output (ifp=<optimized out>, 
        m=0xfffffe012c7b1000, dst=0xfffffe001d3fc8e0, ro=<optimized out>)
        at ../../../net/if_ethersubr.c:438
    #16 0xffffffff808eb2e9 in ip_output_send (inp=inp@entry=0x0, 
        ifp=0xfffff8001e188000, m=m@entry=0xfffff8001e808300, gw=<optimized out>, 
        gw@entry=0xfffffe001d3fc8e0, ro=<optimized out>, 
        ro@entry=0xfffffe001d3fc8c0, stamp_tag=<optimized out>)
        at ../../../netinet/ip_output.c:275
    #17 0xffffffff808eb062 in ip_output (m=m@entry=0xfffff8001e808300, 
        opt=<optimized out>, opt@entry=0x0, ro=<optimized out>, 
        ro@entry=0xfffffe001d3fc8c0, flags=flags@entry=1, imo=imo@entry=0x0, 
        inp=<optimized out>, inp@entry=0x0) at ../../../netinet/ip_output.c:812
    #18 0xffffffff808e8164 in ip_forward (m=0xfffff8001e808300, 
        srcrt=<optimized out>) at ../../../netinet/ip_input.c:1067
    #19 0xffffffff808e7d89 in ip_input (m=0x0) at ../../../netinet/ip_input.c:789
    #20 0xffffffff8088cc1b in netisr_process_workstream_proto (
        nwsp=<optimized out>, proto=1) at ../../../net/netisr.c:919
    #21 swi_net (arg=<optimized out>) at ../../../net/netisr.c:966
    #22 0xffffffff80714abd in intr_event_execute_handlers (p=<optimized out>, 
        ie=0xfffff80002826b00) at ../../../kern/kern_intr.c:1168
    #23 ithread_execute_handlers (p=<optimized out>, ie=0xfffff80002826b00)
        at ../../../kern/kern_intr.c:1181
    #24 ithread_loop (arg=arg@entry=0xfffff80002833ac0)
        at ../../../kern/kern_intr.c:1269
    #25 0xffffffff8071189d in fork_exit (
        callout=0xffffffff80714870 <ithread_loop>, arg=0xfffff80002833ac0, 
        frame=0xfffffe001d3fcb00) at ../../../kern/kern_fork.c:1069
    #26 <signal handler called>


Crash #6:
    Fatal trap 12: page fault while in kernel mode
    cpuid = 1; apic id = 02
    fault virtual address   = 0x388
    fault code              = supervisor read data, page not present
    instruction pointer     = 0x20:0xffffffff8088cc07
    stack pointer           = 0x28:0xfffffe001d3fc9c0
    frame pointer           = 0x28:0xfffffe001d3fca20
    code segment            = base rx0, limit 0xfffff, type 0x1b
                            = DPL 0, pres 1, long 1, def32 0, gran 1
    processor eflags        = interrupt enabled, resume, IOPL = 0
    current process         = 12 (swi1: netisr 1)
    trap number             = 12
    panic: page fault
    cpuid = 1
    time = 1618528473
    KDB: stack backtrace:
    #0 0xffffffff8079e8f5 at kdb_backtrace+0x65
    #1 0xffffffff80752291 at vpanic+0x181
    #2 0xffffffff80752103 at panic+0x43
    #3 0xffffffff80b05d07 at trap_fatal+0x387
    #4 0xffffffff80b05d59 at trap_pfault+0x49
    #5 0xffffffff80b053d9 at trap+0x259
    #6 0xffffffff80ade1b8 at calltrap+0x8
    #7 0xffffffff80714abd at ithread_loop+0x24d
    #8 0xffffffff8071189d at fork_exit+0x7d
    #9 0xffffffff80adf1be at fork_trampoline+0xe
    Uptime: 2m28s
    Dumping 781 out of 16146 MB:..3%..11%..21%..31%..41%..52%..62%..72%..82%..91%
    
    __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
    55	/usr/src/sys/amd64/include/pcpu_aux.h: No such file or directory.
    (kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
    #1  doadump (textdump=<optimized out>) at ../../../kern/kern_shutdown.c:399
    #2  0xffffffff80751e85 in kern_reboot (howto=260)
        at ../../../kern/kern_shutdown.c:486
    #3  0xffffffff80752300 in vpanic (fmt=<optimized out>, ap=<optimized out>)
        at ../../../kern/kern_shutdown.c:919
    #4  0xffffffff80752103 in panic (fmt=<unavailable>)
        at ../../../kern/kern_shutdown.c:843
    #5  0xffffffff80b05d07 in trap_fatal (frame=0xfffffe001d3fc900, eva=904)
        at ../../../amd64/amd64/trap.c:915
    #6  0xffffffff80b05d59 in trap_pfault (frame=frame@entry=0xfffffe001d3fc900, 
        usermode=false, signo=<optimized out>, signo@entry=0x0, 
        ucode=<optimized out>, ucode@entry=0x0) at ../../../amd64/amd64/trap.c:732
    #7  0xffffffff80b053d9 in trap (frame=0xfffffe001d3fc900)
        at ../../../amd64/amd64/trap.c:398
    #8  <signal handler called>
    #9  0xffffffff8088cc07 in netisr_process_workstream_proto (
        nwsp=<optimized out>, proto=1) at ../../../net/netisr.c:918
    #10 swi_net (arg=<optimized out>) at ../../../net/netisr.c:966
    #11 0xffffffff80714abd in intr_event_execute_handlers (p=<optimized out>, 
        ie=0xfffff80002826b00) at ../../../kern/kern_intr.c:1168
    #12 ithread_execute_handlers (p=<optimized out>, ie=0xfffff80002826b00)
        at ../../../kern/kern_intr.c:1181
    #13 ithread_loop (arg=arg@entry=0xfffff80002833ac0)
        at ../../../kern/kern_intr.c:1269
    #14 0xffffffff8071189d in fork_exit (
        callout=0xffffffff80714870 <ithread_loop>, arg=0xfffff80002833ac0, 
        frame=0xfffffe001d3fcb00) at ../../../kern/kern_fork.c:1069
    #15 <signal handler called>

-----------------------------------------------------------------------
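
To compare the dumps above side by side, the key fields of each "Fatal trap" block can be pulled out with a few lines of awk.  This is only a minimal sketch: panic_summary is a hypothetical helper name, not a FreeBSD tool, and it assumes the console text is fed in on stdin in the layout shown in the crashes above.

```shell
#!/bin/sh
# Summarize the key fields of a FreeBSD "Fatal trap" console dump read from stdin.
# panic_summary is a hypothetical helper name used for illustration only.
panic_summary() {
    awk -F'= *' '
        /fault virtual address/ { addr = $2 }
        /instruction pointer/   { ip = $2 }
        /current process/       { proc = $2 }
        END {
            sub(/^0x20:/, "", ip)   # drop the code-segment selector prefix
            printf "process=%s ip=%s fault_addr=%s\n", proc, ip, addr
        }'
}

# Sample input copied from crash #2 above.
panic_summary <<'EOF'
    fault virtual address   = 0x8
    instruction pointer     = 0x20:0xffffffff814f00a5
    current process         = 0 (dummynet)
EOF
```

Small fault addresses such as 0x0, 0x8, and 0x388 are consistent with reading a structure field through a NULL pointer, although the dumps alone do not prove that.  A general protection fault (crash #4) reports no fault address, so that field is left empty in the summary.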

I suspect the underlying flaw is tied to an interaction between divert(4), dummynet(4), and the wlan0 adapter.  Standard LAN traffic does not seem to trigger the panic, or at least does not trigger it as easily.  WLAN traffic triggers it very easily, usually within a minute or two of turning on the divert(4) rule, connecting a wireless station, and generating some wireless traffic.  I also suspect Snort is applying memory pressure somehow.  I am using the standard Talos ruleset (30-day delayed release, several months old).

This is how I start Snort-2.9.17:
snort -c /usr/local/etc/snort/snort.conf -i em0 -k none -A console -Q --daq ipfw --daq-mode inline --daq-var port=8000

And this is the divert(4) rule:
ipfw add 00049 divert 8000 all from any to any via em0

This is my NAT/dummynet configuration from the firewall:
/sbin/ipfw nat 1 config if em0 deny_in same_ports unreg_only reset
/sbin/ipfw pipe 1 config bw 294MBit/s burst 1048576        # Download pipe
/sbin/ipfw pipe 2 config bw 12MBit/s                       # Upload pipe
/sbin/ipfw sched 1 config pipe 1 type fq_codel target 5ms quantum 6000 flows 2048 interval 300 limit 15360 ecn
/sbin/ipfw sched 2 config pipe 2 type fq_codel ecn
/sbin/ipfw queue 01 config sched 2 weight 100              # Outbound TCP ACK
/sbin/ipfw queue 02 config sched 1 weight 100              # Inbound TCP ACK
/sbin/ipfw queue 03 config sched 2 weight  90              # Outbound HTTP/HTTPS/RSYNC
/sbin/ipfw queue 04 config sched 1 weight  90              # Inbound HTTP/HTTPS/RSYNC
/sbin/ipfw queue 05 config sched 2 weight  85              # Outbound DNS
/sbin/ipfw queue 06 config sched 1 weight  85              # Inbound DNS
/sbin/ipfw queue 07 config sched 2 weight  65              # Outbound Steam Client
/sbin/ipfw queue 08 config sched 1 weight  65              # Inbound Steam Client
/sbin/ipfw queue 09 config sched 2 weight  55              # Outbound IMAP/POP3/SMTP
/sbin/ipfw queue 10 config sched 1 weight  55              # Inbound IMAP/POP3/SMTP
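
The queue definitions above follow a regular pattern: each traffic class gets an odd-numbered queue on sched 2 (upload pipe) and an even-numbered queue on sched 1 (download pipe), both with the same weight.  As a rough sketch of that pattern, the loop below prints the equivalent commands without executing them; gen_queues is a hypothetical helper name, and the weights are simply the ones listed above.

```shell
#!/bin/sh
# Print (do not execute) the ipfw queue commands for each traffic class.
# gen_queues is a hypothetical helper; the weights are the ones listed above.
gen_queues() {
    n=1
    for weight in 100 90 85 65 55; do
        # Odd queue: outbound, attached to sched 2 (upload pipe).
        printf '/sbin/ipfw queue %02d config sched 2 weight %d\n' "$n" "$weight"
        # Even queue: inbound, attached to sched 1 (download pipe).
        printf '/sbin/ipfw queue %02d config sched 1 weight %d\n' "$((n + 1))" "$weight"
        n=$((n + 2))
    done
}

gen_queues
```

Printing rather than running the commands makes it possible to review the generated list on any machine before applying it on the router.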

That's about all I can think of that is relevant.  Please let me know if any additional information is needed.  The system is rolled back to FreeBSD 12.2, but I am keeping the FreeBSD 13.0 boot environment, so I can easily reboot into 13.0 and try any patches out.
Comment 1 Joshua Kinard 2021-04-16 00:45:44 UTC
Created attachment 224146 [details]
Config for my CUSTOM-13_0 kernel (autogenerated dump from a crash)

Copied from one of the /var/crash/core.txt.* files
Comment 2 Michael Meiszl 2021-04-21 09:44:30 UTC
(In reply to Joshua Kinard from comment #1)
I just reported an error that seems similar to yours.
Here it's more than just a router, so it's harder to locate. But basically I have also tracked it down to ipfw panicking on incoming packets.

Fatal trap 12: page fault while in kernel mode
cpuid = 4; apic id = 04
fault virtual address	= 0x388
fault code		= supervisor read data, page not present
instruction pointer	= 0x20:0xffffffff80d3fa67
stack pointer	        = 0x28:0xfffffe00df2feac0
frame pointer	        = 0x28:0xfffffe00df2feb20
code segment		= base rx0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 12 (swi1: netisr 0)
trap number		= 12
panic: page fault
cpuid = 4
time = 1618988377
KDB: stack backtrace:
#0 0xffffffff80c57345 at kdb_backtrace+0x65
#1 0xffffffff80c09d21 at vpanic+0x181
#2 0xffffffff80c09b93 at panic+0x43
#3 0xffffffff8108b187 at trap_fatal+0x387
#4 0xffffffff8108b1df at trap_pfault+0x4f
#5 0xffffffff8108a83d at trap+0x27d
#6 0xffffffff810617a8 at calltrap+0x8
#7 0xffffffff80bcae5d at ithread_loop+0x24d
#8 0xffffffff80bc7c5e at fork_exit+0x7e
#9 0xffffffff8106282e at fork_trampoline+0xe
Uptime: 2m6s

Runtime is random (it seems to depend on how much traffic is incoming, and since this is a central router and tunnel endpoint, it does not take long to crash).

The system stays alive with ipfw disabled, but of course this is not a viable option for this machine.

(at least we are TWO now with this problem....)
Comment 3 Michael Meiszl 2021-04-21 15:00:50 UTC
*** Bug 255295 has been marked as a duplicate of this bug. ***
Comment 4 Michael Meiszl 2021-04-21 15:04:38 UTC
Because playing around with a mission-critical machine is not a real option, I have set up some old hardware directly with 13.0 and am trying to reproduce the crashes. But it will take some time because I have to find some more parts to simulate a client generating heavy traffic behind this new router...
Comment 5 Joshua Kinard 2021-04-23 04:12:27 UTC
(In reply to Michael Meiszl from comment #4)

Yeah, in my case, using a divert(4) rule seemed to be part of a trigger condition, and that may be related to a similar bug reported in Bug #255164, which has a commit (652908599b6f) that addresses issues in ipfw and divert w/ the new unmapped mbuf feature.  I tried applying that patch, along with several others (Bug #254309, 703419774f86; Bug #255041, 9bacbf1ae243).  No dice.  Still able to reproduce random kernel crashes (not always a "panic" -- general protection fault this time).
Comment 6 Michael Meiszl 2021-04-23 04:29:44 UTC
The sad news is that my freshly installed test machine does not show any problems (yet).
So far, ipfw does not really have anything to filter on it; the packets are just passed on.
I will try to create a more challenging setup, maybe even moving the main (real) v6 tunnel to that machine. But this will interrupt all internet services in case of a crash. Maybe I had better wait for the weekend (or you might read "admin got killed by rioting users" soon).

My Testhardware:
I7-6400 / 32Gb
NVme 250Gb
3* Realtek (reX) cards
FBSD 13.0 (Generic, installed from ISO/Stick)

the "real" machine:
Ryzen 3400G / 32Gb
NVMe 250Gb / NVMe 2Tb
1* Intel Pro/10Gb (ix0,ix1) 
1* Realtek 1Gb (re0)
FBSD 12.2-RELEASE-p6 (Generic) working, Updated to 13.0 crashing
Comment 7 Jack 2021-04-23 04:33:43 UTC
I'm also getting random crashes with a similar setup

I have these in my custom kernel
device         if_bridge
options        LIBALIAS
options        IPFIREWALL
options        IPFIREWALL_DEFAULT_TO_ACCEPT
options        IPFIREWALL_NAT
options        IPDIVERT
options        IPSTEALTH

My ipfw rules
00101 allow ip from any to any via lo0
00102 divert 8668 ip from any to me in via igb1
00103 divert 8668 ip4 from 10.100.0.0/23 to not me out via igb1
00104 deny ip from any to any 25 via igb0
00200 deny ip from any to 127.0.0.0/8
00300 deny ip from 127.0.0.0/8 to any
65535 allow ip from any to any

rc.conf
natd_enable="YES"
natd_flags="-f /etc/natd.conf"
natd_interface="igb1"
gateway_enable="YES"
firewall_enable="YES"
firewall_type="OPEN"

/etc/natd.conf
use_sockets yes
same_ports yes
dynamic yes

I don't have debug turned on but ever since upgrading from 12.2 to 13.0-STABLE, it has been randomly crashing every few hours. The server is an NFS file server and PXE server so it doesn't see much external traffic, only lots of internal traffic.
Comment 8 Michael Meiszl 2021-04-24 05:27:15 UTC
Updated info: I once more tried to update the "real machine", disabling ipfw during the update process.
This worked (as it already had before).
Then I deleted all "table" rules, converted the more than 1000 entries to "normal" rules, and restarted ipfw.
This is still working (for some hours now; before, the machine did not survive even 5 minutes).

So I guess the problem is within the table processing of ipfw.

I will let it run for a few more days; then we can say whether this is the right place to look.
Comment 9 Michael Meiszl 2021-04-24 06:39:40 UTC
Yeah, I was too optimistic :-(
It worked, but only until I put firewall_enable="YES" into rc.conf and rebooted :-(
Soon after the reboot the machine locked up again; only the reset key helped me out (even the console was dead).

But, what is really strange, it works if I use firewall_enable="NO", reboot, log in manually and do a "service ipfw onestart".
Then it runs for hours and honors all those rules.

I don't see the real difference between those two starting methods, only that everything is already up and running if I start the firewall later on.

Maybe there is a strange and yet unknown dependency in 13???
Comment 10 Michael Meiszl 2021-04-25 05:13:27 UTC
Created attachment 224413 [details]
compressed crashlog

compressed crashlog
Comment 11 Michael Meiszl 2021-04-25 05:14:09 UTC
Update: started manually, the machine ran for 24 hours like a charm!

Maybe somebody else should try this simple trick (set firewall_enable to NO in rc.conf, reboot, log in and do "service ipfw onestart") and check if it works for them too?
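
For anyone who wants to try it, the two steps can be sketched roughly as follows.  This is only an illustration, not a tested procedure: run() and the DRY_RUN variable are hypothetical helpers, and by default the script only prints the commands instead of executing them.  sysrc(8) is used here as one way to edit rc.conf; editing the file by hand is equivalent.

```shell
#!/bin/sh
# Sketch of the manual-start workaround described above.
# DRY_RUN defaults to 1 so the script only prints what it would do;
# run() is a hypothetical wrapper, not part of FreeBSD.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

disable_boot_start() {
    # Stop rc(8) from starting ipfw during boot.
    run sysrc firewall_enable=NO
}

start_manually() {
    # After rebooting and logging in, start the firewall by hand.
    # "onestart" runs the rc script even though firewall_enable is NO.
    run service ipfw onestart
}

# In practice a reboot sits between these two steps.
disable_boot_start
start_manually
```

With DRY_RUN=0 the commands are actually executed and need root privileges; in that case call disable_boot_start before the reboot and start_manually after it rather than running both back to back.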

This morning, I was curious and thought maybe I could get the same effect by delaying the firewall loading at boot time by editing the /etc/rc.d/ipfw file. I tried "Required: LOGIN" and rebooted.
Sadly, it did not work; the machine crashed after 2 minutes again :-(

Turning it off and starting it manually as described above worked once more; it has been up and running for over an hour now (and I don't expect it to crash anymore).
But of course, this is not really an option. Usually the box runs forever unless a major update comes in and brings a new kernel. Then it reboots automatically, and I now always have to remember to start the firewall by hand afterwards.

I have supplied the latest kernel crash listing here as an attachment; maybe somebody can take a look at it and find out what is happening? I have some more boxes to update, but I won't start until this bug is fixed.
Comment 12 Jack 2021-04-26 19:10:03 UTC
My machine has been crashing nonstop every few hours. I changed net.link.ether.ipfw=1 to 0 and it hasn't crashed in 19 hours so far. Maybe related to that sysctl?
Comment 13 Jack 2021-04-26 20:15:40 UTC
I retract my previous statement about net.link.ether.ipfw=0 not causing a crash; the server just mysteriously crashed again even with that setting changed.
Comment 14 Mark Johnston freebsd_committer freebsd_triage 2021-05-10 13:52:35 UTC
I suspect this bug is the same as the one fixed by https://cgit.freebsd.org/src/commit/?id=a1fadf7de25b973a308b86d04c4ada4fa8be193f .  I've merged it to stable/13 - is anyone experiencing the problem able to test this, either by updating to the latest stable/13 or applying the commit directly to 13.0?
Comment 15 Michael Meiszl 2021-05-10 14:13:46 UTC
I would volunteer to test it, but "freebsd-update fetch" does not offer me anything to patch !?!?!?

What am I supposed to do to test it?
Comment 16 Mark Johnston freebsd_committer freebsd_triage 2021-05-10 14:28:03 UTC
(In reply to Michael Meiszl from comment #15)
freebsd-update can only be used to update to a new release or apply security/errata patches.  The aforementioned patch will be released as an erratum but that hasn't happened yet.

To test the patch you'd need to check out the FreeBSD 13.0 or stable/13 sources, apply the patch, and build a new kernel.  For instance:

$ git clone --branch=releng/13.0 https://git.freebsd.org/src.git ~/src
$ cd ~/src
$ git cherry-pick a1fadf7de25b973a308b86d04c4ada4fa8be193f
$ make buildkernel
$ sudo make installkernel
$ <reboot>

See also https://docs.freebsd.org/en/books/handbook/cutting-edge/
Comment 17 Michael Meiszl 2021-05-10 17:41:00 UTC
followed your orders. installed git, fetched everything and built a new kernel 
(the wanted changes are included, but the file is not really the same as the other article showed, so there must be more/other/different changes too)

Booting the new kernel worked (without firewall), it ran for an hour without problems (ipfw started manually with "onestart" as said before already).

Then I became bold and dared to write 'firewall_enable="YES"' into /etc/rc.conf and rebooted.

Sadly, like before, the machine did not survive more than 2mins again :-(
As before, the console shows "ipfw: pullup failed" one or more times, network connections are killed and then kernel does an automatic reboot.

As usual I have a crashlog, but this won't help much with this special kernel, I am afraid.

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address   = 0x8
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80ca599c
stack pointer           = 0x0:0xfffffe00349ba530
frame pointer           = 0x0:0xfffffe00349ba570
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (if_io_tqg_2)
trap number             = 12
panic: page fault
cpuid = 2
time = 1620667695
KDB: stack backtrace:
#0 0xffffffff80c57345 at kdb_backtrace+0x65
#1 0xffffffff80c09d21 at vpanic+0x181
#2 0xffffffff80c09b93 at panic+0x43
#3 0xffffffff8108b187 at trap_fatal+0x387
#4 0xffffffff8108b1df at trap_pfault+0x4f
#5 0xffffffff8108a83d at trap+0x27d
#6 0xffffffff810617a8 at calltrap+0x8
#7 0xffffffff80dbf0ae at tcp_do_segment+0x10ce
#8 0xffffffff80dbd21e at tcp_input+0xabe
#9 0xffffffff80dafc15 at ip_input+0x125
#10 0xffffffff80d3f2da at netisr_dispatch_src+0xca
#11 0xffffffff80d23a68 at ether_demux+0x148
#12 0xffffffff80d24dec at ether_nh_input+0x34c
#13 0xffffffff80d3f2da at netisr_dispatch_src+0xca
#14 0xffffffff80d23eb9 at ether_input+0x69
#15 0xffffffff80d3ba03 at iflib_rxeof+0xc63
#16 0xffffffff80d35d42 at _task_fn_rx+0x72
#17 0xffffffff80c55dad at gtaskqueue_run_locked+0x15d
Uptime: 1m28s

So, sadly, the bug is NOT FIXED with this change :-(
Too bad.
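
A reading aid for the trap output above, not a diagnosis: a fault virtual address as small as 0x8 is commonly what a read through a NULL structure pointer looks like, because the faulting address is then simply the offset of the member being read. The sketch below shows that arithmetic on a hypothetical `toy_obj` structure. Neither the structure nor the helper comes from the kernel, and which pointer was actually NULL inside tcp_do_segment cannot be determined from this backtrace alone.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure for illustration; not a kernel type. */
struct toy_obj {
	void *first;	/* offset 0 */
	void *second;	/* offset 8 on LP64 platforms such as amd64 */
};

/*
 * Return the address that a read of p->second would touch, computed
 * without dereferencing p, so it is safe to call with a NULL pointer.
 */
static uintptr_t
toy_member_addr(const struct toy_obj *p)
{
	return ((uintptr_t)p + offsetof(struct toy_obj, second));
}
```

Calling `toy_member_addr(NULL)` yields the member offset itself, which is the kind of value the trap handler reports as the fault virtual address when such a read happens in kernel mode.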
Comment 18 Michael Meiszl 2021-05-10 17:44:12 UTC
Maybe someone has an educated guess as to what the difference for ipfw is between being started from the rc script at boot time and being started later on by root from the console???

The only thing I can think of is a missing/wrong permission on a memory buffer or something.
Comment 19 Mark Johnston freebsd_committer freebsd_triage 2021-05-10 17:47:28 UTC
(In reply to Michael Meiszl from comment #18)
Assuming you do not already have it on, please compile an INVARIANTS kernel following the steps here, and try again: https://lists.freebsd.org/pipermail/freebsd-net/2021-May/058280.html
Comment 20 Michael Meiszl 2021-05-10 17:54:19 UTC
No, there was no INVARIANTS stuff in the git-ted sources.
I have added it now and am restarting from scratch...
Will report later.
It may take some time; the machine is needed for the next 2 hours, so no risk, no reboot...
Comment 21 Joshua Kinard 2021-05-10 18:23:13 UTC
(In reply to Mark Johnston from comment #14)
> I suspect this bug is the same as the one fixed by
> https://cgit.freebsd.org/src/commit/?id=a1fadf7de25b973a308b86d04c4ada4fa8be193f
> .  I've merged it to stable/13 - is anyone experiencing the problem able
> to test this, either by updating to the latest stable/13 or applying the
> commit directly to 13.0?

I am running a rebuild now of the kernel on my router appliance.  If I can still trigger the random crash after starting Snort up on a divert socket, then I'll rebuild again w/ the INVARIANTS option and see how that goes.
Comment 22 Joshua Kinard 2021-05-10 19:12:53 UTC
(In reply to Joshua Kinard from comment #21)
> I am running a rebuild now of the kernel on my router appliance.  If I
> can still trigger the random crash after starting Snort up on a divert
> socket, then I'll rebuild again w/ the INVARIANTS option and see how that
> goes.

So far, no crash yet.  I've been streaming a Youtube video on VXLANs for the past ~12mins across wireless with Snort running on a divert(4) socket, as well as traffic from my wired network, and the router has been up for ~29mins now.  This is the longest it's stayed online running 13.0-RELEASE under these conditions.  I will keep running my normal traffic load throughout the day to see if it stays up or eventually falls over.
Comment 23 Mark Johnston freebsd_committer freebsd_triage 2021-05-10 19:17:36 UTC
(In reply to Joshua Kinard from comment #22)
Thanks.  We'll release a patch for this in the next batch of security and errata updates for 13.0.
Comment 24 Jack 2021-05-10 19:27:37 UTC
Does this contain the patch?

13.0-STABLE FreeBSD 13.0-STABLE #0 stable/13-n245524-11af9a9cf930: Wed May  5 13:45:17 PDT 2021

uptime
12:27PM  up 4 days, 22:29, 1 user, load averages: 0.00, 0.01, 0.00

If so, the system has been stable for 4 days so far where it would be crashing every few hrs previously.
Comment 25 Mark Johnston freebsd_committer freebsd_triage 2021-05-10 19:30:03 UTC
(In reply to Jack from comment #24)
No.  Your crashes were caused/fixed by something else.  Can you share a few backtraces?
Comment 26 Michael Meiszl 2021-05-10 19:34:57 UTC
SNIFF! With INVARIANTS the machine does not even survive for a minute anymore :-(

Reboot after 25s with autoloaded ipfw (again, no crash with manually started ipfw!)

Since INVARIANTS adds some more debug options, I assume you would like to see the crash details???

THIS might be the line you are looking for?

panic: Assertion m->m_nextpkt == NULL failed at /root/src/sys/net/iflib.c:4087

(I may add that my ipfw does not use fancy stuff like divert rules, just plain, simple deny rules with addresses. Neither NAT nor anything else extraordinary with this firewall)

but you are obviously on the right track already, it complains about m@entry not being available
__curthread () at /root/src/sys/amd64/include/pcpu_aux.h:55
55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread () at /root/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=<optimized out>)
    at /root/src/sys/kern/kern_shutdown.c:399
#2  0xffffffff80bf580b in kern_reboot (howto=260)
    at /root/src/sys/kern/kern_shutdown.c:486
#3  0xffffffff80bf5c50 in vpanic (fmt=<optimized out>, ap=<optimized out>)
    at /root/src/sys/kern/kern_shutdown.c:919
#4  0xffffffff80bf59b3 in panic (fmt=<unavailable>)
    at /root/src/sys/kern/kern_shutdown.c:843
#5  0xffffffff80d29c5b in iflib_if_transmit (ifp=0xfffff80003dff800,
    m=0xfffff8005ce3ce00) at /root/src/sys/net/iflib.c:4087
#6  0xffffffff80d0fb9b in ether_output_frame (
    ifp=ifp@entry=0xfffff80003dff800, m=<unavailable>)
    at /root/src/sys/net/if_ethersubr.c:511
#7  0xffffffff80d0faa1 in ether_output (ifp=<optimized out>,
    ifp@entry=<error reading variable: value is not available>,
    m=<unavailable>,
    m@entry=<error reading variable: value is not available>,
    dst=0xfffffe003499c5a0,
    dst@entry=<error reading variable: value is not available>,
    ro=<optimized out>,
    ro@entry=<error reading variable: value is not available>)
    at /root/src/sys/net/if_ethersubr.c:438
#8  0xffffffff80da58ef in ip_output_send (inp=inp@entry=0x0,
    ifp=<unavailable>, ifp@entry=0xfffff80003dff800,
    m=m@entry=0xfffff8005ce3ce00, gw=gw@entry=0xfffffe003499c5a0,
    ro=<unavailable>, ro@entry=0x0, stamp_tag=<optimized out>)
    at /root/src/sys/netinet/ip_output.c:275
#9  0xffffffff80da55a5 in ip_output (m=0xfffff8005ce3ce00, m@entry=0x0,
    opt=opt@entry=0x0, ro=<optimized out>, ro@entry=0x0,
    flags=<optimized out>, flags@entry=0, imo=imo@entry=0x0,
    inp=<optimized out>, inp@entry=0x0)
    at /root/src/sys/netinet/ip_output.c:812
#10 0xffffffff80d92c59 in in_gif_output (ifp=ifp@entry=0xfffff80134802000,
    m=<optimized out>, m@entry=0xfffff8005cc87200, proto=<optimized out>,
    ecn=<optimized out>) at /root/src/sys/netinet/in_gif.c:306
#11 0xffffffff80d12350 in gif_transmit (ifp=0xfffff80134802000,
    m=0xfffff8005cc87200) at /root/src/sys/net/if_gif.c:380
#12 0xffffffff80df2b9b in ip6_forward (m=<unavailable>, srcrt=srcrt@entry=0)
    at /root/src/sys/netinet6/ip6_forward.c:387
#13 0xffffffff80df4414 in ip6_input (m=<unavailable>,
    m@entry=<error reading variable: value is not available>)
    at /root/src/sys/netinet6/ip6_input.c:897
#14 0xffffffff80d2cb11 in netisr_dispatch_src (proto=6,
    source=source@entry=0, m=0xfffff8005cc87200)
    at /root/src/sys/net/netisr.c:1143
#15 0xffffffff80d2ce5f in netisr_dispatch (proto=<unavailable>,
    m=<unavailable>) at /root/src/sys/net/netisr.c:1234
#16 0xffffffff80d0fd3e in ether_demux (ifp=ifp@entry=0xfffff80003dff800,
    m=<unavailable>) at /root/src/sys/net/if_ethersubr.c:923
#17 0xffffffff80d113cc in ether_input_internal (ifp=0xfffff80003dff800,
    m=<unavailable>) at /root/src/sys/net/if_ethersubr.c:709
#18 ether_nh_input (m=<optimized out>,
    m@entry=<error reading variable: value is not available>)
    at /root/src/sys/net/if_ethersubr.c:739
#19 0xffffffff80d2cb11 in netisr_dispatch_src (proto=proto@entry=5,
    source=source@entry=0, m=m@entry=0xfffff8005cc87200)
    at /root/src/sys/net/netisr.c:1143
#20 0xffffffff80d2ce5f in netisr_dispatch (proto=<unavailable>,
    proto@entry=5, m=<unavailable>, m@entry=0xfffff8005cc87200)
    at /root/src/sys/net/netisr.c:1234
#21 0xffffffff80d10231 in ether_input (ifp=0xfffff80003dff800,
    ifp@entry=<error reading variable: value is not available>,
    m=0xfffff8005cc87200,
    m@entry=<error reading variable: value is not available>)
    at /root/src/sys/net/if_ethersubr.c:830
#22 0xffffffff80d28bd7 in iflib_rxeof (rxq=<optimized out>,
    rxq@entry=0xfffff80003dcc000, budget=<optimized out>)
    at /root/src/sys/net/iflib.c:3006
#23 0xffffffff80d2274a in _task_fn_rx (context=0xfffff80003dcc000)
    at /root/src/sys/net/iflib.c:3949
#24 0xffffffff80c3ea77 in gtaskqueue_run_locked (
    queue=queue@entry=0xfffff80003988100)
    at /root/src/sys/kern/subr_gtaskqueue.c:371
#25 0xffffffff80c3e874 in gtaskqueue_thread_loop (
    arg=arg@entry=0xfffffe00379de008)
    at /root/src/sys/kern/subr_gtaskqueue.c:547
#26 0xffffffff80bb1f00 in fork_exit (
    callout=0xffffffff80c3e7e0 <gtaskqueue_thread_loop>,
    arg=0xfffffe00379de008, frame=0xfffffe003499cc00)
    at /root/src/sys/kern/kern_fork.c:1069
#27 <signal handler called>
(kgdb)
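
For readers unfamiliar with the assertion quoted in this comment: it checks that the packet handed to the transmit routine is not still linked to a following packet through its `m_nextpkt` field. The sketch below illustrates that invariant with a hypothetical `toy_pkt` structure and helpers. These names are assumptions for illustration only; this is not iflib code, and it says nothing about which code path left the link set in this particular crash.

```c
#include <stddef.h>

/* Hypothetical packet record; only the queue link matters here. */
struct toy_pkt {
	struct toy_pkt *nextpkt;	/* next packet in a queue, or NULL */
	int		id;
};

/*
 * Same shape as the failed assertion: a packet passed to the transmit
 * path must be a single packet, not the head of a list of packets.
 */
static int
toy_transmit_ok(const struct toy_pkt *p)
{
	return (p->nextpkt == NULL);
}

/*
 * Remove the first packet from a queue and clear its link, so that the
 * returned packet satisfies toy_transmit_ok().
 */
static struct toy_pkt *
toy_dequeue(struct toy_pkt **headp)
{
	struct toy_pkt *p = *headp;

	if (p != NULL) {
		*headp = p->nextpkt;
		p->nextpkt = NULL;
	}
	return (p);
}
```

A packet taken off a queue without clearing the link would fail `toy_transmit_ok()`, which is the situation an INVARIANTS kernel turns into a panic instead of letting it corrupt state later.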
Comment 27 Joshua Kinard 2021-05-10 19:52:31 UTC
(In reply to Michael Meiszl from comment #26)
> (I may add that my ipfw does not use fancy stuff like divert rules, just
> plain, simple deny rules with addresses. Neither NAT nor anything else
> extraordinary with this firewall)

It is possible that you are running into a different bug that may or may not already be patched.  I am running standard 13.0-RELEASE kernel sources with the following patches applied on top:

  * https://reviews.freebsd.org/D29772 (Bug #255041)
  * https://reviews.freebsd.org/D29838 (Bug #255164)
  * https://reviews.freebsd.org/D30129 (this bug)

You might want to extract patches from all three reviews and apply them on top of a clean set of kernel sources, then rebuild everything and try again.
Comment 28 Mark Johnston freebsd_committer freebsd_triage 2021-05-10 20:03:59 UTC
(In reply to Michael Meiszl from comment #26)
Indeed, this looks like an unrelated bug, especially if you are not using divert sockets anywhere.  Please open a new bug report and I can suggest some additional things to narrow this down there.
Comment 29 Jack 2021-05-10 20:15:59 UTC
(In reply to Mark Johnston from comment #25)
Unfortunately I don't have any debugging turned on and I haven't been able to catch it when it crashes (remote computer).

I'll upgrade the system again and see if it experiences any crashing again.
Comment 30 Michael Meiszl 2021-05-11 05:33:34 UTC
(In reply to Mark Johnston from comment #28)
ok, done
Comment 31 commit-hook freebsd_committer freebsd_triage 2021-06-16 14:02:41 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=bc6a2267fffeafd3946637607a74cfd639398f9d

commit bc6a2267fffeafd3946637607a74cfd639398f9d
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2021-06-16 13:46:56 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-06-16 13:46:56 +0000

    ipfw: Update the pfil mbuf pointer in ipfw_check_frame()

    ipfw_chk() might call m_pullup() and thus can change the mbuf chain
    head.  In this case, the new chain head has to be returned to the pfil
    hook caller, otherwise the pfil hook caller is left with a dangling
    pointer.

    Note that this affects only the link-layer hooks installed when the
    net.link.ether.ipfw sysctl is set to 1.

    PR:             256439, 254015, 255069, 255104
    Fixes:          f355cb3e6
    Reviewed by:    ae
    MFC after:      3 days
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D30764

 sys/netpfil/ipfw/ip_fw_pfil.c | 2 ++
 1 file changed, 2 insertions(+)
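
The commit message above describes a general pattern: a routine that may replace the head of a buffer chain has to hand the new head back to its caller. The following is a minimal userland sketch of that pattern, not the actual two-line change to ip_fw_pfil.c. `toy_buf`, `toy_pullup()` and `toy_check_frame()` are hypothetical names, and `toy_pullup()` only imitates the relevant property of m_pullup(): the old head is consumed, and the result is either a different buffer or NULL.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an mbuf; this is not the kernel's struct mbuf. */
struct toy_buf {
	size_t		 len;
	unsigned char	*data;
};

/* Allocate a toy buffer holding a copy of len bytes from src. */
static struct toy_buf *
toy_alloc(const void *src, size_t len)
{
	struct toy_buf *b = malloc(sizeof(*b));

	if (b == NULL)
		return (NULL);
	b->data = malloc(len != 0 ? len : 1);
	if (b->data == NULL) {
		free(b);
		return (NULL);
	}
	memcpy(b->data, src, len);
	b->len = len;
	return (b);
}

static void
toy_free(struct toy_buf *b)
{
	if (b != NULL) {
		free(b->data);
		free(b);
	}
}

/*
 * Toy analogue of m_pullup(): always consumes b. On success it returns a
 * different buffer with the same contents; on failure it returns NULL.
 */
static struct toy_buf *
toy_pullup(struct toy_buf *b, size_t need)
{
	struct toy_buf *n = NULL;

	if (b->len >= need)
		n = toy_alloc(b->data, b->len);
	toy_free(b);
	return (n);
}

/*
 * Toy filter hook. The caller passes the address of its buffer pointer so
 * that the hook can hand back a replaced head. Returns 0 to accept the
 * frame and -1 to drop it.
 */
static int
toy_check_frame(struct toy_buf **bp, size_t hdrlen)
{
	struct toy_buf *b = toy_pullup(*bp, hdrlen);

	/* The essential step: publish the new head (or NULL) to the caller. */
	*bp = b;
	return (b == NULL ? -1 : 0);
}
```

If the hook skipped the `*bp = b` assignment, the caller would keep using the pointer that `toy_pullup()` already freed, which is the dangling-pointer situation the commit message describes.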
Comment 32 Mark Johnston freebsd_committer freebsd_triage 2021-06-16 14:55:46 UTC
(In reply to commit-hook from comment #31)
Actually based on comment 22 the original bug has already been fixed.  We released an EN for it in FreeBSD-EN-21:12.divert.

I tagged this PR based on comment 12, but I see now that that panic is apparently unrelated to setting net.link.ether.ipfw=1.  So a new PR is needed if that panic is still occurring.

With respect to comment 30, the new PR is 255775.
Comment 33 commit-hook freebsd_committer freebsd_triage 2021-06-19 14:09:38 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=ed1acef3fe3053b418ce3e41036ccf24957253a4

commit ed1acef3fe3053b418ce3e41036ccf24957253a4
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2021-06-16 13:46:56 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-06-19 14:08:49 +0000

    ipfw: Update the pfil mbuf pointer in ipfw_check_frame()

    ipfw_chk() might call m_pullup() and thus can change the mbuf chain
    head.  In this case, the new chain head has to be returned to the pfil
    hook caller, otherwise the pfil hook caller is left with a dangling
    pointer.

    Note that this affects only the link-layer hooks installed when the
    net.link.ether.ipfw sysctl is set to 1.

    PR:             256439, 254015, 255069, 255104
    Fixes:          f355cb3e6
    Reviewed by:    ae
    Sponsored by:   The FreeBSD Foundation

    (cherry picked from commit bc6a2267fffeafd3946637607a74cfd639398f9d)

 sys/netpfil/ipfw/ip_fw_pfil.c | 2 ++
 1 file changed, 2 insertions(+)
Comment 34 Kubilay Kocak freebsd_committer freebsd_triage 2021-06-26 02:50:25 UTC

*** This bug has been marked as a duplicate of bug 254015 ***
Comment 35 commit-hook freebsd_committer freebsd_triage 2021-06-29 20:24:22 UTC
A commit in branch releng/13.0 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=4647d115ff849534c9d6712cc2da32509721e20e

commit 4647d115ff849534c9d6712cc2da32509721e20e
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2021-06-16 13:46:56 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-06-29 17:09:43 +0000

    ipfw: Update the pfil mbuf pointer in ipfw_check_frame()

    ipfw_chk() might call m_pullup() and thus can change the mbuf chain
    head.  In this case, the new chain head has to be returned to the pfil
    hook caller, otherwise the pfil hook caller is left with a dangling
    pointer.

    Note that this affects only the link-layer hooks installed when the
    net.link.ether.ipfw sysctl is set to 1.

    Approved by:    so
    Security:       EN-21:21.ipfw
    PR:             256439, 254015, 255069, 255104
    Fixes:          f355cb3e6
    Reviewed by:    ae
    Sponsored by:   The FreeBSD Foundation

    (cherry picked from commit bc6a2267fffeafd3946637607a74cfd639398f9d)
    (cherry picked from commit ed1acef3fe3053b418ce3e41036ccf24957253a4)

 sys/netpfil/ipfw/ip_fw_pfil.c | 2 ++
 1 file changed, 2 insertions(+)