After recompiling the 11.2 code that led to kevent crashes (<https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=234296#c31>) natively on 12.0-RELEASE-p3, after running for 30 minutes to an hour the code again triggered a kernel panic, but this time not in kevent: panic: vm_fault_hold: fault on nofault entry, addr: 0xfffffe00c9f87000 cpuid = 0 time = 1553747701 KDB: stack backtrace: #0 0xffffffff80be7977 at kdb_backtrace+0x67 #1 0xffffffff80b9b563 at vpanic+0x1a3 #2 0xffffffff80b9b3b3 at panic+0x43 #3 0xffffffff80edd120 at unlock_and_deallocate+0 #4 0xffffffff80eda970 at vm_fault+0x60 #5 0xffffffff81074ae3 at trap_pfault+0x163 #6 0xffffffff81073fee at trap+0x29e #7 0xffffffff8104f465 at calltrap+0x8 #8 0xffffffff80d26cdd at ip_input+0x45d #9 0xffffffff80cbc576 at netisr_dispatch_src+0xd6 #10 0xffffffff80ca0e63 at ether_demux+0x163 #11 0xffffffff80ca1fc6 at ether_nh_input+0x346 #12 0xffffffff80cbc576 at netisr_dispatch_src+0xd6 #13 0xffffffff80ca1264 at ether_input+0x54 #14 0xffffffff80cb8726 at iflib_rxeof+0xa16 #15 0xffffffff80cb3556 at _task_fn_rx+0x76 #16 0xffffffff80be6204 at gtaskqueue_run_locked+0x144 #17 0xffffffff80be5e68 at gtaskqueue_thread_loop+0x98 This time I have a crash dump. And, FWIW: $ addr2line -afi -e /usr/lib/debug/boot/kernel/kernel.debug 0xffffffff80d26cdd 0xffffffff80d26cdd ip_input /usr/src/sys/netinet/ip_input.c:605 From kgdb: (kgdb) fr 28 #28 0xffffffff80d26cdd in ip_input (m=0xfffff80111e4ec00) at /usr/src/sys/netinet/ip_input.c:605 605 if (pfil_run_hooks(&V_inet_pfil_hook, &m, ifp, PFIL_IN, 0, NULL) != 0) (kgdb) p *m $2 = {{m_next = 0x0, m_slist = {sle_next = 0x0}, m_stailq = {stqe_next = 0x0}}, {m_nextpkt = 0x0, m_slistpkt = {sle_next = 0x0}, m_stailqpkt = {stqe_next = 0x0}}, m_data = 0xfffff8051f18900e "E", m_len = 420, m_type = 1, m_flags = 3, {{m_pkthdr = {{ snd_tag = 0xfffff80003d1e000, rcvif = 0xfffff80003d1e000}, tags = {slh_first = 0x0}, len = 420, flowid = 2776446732, csum_flags = 251658240, fibnum = 0, cosqos = 0 '\000', rsstype = 63 '?', {rcv_tstmp = 0, {l2hlen = 0 '\000', l3hlen = 0 '\000', l4hlen = 0 '\000', l5hlen = 0 '\000', spare = 0}}, PH_per = { eight = "\000\000\000\000\377\377\000", sixteen = {0, 0, 65535, 0}, thirtytwo = {0, 65535}, sixtyfour = { 281470681743360}, unintptr = {281470681743360}, ptr = 0xffff00000000}, PH_loc = { eight = "\000\000\000\000\000\000\000", sixteen = {0, 0, 0, 0}, thirtytwo = {0, 0}, sixtyfour = {0}, unintptr = {0}, ptr = 0x0}}, {m_ext = {{ext_count = 1, ext_cnt = 0x5443454c00000001}, ext_buf = 0xfffff8051f189000 "\f\304z\340H\250\\E'tD\306\b", ext_size = 2048, ext_type = 1, ext_flags = 1, ext_free = 0x0, ext_arg1 = 0x0, ext_arg2 = 0x0}, m_pktdat = 0xfffff80111e4ec58 "\001"}}, m_dat = 0xfffff80111e4ec20 ""}} (kgdb) p *ifp $3 = {if_link = {cstqe_next = 0xfffff80111e4ec00}, if_clones = {le_next = 0x1, le_prev = 0x38}, if_groups = {cstqh_first = 0x1, cstqh_last = 0xfffff80003792000}, if_alloctype = 0 '\000', if_softc = 0xfffffe0075df26b0, if_llsoftc = 0xffffffff80cbc576 <netisr_dispatch_src+214>, if_l2com = 0xe74d00, if_dname = 0xffffffff80e71134 <mac_ifnet_create_mbuf+292> "\200<%=\020\240\201", if_dunit = -2113854840, if_index = 65535, if_index_reserved = -1, if_xname = "\000\b\000\000\000\000\000\000\000 y\003", <incomplete sequence \370\377\377>, if_description = 0x8 <error: Cannot access memory at address 0x8>, if_flags = 64086016, if_drv_flags = -2048, if_capabilities = 64086016, if_capenable = -2048, if_linkmib = 0xfffffe0075df26e0, if_linkmiblen = 18446744071575309923, if_refcount = 58269696, if_type = 0 '\000', if_addrlen = 248 '\370', if_hdrlen = 255 '\377', if_link_state = 255 '\377', if_mtu = 300215296, if_metric = 4294965249, if_baudrate = 18446735282211712000, if_hwassist = 18446735299613069312, if_epoch = -2197045696704, if_lastchange = {tv_sec = -2134237242, tv_usec = 512}, if_snd = {ifq_head = 0x7, ifq_tail = 0xfffffe0075df27c0, ifq_len = 50907712, ifq_maxlen = -2048, ifq_mtx = {lock_object = { lo_name = 0xfffff80111e4ec00 "", lo_flags = 5, lo_data = 0, lo_witness = 0x118}, mtx_lock = 5}, ifq_drv_head = 0xfffff80003792000, ifq_drv_tail = 0x0, ifq_drv_len = 1977558928, ifq_drv_maxlen = -512, altq_type = -2134129290, altq_flags = -1, altq_disc = 0xe74d00, altq_ifp = 0x0, altq_enqueue = 0x175df27c0, altq_dequeue = 0xfffff80003792000, altq_request = 0x0, altq_clfier = 0xfffff80111e4ec00, altq_classify = 0xfffff80003d1e000, altq_tbr = 0x0, altq_cdnr = 0xfffffe0075df27c0}, if_linktask = {ta_link = {stqe_next = 0xffffffff80ca1264 <ether_input+84>}, ta_pending = 0, ta_priority = 0, ta_func = 0x1b2, ta_context = 0xfffff80003d1e000}, if_addr_lock = {lock_object = { lo_name = 0x1 <error: Cannot access memory at address 0x1>, lo_flags = 1977559200, lo_data = 4294966784, lo_witness = 0xffffffff80cb8726 <iflib_rxeof+2582>}, mtx_lock = 18446741877785532224}, if_addrhead = { cstqh_first = 0xfffffe00b8ba7740, cstqh_last = 0xfffff80003d49800}, if_multiaddrs = {cstqh_first = 0xffffffffffff00e8, cstqh_last = 0xfffff80003d3e140}, if_amcount = 64264192, if_addr = 0xfffff80003d13000, if_hw_addr = 0xe801b200000000, if_broadcastaddr = 0xfffff80003d1e000 "", if_afdata_lock = {lock_object = {lo_name = 0xfffff80003d3e140 "", lo_flags = 2776446732, lo_data = 251658240, lo_witness = 0x3f01000000ffff}, mtx_lock = 18446735281926513849}, if_afdata = { 0xfffff8017eaaec01, 0xfffff80003d3e030, 0x18ffffffff, 0xfffff80003d3e000, 0xffffffff81a76540 <igb_sctx_init>, 0xfffff80003d1e000, 0xfffff801000001b2, 0x0, 0xfffff80003784000, 0xfffff80003d13000, 0xfffffe0075df2908, 0xfffff80003d3e000, 0xfffff80003784050, 0xfffffe0075df28e0, 0xffffffff80cb3556 <_task_fn_rx+118>, 0x0, 0xfffff80003784000, 0xfffff80003784000, 0xfffff80003d3e090, 0xfffffe0075df2900, 0xfffff80003784050, 0xfffffe0075df2940, 0xffffffff80be6204 <gtaskqueue_run_locked+324>, 0xfffffe0075df2940, 0xfffff80003784038, 0xfffff80003d3e090, 0x0, 0xfffff80003784028, 0xfffff80003784038, 0xfffffe00041fd008, 0xffffffff81fe62e0 <proc0>, 0xfffff80003784000, 0xffffffff80be5dd0 <gtaskqueue_thread_loop>, 0xfffffe0075df2970, 0xffffffff80be5e68 <gtaskqueue_thread_loop+152>, 0xfffffe0075df2960, 0x202, 0xfffff80003792000, 0xfffffe0075df29c0, 0xfffffe0075df29b0, 0xffffffff80b5bf33 <fork_exit+131>, 0x0}, if_afdata_initialized = 69193736, if_fib = 4294966784, if_vnet = 0xffffffff80be5dd0 <gtaskqueue_thread_loop>, if_home_vnet = 0x0, if_vlantrunk = 0xffffffff81ea6300 <tdq_cpu>, if_bpf = 0xffffffff81fe6820 <thread0_st>, if_pcount = 0, if_bridge = 0xffffffff8105045e <fork_trampoline+14>, if_lagg = 0x0, if_pf_kif = 0x0, if_carp = 0x0, if_label = 0x0, if_netmap = 0x0, if_output = 0x0, if_input = 0x0, if_bridge_input = 0x0, if_bridge_output = 0x0, if_bridge_linkstate = 0x0, if_start = 0x0, if_ioctl = 0x0, if_init = 0x0, if_resolvemulti = 0x0, if_qflush = 0x0, if_transmit = 0x0, if_reassign = 0x0, if_get_counter = 0x0, if_requestencap = 0x0, if_counters = {0x0, 0x0, 0x0, 0x0, 0x0, 0xfffff80003792000, 0xffffffff81f74688 <sleepq_chains+4104>, 0x0, 0x0, 0xfffffe0075df2890, 0xfffffe0075df27c8, 0xfffff800036db000}, if_hw_tsomax = 2159857853, if_hw_tsomaxsegcount = 4294967295, if_hw_tsomaxsegsize = 0, if_snd_tag_alloc = 0x0, if_snd_tag_modify = 0x0, if_snd_tag_query = 0x0, if_snd_tag_free = 0x0, if_pcp = 0 '\000', if_netdump_methods = 0x0, if_epoch_ctx = {data = {0x0, 0x0}}, if_addr_et = {datap = {0x0, 0x0, 0x0}, datai = {0}}, if_maddr_et = {datap = {0x0, 0x0, 0x0}, datai = {0}}, if_ispare = {1, 0, 0, 0}}
I switched to the 12.0-STABLE kernel, but still essentially the same panic: panic: vm_fault_hold: fault on nofault entry, addr: 0xfffffe0075df2000 cpuid = 6 time = 1553772208 KDB: stack backtrace: #0 0xffffffff80bf1907 at kdb_backtrace+0x67 #1 0xffffffff80ba58c3 at vpanic+0x1a3 #2 0xffffffff80ba5713 at panic+0x43 #3 0xffffffff80ee9ed1 at vm_fault_hold+0x2801 #4 0xffffffff80ee7680 at vm_fault+0x60 #5 0xffffffff81082a92 at trap_pfault+0x162 #6 0xffffffff81081f6e at trap+0x29e #7 0xffffffff8105c715 at calltrap+0x8 #8 0xffffffff80d32d73 at ip_input+0x463 #9 0xffffffff80cc70f6 at netisr_dispatch_src+0xd6 #10 0xffffffff80cab643 at ether_demux+0x163 #11 0xffffffff80cac7a6 at ether_nh_input+0x346 #12 0xffffffff80cc70f6 at netisr_dispatch_src+0xd6 #13 0xffffffff80caba44 at ether_input+0x54 #14 0xffffffff80cc3836 at iflib_rxeof+0xa66 #15 0xffffffff80cbdd06 at _task_fn_rx+0x76 #16 0xffffffff80bf0194 at gtaskqueue_run_locked+0x144 #17 0xffffffff80befdf8 at gtaskqueue_thread_loop+0x98 Uptime: 5h20m55s addr2line -afi -e /usr/lib/debug/boot/kernel/kernel.debug 0xffffffff80d32d73 0xffffffff80d32d73 ip_input /usr/src/sys/netinet/ip_input.c:606 The kgdb backtrace is: #0 __curthread () at ./machine/pcpu.h:230 #1 doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:371 #2 0xffffffff80ba54ab in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:451 #3 0xffffffff80ba5923 in vpanic (fmt=<optimized out>, ap=0xfffffe000045c130) at /usr/src/sys/kern/kern_shutdown.c:877 #4 0xffffffff80ba5713 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:804 #5 0xffffffff80ee9ed1 in vm_fault_hold (map=<optimized out>, vaddr=18446741876663853056, fault_type=<optimized out>, fault_flags=<optimized out>, m_hold=0x0) at /usr/src/sys/vm/vm_fault.c:598 #6 0xffffffff80ee7680 in vm_fault (map=0xfffff80003002000, vaddr=<optimized out>, fault_type=4 '\004', fault_flags=0) at /usr/src/sys/vm/vm_fault.c:548 #7 0xffffffff81082a92 in trap_pfault (frame=0xfffffe000045c460, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:844 #8 0xffffffff81081f6e in trap (frame=0xfffffe000045c460) at /usr/src/sys/amd64/amd64/trap.c:441 #9 <signal handler called> #10 0xfffffe0075df2450 in ?? () #11 0xffffffff8201a908 in __pcpu () #12 0xffffffff8201a908 in __pcpu () #13 0xfffffe0000640208 in ?? () #14 0xfffff8000377b000 in ?? () #15 0xfffff80100000000 in ?? () #16 0x0000000000000034 in ?? () #17 0xfffff8000377b000 in ?? () #18 0xfffffe000045c630 in ?? () #19 0xfffffe00006489f8 in ?? () #20 0x0000000000000008 in ?? () #21 0xfffff804d4d6c100 in ?? () #22 0x0000000065270264 in ?? () #23 0xfffff8000377b000 in ?? () #24 0x0000000000000014 in ?? () #25 0xffffffff82014880 in ?? () #26 0xfffff804d4d6c166 in ?? () #27 0xfffffe000045c660 in ?? () #28 0xffffffff80d32d73 in ip_input (m=0xfffff804d4d6c100) at /usr/src/sys/netinet/ip_input.c:606 Backtrace stopped: frame did not save the PC
With fantastic help from Mark Johnston, the issue has been plausibly narrowed down to the new epoch_tracker code in 12.0 not handling IPv6 via stf0 robustly. The crash dump with "option INVARIANTS" looks like: panic: Assertion etd->et_td == (struct thread *)td failed at /usr/src/sys/sys/epoch_private.h:178 cpuid = 7 time = 1553806077 KDB: stack backtrace: #0 0xffffffff80bbb047 at kdb_backtrace+0x67 #1 0xffffffff80b737c3 at vpanic+0x1a3 #2 0xffffffff80b735a3 at panic+0x43 #3 0xffffffff80c68357 at epoch_exit_preempt+0x1e7 #4 0xffffffff80c6b35d at if_addr_runlock+0x1d #5 0xffffffff82e33b67 at stf_getsrcifa6+0x1a7 #6 0xffffffff82e33488 at stf_output+0x48 #7 0xffffffff80dc356f at ip6_output+0x1daf #8 0xffffffff80d85d74 at tcp_output+0x1be4 #9 0xffffffff80d97739 at tcp_usr_disconnect+0x69 #10 0xffffffff80c07eec at soclose+0x9c #11 0xffffffff80b1d86a at _fdrop+0x1a #12 0xffffffff80b20abc at closef+0x1ec #13 0xffffffff80b1dea0 at closefp+0xa0 #14 0xffffffff8103ca58 at amd64_syscall+0x278 #15 0xffffffff8101589d at fast_syscall_common+0x101 I am now testing a work-around patch...
The patch appears to have stabilized the system. No panic yet with workloads similar to those that trigger panics in stock 12.0 kernels. If it holds up for a few more days, I'll declare victory.
For reference, the patch is here: https://people.freebsd.org/~markj/patches/stable_12_epoch.diff Written by HPS, it fixes the same problem as r340413 but without breaking the KBI. IMO we should commit it directly to stable/12.
Trimming CCs.
My system continues to be stable, and perform well. This ticket may not need to stay open, unless that's useful to ensure that the patch gets merged into 12.0.
A commit references this bug: Author: markj Date: Mon Apr 1 14:19:10 UTC 2019 New revision: 345764 URL: https://svnweb.freebsd.org/changeset/base/345764 Log: Fix if_(m)addr_rlock(). The use of a per-ifnet epoch context meant that these KPIs were not reentrant. This was fixed in head in r340413, but the change cannot be MFCed because it breaks the KBI by modifying struct thread. This is a direct commit to stable/12 which uses a per-CPU mutex to fix the problem without changing the KBI. PR: 236846 Submitted by: hselasky Reported and tested by: Viktor Dukhovni <ietf-dane@dukhovni.org> Reviewed by: hselasky (previous version) Differential Revision: https://reviews.freebsd.org/D19764 Changes: stable/12/sys/net/if.c stable/12/sys/net/if_var.h
(In reply to Viktor Dukhovni from comment #6) The change is now in stable/12. I'll keep the bug report open until we get this into 12.0.
This is now fixed in 12.0 with an EN: https://svnweb.freebsd.org/changeset/base/349198
Quick question, if I may, I am new to FreeBSD and not familiar with what an "EN" is. What do I need to do to deploy the "EN" code? Is that available now, when 12.1 ships, or at some intermediate checkpoint? Or is the idea that I switch by svn src from "stable" back to the releng branch and build that?
(In reply to Viktor Dukhovni from comment #10) Your next action depends on the version of -stable you are running. The fix for stable was in r345764. If you are after that version, you're fine. If you run stable from before that version, upgrade to the latest -stable.
(In reply to Viktor Dukhovni from comment #10) It is available as a binary patch for 12.0, via freebsd-update. I should have linked to the errata advisory instead, which provides instructions for updating: https://www.freebsd.org/security/advisories/FreeBSD-EN-19:11.net.asc
Thanks! Yes, that link makes it all clear. I would prefer to switch back to releng and use freebsd-update, so I may do that soon. Do you know whether the stf "tentative" fix is also in "releng"? Or is that still only in stable? r346434 | markj | 2019-04-20 07:01:47 -0400 (Sat, 20 Apr 2019) | 3 lines MFC r345739: Do not perform DAD on stf(4) interfaces.
(In reply to Viktor Dukhovni from comment #13) That is only available in stable. It doesn't really qualify for an EN given its relative severity and the fact that it does not appear to be a regression. It will be included with 12.1, which is due to be released at the beginning of November.