Bug 236846 - FreeBSD 12.0-STABLE-p3 r345567: stf_output panic in epoch_exit_preempt
Summary: FreeBSD 12.0-STABLE-p3 r345567: stf_output panic in epoch_exit_preempt
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 12.0-RELEASE
Hardware: amd64 Any
: --- Affects Only Me
Assignee: Mark Johnston
URL:
Keywords: panic
Depends on:
Blocks:
 
Reported: 2019-03-28 05:26 UTC by Viktor Dukhovni
Modified: 2019-06-28 14:06 UTC (History)
7 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Viktor Dukhovni 2019-03-28 05:26:01 UTC
After recompiling the 11.2 code that led to kevent crashes
(<https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=234296#c31>)
natively on 12.0-RELEASE-p3, after running for 30 minutes to an
hour the code again triggered a kernel panic, but this time not
in kevent:

panic: vm_fault_hold: fault on nofault entry, addr: 0xfffffe00c9f87000
cpuid = 0
time = 1553747701
KDB: stack backtrace:
#0 0xffffffff80be7977 at kdb_backtrace+0x67
#1 0xffffffff80b9b563 at vpanic+0x1a3
#2 0xffffffff80b9b3b3 at panic+0x43
#3 0xffffffff80edd120 at unlock_and_deallocate+0
#4 0xffffffff80eda970 at vm_fault+0x60
#5 0xffffffff81074ae3 at trap_pfault+0x163
#6 0xffffffff81073fee at trap+0x29e
#7 0xffffffff8104f465 at calltrap+0x8
#8 0xffffffff80d26cdd at ip_input+0x45d
#9 0xffffffff80cbc576 at netisr_dispatch_src+0xd6
#10 0xffffffff80ca0e63 at ether_demux+0x163
#11 0xffffffff80ca1fc6 at ether_nh_input+0x346
#12 0xffffffff80cbc576 at netisr_dispatch_src+0xd6
#13 0xffffffff80ca1264 at ether_input+0x54
#14 0xffffffff80cb8726 at iflib_rxeof+0xa16
#15 0xffffffff80cb3556 at _task_fn_rx+0x76
#16 0xffffffff80be6204 at gtaskqueue_run_locked+0x144
#17 0xffffffff80be5e68 at gtaskqueue_thread_loop+0x98

This time I have a crash dump.  And, FWIW:

  $ addr2line -afi -e /usr/lib/debug/boot/kernel/kernel.debug 0xffffffff80d26cdd
  0xffffffff80d26cdd
  ip_input
  /usr/src/sys/netinet/ip_input.c:605

From kgdb:

(kgdb) fr 28       
#28 0xffffffff80d26cdd in ip_input (m=0xfffff80111e4ec00) at /usr/src/sys/netinet/ip_input.c:605
605             if (pfil_run_hooks(&V_inet_pfil_hook, &m, ifp, PFIL_IN, 0, NULL) != 0)

(kgdb) p *m
$2 = {{m_next = 0x0, m_slist = {sle_next = 0x0}, m_stailq = {stqe_next = 0x0}}, {m_nextpkt = 0x0, m_slistpkt = {sle_next = 0x0}, 
    m_stailqpkt = {stqe_next = 0x0}}, m_data = 0xfffff8051f18900e "E", m_len = 420, m_type = 1, m_flags = 3, {{m_pkthdr = {{
          snd_tag = 0xfffff80003d1e000, rcvif = 0xfffff80003d1e000}, tags = {slh_first = 0x0}, len = 420, flowid = 2776446732, 
        csum_flags = 251658240, fibnum = 0, cosqos = 0 '\000', rsstype = 63 '?', {rcv_tstmp = 0, {l2hlen = 0 '\000', 
            l3hlen = 0 '\000', l4hlen = 0 '\000', l5hlen = 0 '\000', spare = 0}}, PH_per = {
          eight = "\000\000\000\000\377\377\000", sixteen = {0, 0, 65535, 0}, thirtytwo = {0, 65535}, sixtyfour = {
            281470681743360}, unintptr = {281470681743360}, ptr = 0xffff00000000}, PH_loc = {
          eight = "\000\000\000\000\000\000\000", sixteen = {0, 0, 0, 0}, thirtytwo = {0, 0}, sixtyfour = {0}, unintptr = {0}, 
          ptr = 0x0}}, {m_ext = {{ext_count = 1, ext_cnt = 0x5443454c00000001}, 
          ext_buf = 0xfffff8051f189000 "\f\304z\340H\250\\E'tD\306\b", ext_size = 2048, ext_type = 1, ext_flags = 1, 
          ext_free = 0x0, ext_arg1 = 0x0, ext_arg2 = 0x0}, m_pktdat = 0xfffff80111e4ec58 "\001"}}, 
    m_dat = 0xfffff80111e4ec20 ""}}

(kgdb) p *ifp
$3 = {if_link = {cstqe_next = 0xfffff80111e4ec00}, if_clones = {le_next = 0x1, le_prev = 0x38}, if_groups = {cstqh_first = 0x1,
    cstqh_last = 0xfffff80003792000}, if_alloctype = 0 '\000', if_softc = 0xfffffe0075df26b0,
  if_llsoftc = 0xffffffff80cbc576 <netisr_dispatch_src+214>, if_l2com = 0xe74d00,
  if_dname = 0xffffffff80e71134 <mac_ifnet_create_mbuf+292> "\200<%=\020\240\201", if_dunit = -2113854840, if_index = 65535,
  if_index_reserved = -1, if_xname = "\000\b\000\000\000\000\000\000\000 y\003", <incomplete sequence \370\377\377>,
  if_description = 0x8 <error: Cannot access memory at address 0x8>, if_flags = 64086016, if_drv_flags = -2048,
  if_capabilities = 64086016, if_capenable = -2048, if_linkmib = 0xfffffe0075df26e0, if_linkmiblen = 18446744071575309923,
  if_refcount = 58269696, if_type = 0 '\000', if_addrlen = 248 '\370', if_hdrlen = 255 '\377', if_link_state = 255 '\377',
  if_mtu = 300215296, if_metric = 4294965249, if_baudrate = 18446735282211712000, if_hwassist = 18446735299613069312,
  if_epoch = -2197045696704, if_lastchange = {tv_sec = -2134237242, tv_usec = 512}, if_snd = {ifq_head = 0x7,
    ifq_tail = 0xfffffe0075df27c0, ifq_len = 50907712, ifq_maxlen = -2048, ifq_mtx = {lock_object = {
        lo_name = 0xfffff80111e4ec00 "", lo_flags = 5, lo_data = 0, lo_witness = 0x118}, mtx_lock = 5}, 
    ifq_drv_head = 0xfffff80003792000, ifq_drv_tail = 0x0, ifq_drv_len = 1977558928, ifq_drv_maxlen = -512, 
    altq_type = -2134129290, altq_flags = -1, altq_disc = 0xe74d00, altq_ifp = 0x0, altq_enqueue = 0x175df27c0, 
    altq_dequeue = 0xfffff80003792000, altq_request = 0x0, altq_clfier = 0xfffff80111e4ec00, altq_classify = 0xfffff80003d1e000, 
    altq_tbr = 0x0, altq_cdnr = 0xfffffe0075df27c0}, if_linktask = {ta_link = {stqe_next = 0xffffffff80ca1264 <ether_input+84>}, 
    ta_pending = 0, ta_priority = 0, ta_func = 0x1b2, ta_context = 0xfffff80003d1e000}, if_addr_lock = {lock_object = {
      lo_name = 0x1 <error: Cannot access memory at address 0x1>, lo_flags = 1977559200, lo_data = 4294966784, 
      lo_witness = 0xffffffff80cb8726 <iflib_rxeof+2582>}, mtx_lock = 18446741877785532224}, if_addrhead = {
    cstqh_first = 0xfffffe00b8ba7740, cstqh_last = 0xfffff80003d49800}, if_multiaddrs = {cstqh_first = 0xffffffffffff00e8, 
    cstqh_last = 0xfffff80003d3e140}, if_amcount = 64264192, if_addr = 0xfffff80003d13000, if_hw_addr = 0xe801b200000000, 
  if_broadcastaddr = 0xfffff80003d1e000 "", if_afdata_lock = {lock_object = {lo_name = 0xfffff80003d3e140 "", 
      lo_flags = 2776446732, lo_data = 251658240, lo_witness = 0x3f01000000ffff}, mtx_lock = 18446735281926513849}, if_afdata = {
    0xfffff8017eaaec01, 0xfffff80003d3e030, 0x18ffffffff, 0xfffff80003d3e000, 0xffffffff81a76540 <igb_sctx_init>, 
    0xfffff80003d1e000, 0xfffff801000001b2, 0x0, 0xfffff80003784000, 0xfffff80003d13000, 0xfffffe0075df2908, 0xfffff80003d3e000, 
    0xfffff80003784050, 0xfffffe0075df28e0, 0xffffffff80cb3556 <_task_fn_rx+118>, 0x0, 0xfffff80003784000, 0xfffff80003784000, 
    0xfffff80003d3e090, 0xfffffe0075df2900, 0xfffff80003784050, 0xfffffe0075df2940, 
    0xffffffff80be6204 <gtaskqueue_run_locked+324>, 0xfffffe0075df2940, 0xfffff80003784038, 0xfffff80003d3e090, 0x0,
    0xfffff80003784028, 0xfffff80003784038, 0xfffffe00041fd008, 0xffffffff81fe62e0 <proc0>, 0xfffff80003784000,
    0xffffffff80be5dd0 <gtaskqueue_thread_loop>, 0xfffffe0075df2970, 0xffffffff80be5e68 <gtaskqueue_thread_loop+152>,
    0xfffffe0075df2960, 0x202, 0xfffff80003792000, 0xfffffe0075df29c0, 0xfffffe0075df29b0, 0xffffffff80b5bf33 <fork_exit+131>,
    0x0}, if_afdata_initialized = 69193736, if_fib = 4294966784, if_vnet = 0xffffffff80be5dd0 <gtaskqueue_thread_loop>,
  if_home_vnet = 0x0, if_vlantrunk = 0xffffffff81ea6300 <tdq_cpu>, if_bpf = 0xffffffff81fe6820 <thread0_st>, if_pcount = 0,
  if_bridge = 0xffffffff8105045e <fork_trampoline+14>, if_lagg = 0x0, if_pf_kif = 0x0, if_carp = 0x0, if_label = 0x0,
  if_netmap = 0x0, if_output = 0x0, if_input = 0x0, if_bridge_input = 0x0, if_bridge_output = 0x0, if_bridge_linkstate = 0x0,
  if_start = 0x0, if_ioctl = 0x0, if_init = 0x0, if_resolvemulti = 0x0, if_qflush = 0x0, if_transmit = 0x0, if_reassign = 0x0,
  if_get_counter = 0x0, if_requestencap = 0x0, if_counters = {0x0, 0x0, 0x0, 0x0, 0x0, 0xfffff80003792000,
    0xffffffff81f74688 <sleepq_chains+4104>, 0x0, 0x0, 0xfffffe0075df2890, 0xfffffe0075df27c8, 0xfffff800036db000},
  if_hw_tsomax = 2159857853, if_hw_tsomaxsegcount = 4294967295, if_hw_tsomaxsegsize = 0, if_snd_tag_alloc = 0x0,
  if_snd_tag_modify = 0x0, if_snd_tag_query = 0x0, if_snd_tag_free = 0x0, if_pcp = 0 '\000', if_netdump_methods = 0x0,
  if_epoch_ctx = {data = {0x0, 0x0}}, if_addr_et = {datap = {0x0, 0x0, 0x0}, datai = {0}}, if_maddr_et = {datap = {0x0, 0x0,
      0x0}, datai = {0}}, if_ispare = {1, 0, 0, 0}}
Comment 1 Viktor Dukhovni 2019-03-28 15:35:57 UTC
I switched to the 12.0-STABLE kernel, but still essentially the same panic:

panic: vm_fault_hold: fault on nofault entry, addr: 0xfffffe0075df2000
cpuid = 6
time = 1553772208
KDB: stack backtrace:
#0 0xffffffff80bf1907 at kdb_backtrace+0x67
#1 0xffffffff80ba58c3 at vpanic+0x1a3
#2 0xffffffff80ba5713 at panic+0x43
#3 0xffffffff80ee9ed1 at vm_fault_hold+0x2801
#4 0xffffffff80ee7680 at vm_fault+0x60
#5 0xffffffff81082a92 at trap_pfault+0x162
#6 0xffffffff81081f6e at trap+0x29e
#7 0xffffffff8105c715 at calltrap+0x8
#8 0xffffffff80d32d73 at ip_input+0x463
#9 0xffffffff80cc70f6 at netisr_dispatch_src+0xd6
#10 0xffffffff80cab643 at ether_demux+0x163
#11 0xffffffff80cac7a6 at ether_nh_input+0x346
#12 0xffffffff80cc70f6 at netisr_dispatch_src+0xd6
#13 0xffffffff80caba44 at ether_input+0x54
#14 0xffffffff80cc3836 at iflib_rxeof+0xa66
#15 0xffffffff80cbdd06 at _task_fn_rx+0x76
#16 0xffffffff80bf0194 at gtaskqueue_run_locked+0x144
#17 0xffffffff80befdf8 at gtaskqueue_thread_loop+0x98
Uptime: 5h20m55s


addr2line -afi -e /usr/lib/debug/boot/kernel/kernel.debug 0xffffffff80d32d73
0xffffffff80d32d73
ip_input
/usr/src/sys/netinet/ip_input.c:606

The kgdb backtrace is:

#0  __curthread () at ./machine/pcpu.h:230
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:371
#2  0xffffffff80ba54ab in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:451
#3  0xffffffff80ba5923 in vpanic (fmt=<optimized out>, ap=0xfffffe000045c130) at /usr/src/sys/kern/kern_shutdown.c:877
#4  0xffffffff80ba5713 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:804
#5  0xffffffff80ee9ed1 in vm_fault_hold (map=<optimized out>, vaddr=18446741876663853056, fault_type=<optimized out>, 
    fault_flags=<optimized out>, m_hold=0x0) at /usr/src/sys/vm/vm_fault.c:598
#6  0xffffffff80ee7680 in vm_fault (map=0xfffff80003002000, vaddr=<optimized out>, fault_type=4 '\004', fault_flags=0)
    at /usr/src/sys/vm/vm_fault.c:548
#7  0xffffffff81082a92 in trap_pfault (frame=0xfffffe000045c460, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:844
#8  0xffffffff81081f6e in trap (frame=0xfffffe000045c460) at /usr/src/sys/amd64/amd64/trap.c:441
#9  <signal handler called>
#10 0xfffffe0075df2450 in ?? ()
#11 0xffffffff8201a908 in __pcpu ()
#12 0xffffffff8201a908 in __pcpu ()
#13 0xfffffe0000640208 in ?? ()
#14 0xfffff8000377b000 in ?? ()
#15 0xfffff80100000000 in ?? ()
#16 0x0000000000000034 in ?? ()
#17 0xfffff8000377b000 in ?? ()
#18 0xfffffe000045c630 in ?? ()
#19 0xfffffe00006489f8 in ?? ()
#20 0x0000000000000008 in ?? ()
#21 0xfffff804d4d6c100 in ?? ()
#22 0x0000000065270264 in ?? ()
#23 0xfffff8000377b000 in ?? ()
#24 0x0000000000000014 in ?? ()
#25 0xffffffff82014880 in ?? ()
#26 0xfffff804d4d6c166 in ?? ()
#27 0xfffffe000045c660 in ?? ()
#28 0xffffffff80d32d73 in ip_input (m=0xfffff804d4d6c100) at /usr/src/sys/netinet/ip_input.c:606
Backtrace stopped: frame did not save the PC
Comment 2 Viktor Dukhovni 2019-03-28 22:35:03 UTC
With fantastic help from Mark Johnston, the issue has been plausibly narrowed down to the new epoch_tracker code in 12.0 not handling IPv6 via stf0 robustly.  The crash dump with "option INVARIANTS" looks like:

panic: Assertion etd->et_td == (struct thread *)td failed at /usr/src/sys/sys/epoch_private.h:178
cpuid = 7
time = 1553806077
KDB: stack backtrace:
#0 0xffffffff80bbb047 at kdb_backtrace+0x67
#1 0xffffffff80b737c3 at vpanic+0x1a3
#2 0xffffffff80b735a3 at panic+0x43
#3 0xffffffff80c68357 at epoch_exit_preempt+0x1e7
#4 0xffffffff80c6b35d at if_addr_runlock+0x1d
#5 0xffffffff82e33b67 at stf_getsrcifa6+0x1a7
#6 0xffffffff82e33488 at stf_output+0x48
#7 0xffffffff80dc356f at ip6_output+0x1daf
#8 0xffffffff80d85d74 at tcp_output+0x1be4
#9 0xffffffff80d97739 at tcp_usr_disconnect+0x69
#10 0xffffffff80c07eec at soclose+0x9c
#11 0xffffffff80b1d86a at _fdrop+0x1a
#12 0xffffffff80b20abc at closef+0x1ec
#13 0xffffffff80b1dea0 at closefp+0xa0
#14 0xffffffff8103ca58 at amd64_syscall+0x278
#15 0xffffffff8101589d at fast_syscall_common+0x101

I am now testing a work-around patch...
Comment 3 Viktor Dukhovni 2019-03-29 18:54:19 UTC
The patch appears to have stabilized the system.  No panic yet with workloads similar to those that trigger panics in stock 12.0 kernels.  If it holds up for a few more days, I'll declare victory.
Comment 4 Mark Johnston freebsd_committer 2019-03-29 18:59:38 UTC
For reference, the patch is here: https://people.freebsd.org/~markj/patches/stable_12_epoch.diff

Written by HPS, it fixes the same problem as r340413 but without breaking the KBI.  IMO we should commit it directly to stable/12.
Comment 5 Mark Johnston freebsd_committer 2019-03-29 19:00:11 UTC
Trimming CCs.
Comment 6 Viktor Dukhovni 2019-04-01 06:22:09 UTC
My system continues to be stable, and perform well.  This ticket may not need to stay open, unless that's useful to ensure that the patch gets merged into 12.0.
Comment 7 commit-hook freebsd_committer 2019-04-01 14:19:49 UTC
A commit references this bug:

Author: markj
Date: Mon Apr  1 14:19:10 UTC 2019
New revision: 345764
URL: https://svnweb.freebsd.org/changeset/base/345764

Log:
  Fix if_(m)addr_rlock().

  The use of a per-ifnet epoch context meant that these KPIs were not
  reentrant.  This was fixed in head in r340413, but the change cannot
  be MFCed because it breaks the KBI by modifying struct thread.  This
  is a direct commit to stable/12 which uses a per-CPU mutex to fix
  the problem without changing the KBI.

  PR:		236846
  Submitted by:	hselasky
  Reported and tested by:	Viktor Dukhovni <ietf-dane@dukhovni.org>
  Reviewed by:	hselasky (previous version)
  Differential Revision:	https://reviews.freebsd.org/D19764

Changes:
  stable/12/sys/net/if.c
  stable/12/sys/net/if_var.h
Comment 8 Mark Johnston freebsd_committer 2019-04-01 14:20:57 UTC
(In reply to Viktor Dukhovni from comment #6)
The change is now in stable/12.  I'll keep the bug report open until we get this into 12.0.
Comment 9 Mark Johnston freebsd_committer 2019-06-19 17:52:57 UTC
This is now fixed in 12.0 with an EN:
https://svnweb.freebsd.org/changeset/base/349198
Comment 10 Viktor Dukhovni 2019-06-19 19:06:45 UTC
Quick question, if I may, I am new to FreeBSD and not familiar with what an "EN" is.  What do I need to do to deploy the "EN" code?  Is that available now, when 12.1 ships, or at some intermediate checkpoint?

Or is the idea that I switch by svn src from "stable" back to the releng branch and build that?
Comment 11 Kurt Jaeger freebsd_committer 2019-06-19 19:10:51 UTC
(In reply to Viktor Dukhovni from comment #10)
Your next action depends on the version of -stable you are running.

The fix for stable was in r345764. If you are after that version, you're fine.

If you run stable from before that version, upgrade to the latest -stable.
Comment 12 Mark Johnston freebsd_committer 2019-06-19 19:10:59 UTC
(In reply to Viktor Dukhovni from comment #10)
It is available as a binary patch for 12.0, via freebsd-update.  I should have linked to the errata advisory instead, which provides instructions for updating: https://www.freebsd.org/security/advisories/FreeBSD-EN-19:11.net.asc
Comment 13 Viktor Dukhovni 2019-06-19 19:21:46 UTC
Thanks!  Yes, that link makes it all clear.  I would prefer to switch back to releng and use freebsd-update, so I may do that soon.  Do you know whether the stf "tentative" fix is also in "releng"?  Or is that still only in stable?

  r346434 | markj | 2019-04-20 07:01:47 -0400 (Sat, 20 Apr 2019) | 3 lines

  MFC r345739:
  Do not perform DAD on stf(4) interfaces.
Comment 14 Mark Johnston freebsd_committer 2019-06-19 19:25:38 UTC
(In reply to Viktor Dukhovni from comment #13)
That is only available in stable.  It doesn't really qualify for an EN given its relative severity and the fact that it does not appear to be a regression.  It will be included with 12.1, which is due to be released at the beginning of November.