Bug 244241 - ng_eiface: panic: epoch_wait_preempt() called in the middle of an epoch section of the same epoch
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: amd64 Any
Importance: --- Affects Many People
Assignee: Gleb Smirnoff
URL:
Keywords: crash, regression
Reported: 2020-02-20 09:55 UTC by Aleksandr Fedorov
Modified: 2020-02-21 04:26 UTC

Description Aleksandr Fedorov 2020-02-20 09:55:43 UTC
After creating 10-20 ngethN interfaces (ngctl mkpeer . eiface test ether), I observe the following panic:

Unread portion of the kernel message buffer:
panic: epoch_wait_preempt() called in the middle of an epoch section of the same epoch
cpuid = 2
time = 1582189253
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe027869a520
vpanic() at vpanic+0x185/frame 0xfffffe027869a580
panic() at panic+0x43/frame 0xfffffe027869a5e0
epoch_wait_preempt() at epoch_wait_preempt+0x293/frame 0xfffffe027869a640
if_alloc_domain() at if_alloc_domain+0x1d0/frame 0xfffffe027869a680
ng_eiface_constructor() at ng_eiface_constructor+0x74/frame 0xfffffe027869a6c0
ng_make_node() at ng_make_node+0xba/frame 0xfffffe027869a6f0
ng_mkpeer() at ng_mkpeer+0x24/frame 0xfffffe027869a740
ng_apply_item() at ng_apply_item+0x49d/frame 0xfffffe027869a7c0
ng_snd_item() at ng_snd_item+0x2b0/frame 0xfffffe027869a800
ngc_send() at ngc_send+0x1b7/frame 0xfffffe027869a8b0
sosend_generic() at sosend_generic+0x44c/frame 0xfffffe027869a960
sosend() at sosend+0x66/frame 0xfffffe027869a990 
kern_sendit() at kern_sendit+0x21d/frame 0xfffffe027869aa20
sendit() at sendit+0x1d5/frame 0xfffffe027869aa70
sys_sendto() at sys_sendto+0x4d/frame 0xfffffe027869aac0
amd64_syscall() at amd64_syscall+0x168/frame 0xfffffe027869abf0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe027869abf0
--- syscall (133, FreeBSD ELF64, sys_sendto), rip = 0x800475c3a, rsp = 0x7fffffffce08, rbp = 0x7fffffffce50 ---
Uptime: 18h46m4s
Dumping 8823 out of 130910 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at /afedorov/freebsd-develop/sys/amd64/include/pcpu_aux.h:55
55      /afedorov/freebsd-develop/sys/amd64/include/pcpu_aux.h: No such file or directory.
(kgdb) bt
#0  __curthread () at /afedorov/freebsd-develop/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=1) at /afedorov/freebsd-develop/sys/kern/kern_shutdown.c:393
#2  0xffffffff80bc7c50 in kern_reboot (howto=260) at /afedorov/freebsd-develop/sys/kern/kern_shutdown.c:480
#3  0xffffffff80bc80ad in vpanic (fmt=<optimized out>, ap=<optimized out>) at /afedorov/freebsd-develop/sys/kern/kern_shutdown.c:910
#4  0xffffffff80bc7e03 in panic (fmt=<unavailable>) at /afedorov/freebsd-develop/sys/kern/kern_shutdown.c:836
#5  0xffffffff80c0c9e3 in epoch_wait_preempt (epoch=0xfffff8010691ef00) at /afedorov/freebsd-develop/sys/kern/subr_epoch.c:610
#6  0xffffffff80cd1fe0 in if_alloc_domain (type=6 '\006', numa_domain=255) at /afedorov/freebsd-develop/sys/net/if.c:541
#7  0xffffffff82e6a094 in ng_eiface_constructor (node=0xfffff801fa8df900) at /afedorov/freebsd-develop/sys/netgraph/ng_eiface.c:393
#8  0xffffffff82e6d0ba in ng_make_node (typename=0xfffff801f5238538 "eiface", nodepp=0xfffffe027869a708) at /afedorov/freebsd-develop/sys/netgraph/ng_base.c:618
#9  0xffffffff82e71d74 in ng_mkpeer (node=0xfffff801fa8df800, name=0xfffff801f5238558 "test", name2=0xfffff801f5238578 "ether", type=<unavailable>)
    at /afedorov/freebsd-develop/sys/netgraph/ng_base.c:1547
#10 0xffffffff82e6fbbd in ng_generic_msg (here=0xfffff801fa8df800, item=0xfffff812b7f86080, lasthook=<optimized out>) at /afedorov/freebsd-develop/sys/netgraph/ng_base.c:2538
#11 ng_apply_item (node=0xfffff801fa8df800, item=0xfffff812b7f86080, rw=<optimized out>) at /afedorov/freebsd-develop/sys/netgraph/ng_base.c:2438
#12 0xffffffff82e6f520 in ng_snd_item (item=0xfffff812b7f86080, flags=0) at /afedorov/freebsd-develop/sys/netgraph/ng_base.c:2321
#13 0xffffffff82e79db7 in ngc_send (so=<optimized out>, flags=<optimized out>, m=0xfffff802c6a2b000, addr=<optimized out>, control=<optimized out>, td=<optimized out>)
    at /afedorov/freebsd-develop/sys/netgraph/ng_socket.c:342
#14 0xffffffff80c678fc in sosend_generic (so=0xfffffe01e5d28710, addr=0xfffff801071785f0, uio=0xfffffe027869a9a8, top=0xfffff802c6a2b000, control=<optimized out>, flags=0, 
    td=0xfffffe0277d02300) at /afedorov/freebsd-develop/sys/kern/uipc_socket.c:1650
#15 0xffffffff80c67b36 in sosend (so=<unavailable>, addr=<unavailable>, uio=<unavailable>, top=<unavailable>, control=0x0, flags=<unavailable>, td=0xfffffe0277d02300)
    at /afedorov/freebsd-develop/sys/kern/uipc_socket.c:1705
#16 0xffffffff80c6e98d in kern_sendit (td=<optimized out>, s=3, mp=0xfffffe027869aa80, flags=0, control=<optimized out>, segflg=UIO_USERSPACE)
    at /afedorov/freebsd-develop/sys/kern/uipc_syscalls.c:798
#17 0xffffffff80c6ed35 in sendit (td=0xfffffe0277d02300, s=<optimized out>, mp=0xfffffe027869aa80, flags=<optimized out>) at /afedorov/freebsd-develop/sys/kern/uipc_syscalls.c:723
#18 0xffffffff80c6eb4d in sys_sendto (td=<unavailable>, uap=<optimized out>) at /afedorov/freebsd-develop/sys/kern/uipc_syscalls.c:841
#19 0xffffffff8106d8f8 in syscallenter (td=<optimized out>) at /afedorov/freebsd-develop/sys/amd64/amd64/../../kern/subr_syscall.c:162
#20 amd64_syscall (td=0xfffffe0277d02300, traced=0) at /afedorov/freebsd-develop/sys/amd64/amd64/trap.c:1162
#21 <signal handler called>
#22 0x0000000800475c3a in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffffffce08
(kgdb)
Comment 1 Hans Petter Selasky freebsd_committer 2020-02-20 10:08:22 UTC
The netgraph code might need to be refactored. Allocating a network interface inside the send path seems risky.
Comment 2 Aleksandr Fedorov 2020-02-20 13:59:49 UTC
Yes, it seems that netgraph needs some love.

In this case, it is ngc_send() that enters the epoch, so this is the control path, not the data path. The data path enters the epoch from ngthread(), from ng_callout_trampoline(), and in individual nodes that really need it:
http://bxr.su/FreeBSD/sys/netgraph/ng_base.c#3423
http://bxr.su/FreeBSD/sys/netgraph/ng_base.c#3778
http://bxr.su/FreeBSD/sys/netgraph/ng_device.c#475

A quick search (http://bxr.su/search?q=NET_EPOCH_ENTER&defs=&refs=&path=sys%2Fnetgraph&project=FreeBSD) finds these other call sites:

1. http://bxr.su/FreeBSD/sys/netgraph/ng_ip_input.c#131 - already in epoch.
2. http://bxr.su/FreeBSD/sys/netgraph/ng_ether.c#601 - already in epoch.
3. http://bxr.su/FreeBSD/sys/netgraph/ng_ether.c#743 - already in epoch.
4. http://bxr.su/FreeBSD/sys/netgraph/ng_eiface.c#517 - already in epoch.
5. http://bxr.su/FreeBSD/sys/netgraph/ng_iface.c#735 - already in epoch.

Do we really need to enter the epoch from the control path?
http://bxr.su/FreeBSD/sys/netgraph/ng_socket.c#341
Comment 3 Aleksandr Fedorov 2020-02-20 15:03:44 UTC
Sorry, I'm new to epoch(9). Can we sleep(9) after NET_EPOCH_ENTER? Can a call to NET_EPOCH_ENTER be recursive?
Comment 4 Hans Petter Selasky freebsd_committer 2020-02-20 15:06:16 UTC
No. An epoch(9) section has properties similar to holding a mutex between mtx_lock() and mtx_unlock(): you cannot sleep(9) inside an epoch(9) section.

--HPS
Comment 5 Aleksandr Fedorov 2020-02-20 16:21:18 UTC
Thank you for the clarification.

Then, if you call # ngctl mkpeer . [NODE_TYPE] [SRC_HOOK] [DST_HOOK]
it goes through ngc_send(), which enters the epoch: http://bxr.su/FreeBSD/sys/netgraph/ng_socket.c#341

It then executes the node constructor, but most node constructors call malloc(..., M_WAITOK) - http://bxr.su/FreeBSD/sys/netgraph/ng_eiface.c#391

As I understand it, this is incorrect.
Comment 6 Hans Petter Selasky freebsd_committer 2020-02-20 16:31:52 UTC
Yes, that is correct.

--HPS
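
The failure chain described in the comments above can be illustrated with a small userland model. This is only a sketch: every model_* name is hypothetical and is not a FreeBSD kernel API, and the model assumes that the only thing that matters is whether the calling thread is inside a read-side section when a blocking wait on the same epoch is attempted. It returns false at the point where the kernel in this report panicked.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userland model; none of these names are kernel APIs. */
struct model_thread {
	int epoch_depth;	/* nesting depth of read-side sections */
};

/* Stand-in for entering a read-side section (cf. NET_EPOCH_ENTER). */
static void
model_epoch_enter(struct model_thread *td)
{
	td->epoch_depth++;
}

/* Stand-in for leaving a read-side section (cf. NET_EPOCH_EXIT). */
static void
model_epoch_exit(struct model_thread *td)
{
	assert(td->epoch_depth > 0);
	td->epoch_depth--;
}

/*
 * Stand-in for a blocking wait on the same epoch
 * (cf. epoch_wait_preempt()).  Returns false where the kernel
 * in this report panicked: the caller is still inside a section.
 */
static bool
model_epoch_wait(const struct model_thread *td)
{
	return (td->epoch_depth == 0);
}

/* Stand-in for a node constructor that may need to block. */
static bool
model_constructor(const struct model_thread *td)
{
	return (model_epoch_wait(td));
}

/*
 * Stand-in for a control-socket send.  enter_epoch selects whether
 * the send wraps the constructor call in a read-side section.
 */
static bool
model_control_send(struct model_thread *td, bool enter_epoch)
{
	bool ok;

	if (enter_epoch)
		model_epoch_enter(td);
	ok = model_constructor(td);
	if (enter_epoch)
		model_epoch_exit(td);
	return (ok);
}
```

In this model, model_control_send(&td, true) fails because the constructor tries to wait while the section is still open, while model_control_send(&td, false) succeeds. The second call corresponds to a control path that does not enter the epoch at all.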
Comment 7 commit-hook freebsd_committer 2020-02-21 04:11:00 UTC
A commit references this bug:

Author: glebius
Date: Fri Feb 21 04:10:42 UTC 2020
New revision: 358193
URL: https://svnweb.freebsd.org/changeset/base/358193

Log:
  Revert one half of previous change r357558.  Don't enter the epoch on
  sends to control socket.  Control socket messages can run constructors
  of nodes and other stuff that is allowed to M_WAITOK.

  PR:		244241

Changes:
  head/sys/netgraph/ng_socket.c
Comment 8 Gleb Smirnoff freebsd_committer 2020-02-21 04:19:03 UTC
Sorry for the regression. Head should be good now.