Bug 220076 - [patch] [panic] [netgraph] repeatable kernel panic due to a race in ng_iface(4)
Summary: [patch] [panic] [netgraph] repeatable kernel panic due to a race in ng_iface(4)
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 11.0-STABLE
Hardware: Any Any
: --- Affects Some People
Assignee: Eugene Grosbein
URL:
Keywords: patch
: 171711 220077 (view as bug list)
Depends on:
Blocks:
 
Reported: 2017-06-17 08:44 UTC by Eugene Grosbein
Modified: 2017-10-01 19:43 UTC (History)
3 users (show)

See Also:
eugen: mfc-stable11+
eugen: mfc-stable10+


Attachments
protect ng_iface private data (2.26 KB, patch)
2017-06-17 08:44 UTC, Eugene Grosbein
no flags Details | Diff
protect ng_iface private data with rmlock (2.77 KB, patch)
2017-06-19 12:22 UTC, Eugene Grosbein
no flags Details | Diff
protect ng_iface private data with rmlock (2.82 KB, patch)
2017-06-19 12:54 UTC, Eugene Grosbein
no flags Details | Diff
protect ng_iface private data with rmlock (3.98 KB, patch)
2017-07-06 11:17 UTC, Eugene Grosbein
no flags Details | Diff
protect ng_iface private data with rmlock (4.12 KB, patch)
2017-09-28 11:20 UTC, Eugene Grosbein
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Eugene Grosbein freebsd_committer freebsd_triage 2017-06-17 08:44:42 UTC
Created attachment 183566 [details]
protect ng_iface private data

I observe repeatable panics at netgraph level while doing stress test
for net/mpd5 daemon under stable/11 r317184. It connects, uses and disconnects lots of ngXX interfaces and corresponding netgraph nodes and hooks.

Crashdump points to ng_iface node which private data - set of hooks - may be modified in respond to userland request while another kernel thread sends data over hook being disconnected. Here is a scenario:

1. mpd runs its BundNcpsLeave() procedure for an interface calling NgSendMsg(csock, path, NGM_GENERIC_COOKIE, NGM_RMHOOK, &rm, sizeof(rm)) that leads to libnetgraph's NgDeliverMsg() and sendto() system call for AF_NETGRAPH.

The kernel reponds with ng_findhook->ng_destroy_hook->NG_HOOK_UNREF (_NG_HOOK_UNREF/ng_unref_hook)->NG_FREE_HOOK: free((hook), M_NETGRAPH_HOOK).

2. In parallel, userland process like ftpd sends some data over IPv4 socket to corresponding interface being up and running. It may utilize hook being freed same time by another kernel thread that leads to:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0xffffffff8097f249
stack pointer           = 0x28:0xfffffe0239542ec0
frame pointer           = 0x28:0xfffffe0239542f00
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 28999 (ftpd)
trap number             = 9
panic: general protection fault
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2c/frame 0xfffffe0239542840
kdb_backtrace() at kdb_backtrace+0x53/frame 0xfffffe0239542910
vpanic() at vpanic+0x249/frame 0xfffffe02395429e0
kproc_shutdown() at kproc_shutdown/frame 0xfffffe0239542a40
trap_fatal() at trap_fatal+0x60a/frame 0xfffffe0239542b70
trap() at trap+0x97c/frame 0xfffffe0239542dd0
trap_check() at trap_check+0x15/frame 0xfffffe0239542df0
calltrap() at calltrap+0x8/frame 0xfffffe0239542df0
--- trap 0x9, rip = 0xffffffff8097f249, rsp = 0xfffffe0239542ec0, rbp = 0xfffffe0239542f00 ---
ng_address_hook() at ng_address_hook+0x59/frame 0xfffffe0239542f00
ng_iface_send() at ng_iface_send+0x108/frame 0xfffffe0239542f90
ng_iface_output() at ng_iface_output+0x447/frame 0xfffffe0239543060
ip_output() at ip_output+0x1864/frame 0xfffffe0239543300
tcp_output() at tcp_output+0x2602/frame 0xfffffe02395436a0
tcp_disconnect() at tcp_disconnect+0x18e/frame 0xfffffe02395436e0
tcp_usr_disconnect() at tcp_usr_disconnect+0xe6/frame 0xfffffe0239543710
sodisconnect() at sodisconnect+0x62/frame 0xfffffe0239543740
soclose() at soclose+0x95/frame 0xfffffe02395437b0
soo_close() at soo_close+0x4d/frame 0xfffffe02395437e0
fo_close() at fo_close+0x31/frame 0xfffffe0239543810
_fdrop() at _fdrop+0x46/frame 0xfffffe0239543840
closef() at closef+0x2d7/frame 0xfffffe02395438f0
closefp() at closefp+0xde/frame 0xfffffe0239543940
kern_close() at kern_close+0xe7/frame 0xfffffe0239543990
sys_close() at sys_close+0x1f/frame 0xfffffe02395439b0
syscallenter() at syscallenter+0x4ff/frame 0xfffffe0239543a80
amd64_syscall() at amd64_syscall+0x2a/frame 0xfffffe0239543bb0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0239543bb0
--- syscall (6, FreeBSD ELF64, sys_close), rip = 0x801a4033a, rsp = 0x7fffffffd0a8, rbp = 0x7fffffffd0d0 ---
Uptime: 2h11m5s
Dumping 544 out of 8156 MB:..3%..12%..21%..33%..42%..53%..62%..71%..83%..92%

Reading symbols from /boot/modules/geom_mirror.ko...done.
Loaded symbols for /boot/modules/geom_mirror.ko
Reading symbols from /boot/modules/accf_http.ko...done.
Loaded symbols for /boot/modules/accf_http.ko
Reading symbols from /boot/modules/nvidia.ko...done.
Loaded symbols for /boot/modules/nvidia.ko
Reading symbols from /boot/modules/vboxdrv.ko...done.
Loaded symbols for /boot/modules/vboxdrv.ko
Reading symbols from /boot/modules/mmc.ko...done.
Loaded symbols for /boot/modules/mmc.ko
Reading symbols from /boot/modules/mmcsd.ko...done.
Loaded symbols for /boot/modules/mmcsd.ko
Reading symbols from /boot/modules/sdhci.ko...done.
Loaded symbols for /boot/modules/sdhci.ko
Reading symbols from /boot/modules/h_ertt.ko...done.
Loaded symbols for /boot/modules/h_ertt.ko
Reading symbols from /boot/modules/cc_chd.ko...done.
Loaded symbols for /boot/modules/cc_chd.ko
Reading symbols from /boot/modules/geom_sched.ko...done.
Loaded symbols for /boot/modules/geom_sched.ko
Reading symbols from /boot/modules/gsched_rr.ko...done.
Loaded symbols for /boot/modules/gsched_rr.ko
Reading symbols from /boot/modules/vboxnetflt.ko...done.
Loaded symbols for /boot/modules/vboxnetflt.ko
Reading symbols from /boot/modules/vboxnetadp.ko...done.
Loaded symbols for /boot/modules/vboxnetadp.ko
Reading symbols from /boot/modules/nullfs.ko...done.
Loaded symbols for /boot/modules/nullfs.ko
Reading symbols from /usr/local/modules/rtc.ko...done.
Loaded symbols for /usr/local/modules/rtc.ko
#0  doadump (textdump=1) at /data2/src/sys/kern/kern_shutdown.c:298
298             dumptid = curthread->td_tid;
(kgdb) bt
#0  doadump (textdump=1) at /data2/src/sys/kern/kern_shutdown.c:298
#1  0xffffffff807a0828 in kern_reboot (howto=260) at /data2/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff807a125f in vpanic (fmt=0xffffffff80cf5311 "%s", ap=0xfffffe0239542a20)
    at /data2/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff807a12d0 in panic (fmt=0xffffffff80cf5311 "%s") at /data2/src/sys/kern/kern_shutdown.c:690
#4  0xffffffff80c06a0a in trap_fatal (frame=0xfffffe0239542e00, eva=0) at /data2/src/sys/amd64/amd64/trap.c:801
#5  0xffffffff80c0604c in trap (frame=0xfffffe0239542e00) at /data2/src/sys/amd64/amd64/trap.c:549
#6  0xffffffff80c07085 in trap_check (frame=0xfffffe0239542e00) at /data2/src/sys/amd64/amd64/trap.c:602
#7  0xffffffff80bdeba3 in calltrap () at /data2/src/sys/amd64/amd64/exception.S:236
#8  0xffffffff8097f249 in ng_address_hook (here=0x0, item=0xfffff8011ddd1f00, hook=0xfffff801f8232300, retaddr=0)
    at /data2/src/sys/netgraph/ng_base.c:3586
#9  0xffffffff80986548 in ng_iface_send (ifp=0xfffff801f8ef2800, m=0xfffff801f8526100, sa=2 '\002')
    at /data2/src/sys/netgraph/ng_iface.c:451
#10 0xffffffff80985c97 in ng_iface_output (ifp=0xfffff801f8ef2800, m=0xfffff801f8526100, dst=0xfffff8011db98720, 
    ro=0xfffff8011db98700) at /data2/src/sys/netgraph/ng_iface.c:386
#11 0xffffffff809cae14 in ip_output (m=0xfffff801f8526100, opt=0x0, ro=0xfffff8011db98700, flags=0, imo=0x0, 
    inp=0xfffff8011db98570) at /data2/src/sys/netinet/ip_output.c:655
#12 0xffffffff809e08d2 in tcp_output (tp=0xfffff801f8258410) at /data2/src/sys/netinet/tcp_output.c:1446
#13 0xffffffff809f5f4e in tcp_disconnect (tp=0xfffff801f8258410) at /data2/src/sys/netinet/tcp_usrreq.c:1946
#14 0xffffffff809f29a6 in tcp_usr_disconnect (so=0xfffff8011d8de360) at /data2/src/sys/netinet/tcp_usrreq.c:674
#15 0xffffffff80884072 in sodisconnect (so=0xfffff8011d8de360) at /data2/src/sys/kern/uipc_socket.c:1051
#16 0xffffffff808839b5 in soclose (so=0xfffff8011d8de360) at /data2/src/sys/kern/uipc_socket.c:869
#17 0xffffffff8084d67d in soo_close (fp=0xfffff8011df23b40, td=0xfffff801f8c5d000) at /data2/src/sys/kern/sys_socket.c:334
#18 0xffffffff8072bee1 in fo_close (fp=0xfffff8011df23b40, td=0xfffff801f8c5d000) at file.h:346
#19 0xffffffff80726b86 in _fdrop (fp=0xfffff8011df23b40, td=0xfffff801f8c5d000) at /data2/src/sys/kern/kern_descrip.c:2849
#20 0xffffffff8072b1f7 in closef (fp=0xfffff8011df23b40, td=0xfffff801f8c5d000) at /data2/src/sys/kern/kern_descrip.c:2430
#21 0xffffffff8072768e in closefp (fdp=0xfffff80007104000, fd=6, fp=0xfffff8011df23b40, td=0xfffff801f8c5d000, holdleaders=0)
    at /data2/src/sys/kern/kern_descrip.c:1191
#22 0xffffffff80728417 in kern_close (td=0xfffff801f8c5d000, fd=6) at /data2/src/sys/kern/kern_descrip.c:1239
#23 0xffffffff8072831f in sys_close (td=0xfffff801f8c5d000, uap=0xfffffe0239543b58)
    at /data2/src/sys/kern/kern_descrip.c:1218
#24 0xffffffff80c07b7f in syscallenter (td=0xfffff801f8c5d000, sa=0xfffffe0239543b48) at subr_syscall.c:135
#25 0xffffffff80c0741a in amd64_syscall (td=0xfffff801f8c5d000, traced=0) at /data2/src/sys/amd64/amd64/trap.c:902
#26 0xffffffff80bdee8b in Xfast_syscall () at /data2/src/sys/amd64/amd64/exception.S:396
#27 0x0000000801a4033a in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language:  auto; currently minimal
(kgdb) frame 8
#8  0xffffffff8097f249 in ng_address_hook (here=0x0, item=0xfffff8011ddd1f00, hook=0xfffff801f8232300, retaddr=0)
    at /data2/src/sys/netgraph/ng_base.c:3586
3586                NG_HOOK_NOT_VALID(peer = NG_HOOK_PEER(hook)) ||
(kgdb) l
3581             * that the peer is still connected (even if invalid,) we know
3582             * that the peer node is present, though maybe invalid.
3583             */
3584            TOPOLOGY_RLOCK();
3585            if ((hook == NULL) || NG_HOOK_NOT_VALID(hook) ||
3586                NG_HOOK_NOT_VALID(peer = NG_HOOK_PEER(hook)) ||
3587                NG_NODE_NOT_VALID(peernode = NG_PEER_NODE(hook))) {
3588                    NG_FREE_ITEM(item);
3589                    TRAP_ERROR();
3590                    TOPOLOGY_RUNLOCK();
(kgdb) p *hook
$1 = {
  hk_name = 0xfffff801f8232300 "ÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞP¯\003\201ÿÿÿÿÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­ÞÞÀ­Þ"..., 
  hk_private = 0xdeadc0dedeadc0de, hk_flags = -559038242, hk_type = -559038242, hk_peer = 0xdeadc0dedeadc0de, 
  hk_node = 0xdeadc0dedeadc0de, hk_hooks = {le_next = 0xdeadc0dedeadc0de, le_prev = 0xdeadc0dedeadc0de}, 
  hk_rcvmsg = 0xdeadc0dedeadc0de, hk_rcvdata = 0xdeadc0dedeadc0de, hk_refs = -559038242}
(kgdb) 

Attached patch introduces per-node rwlock for ng_iface to protect usage of its private data while it is being modified. Without the patch, my stress test for mpd procudes this panic in short time. With patch applied, it was running over 11 hours non-stop and no panics.
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2017-06-17 14:32:54 UTC
*** Bug 220077 has been marked as a duplicate of this bug. ***
Comment 2 Cassiano Peixoto 2017-06-19 11:51:13 UTC
Eugene,

I already had this issue about a year ago. As described here: 
https://sourceforge.net/p/mpd/discussion/44693/thread/d9af947a/

Should i apply this patch as well or wait?
Comment 3 Eugene Grosbein freebsd_committer freebsd_triage 2017-06-19 12:22:41 UTC
Created attachment 183625 [details]
protect ng_iface private data with rmlock

Alternative version of the patch using rmlock instead of rwlock for performance.
Comment 4 Eugene Grosbein freebsd_committer freebsd_triage 2017-06-19 12:24:34 UTC
(In reply to Cassiano Peixoto from comment #2)

Yes, you should. Better try newer version of this patch I just attached. It should impose less locking overhead than first version providing better performance.
Comment 5 Eugene Grosbein freebsd_committer freebsd_triage 2017-06-19 12:54:42 UTC
Created attachment 183627 [details]
protect ng_iface private data with rmlock

Avoid lock leak in the ng_iface_newhook in case of EISCONN.
Comment 6 Cassiano Peixoto 2017-06-19 20:09:39 UTC
(In reply to Eugene Grosbein from comment #4)
Ok I'll do it. Do i need just to recompile the kernel to apply the changes?
Comment 7 Eugene Grosbein freebsd_committer freebsd_triage 2017-06-19 20:46:23 UTC
(In reply to Cassiano Peixoto from comment #6)

Yes, if you include NETGRAPH into your kernel statically. Otherwise, rebuild ng_iface.ko kernel module only.
Comment 8 Eugene Grosbein freebsd_committer freebsd_triage 2017-07-06 11:17:34 UTC
Created attachment 184119 [details]
protect ng_iface private data with rmlock

Use reader lock to additionally protect access to ng_iface private data through `findhook' method to avoid usage of hook being removed by another kernel thread in parallel.
Comment 9 Eugene Grosbein freebsd_committer freebsd_triage 2017-09-19 17:56:42 UTC
*** Bug 171711 has been marked as a duplicate of this bug. ***
Comment 10 Eugene Grosbein freebsd_committer freebsd_triage 2017-09-20 20:44:59 UTC
My PR.
Comment 11 commit-hook freebsd_committer freebsd_triage 2017-09-21 20:17:08 UTC
A commit references this bug:

Author: eugen
Date: Thu Sep 21 20:16:11 UTC 2017
New revision: 323873
URL: https://svnweb.freebsd.org/changeset/base/323873

Log:
  Unprotected modification of ng_iface(4) private data leads to kernel panic.
  Fix a race with per-node read-mostly lock and refcounting for a hook.

  PR:			220076
  Tested by:		peixoto.cassiano
  Approved by:		avg (mentor), mav (mentor)
  MFC after:		1 week
  Relnotes:		yes
  Differential Revision:	https://reviews.freebsd.org/D12435

Changes:
  head/sys/netgraph/ng_iface.c
Comment 12 Eugene Grosbein freebsd_committer freebsd_triage 2017-09-28 11:20:01 UTC
Created attachment 186781 [details]
protect ng_iface private data with rmlock

Include <sys/lock.h> to not break build for stable/10.
Comment 13 commit-hook freebsd_committer freebsd_triage 2017-09-28 11:27:03 UTC
A commit references this bug:

Author: eugen
Date: Thu Sep 28 11:26:38 UTC 2017
New revision: 324081
URL: https://svnweb.freebsd.org/changeset/base/324081

Log:
  Correction after r323873: #include <sys/lock.h> in addition to <sys/rmlock.h>

  PR:		220076
  Approved by:	mav (mentor)
  MFC after:	3 days

Changes:
  head/sys/netgraph/ng_iface.c
Comment 14 commit-hook freebsd_committer freebsd_triage 2017-10-01 19:40:17 UTC
A commit references this bug:

Author: eugen
Date: Sun Oct  1 19:39:27 UTC 2017
New revision: 324175
URL: https://svnweb.freebsd.org/changeset/base/324175

Log:
  MFC r323873, r324081: Unprotected modification of ng_iface(4)
  private data leads to kernel panic. Fix a race with per-node
  read-mostly lock and refcounting for a hook.

  PR:			220076
  Tested by:		peixoto.cassiano
  Approved by:		mav (mentor)
  Relnotes:		yes
  Differential Revision:	https://reviews.freebsd.org/D12435

Changes:
_U  stable/11/
  stable/11/sys/netgraph/ng_iface.c
Comment 15 commit-hook freebsd_committer freebsd_triage 2017-10-01 19:41:24 UTC
A commit references this bug:

Author: eugen
Date: Sun Oct  1 19:40:29 UTC 2017
New revision: 324176
URL: https://svnweb.freebsd.org/changeset/base/324176

Log:
  MFC r323873, r324081: Unprotected modification of ng_iface(4)
  private data leads to kernel panic. Fix a race with per-node
  read-mostly lock and refcounting for a hook.

  PR:			220076
  Tested by:		peixoto.cassiano
  Approved by:		mav (mentor)
  Relnotes:		yes
  Differential Revision:	https://reviews.freebsd.org/D12435

Changes:
_U  stable/10/
  stable/10/sys/netgraph/ng_iface.c