Bug 216304 - Adding xn0 to bridge0 causes kernel panic
Summary: Adding xn0 to bridge0 causes kernel panic
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 11.0-RELEASE
Hardware: amd64 Any
: --- Affects Some People
Assignee: Kristof Provost
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-01-19 23:09 UTC by mark
Modified: 2017-02-01 21:50 UTC (History)
4 users (show)

See Also:
kp: mfc-stable11+
kp: mfc-stable10+


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description mark 2017-01-19 23:09:21 UTC
We've encountered kernel panic in FreeBSD 11-RELEASE when attempting to add xn0 as a member of bridge0. The kernel panic happens immediately after the command to add xn0 to bridge0 is issued. Oddly, the kernel panic doesn't occur after upgrading in-place to 11.0-RELEASE from 10.3-RELEASE and proceeding to add xn0 to bridge0. This seems to only be an issue with fresh 11.0-RELEASE installs. 

All installs we've seen this issue on are virtual machines running on Xen 3.4.4 hypervisors. The virtual machine we upgraded from 10.3 to 11.0 (where adding xn0 to bridge0 works fine) is also on a Xen 3.4.4 hypervisor.

Output of "uname -r" on 11.0-RELEASE vm with kernel panic issue:
11.0-RELEASE-p2

Output of "uname -r" on 11.0-RELEASE vm upgraded from 10.3-RELEASE without kernel panic issue:
11.0-RELEASE-p2

Commands used on both servers to add bridge0 and then add xn0 to bridge0:
ifconfig bridge create
ifconfig bridge0 addm xn0

Output of "kgdb kernel.debug /var/crash/vmcore.0":

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
Sleeping thread (tid 100076, pid 831) owns a non-sleepable lock
KDB: stack backtrace of thread 100076:
#0 0xffffffff80ae46e2 at mi_switch+0xd2
#1 0xffffffff80b3279a at sleepq_timedwait+0x3a
#2 0xffffffff80ae4091 at _sleep+0x281
#3 0xffffffff8096e9c8 at xn_ioctl+0x5d8
#4 0xffffffff822194b3 at bridge_ioctl_add+0x4b3
#5 0xffffffff8221af8f at bridge_ioctl+0x29f
#6 0xffffffff80bdcbec at ifioctl+0xfbc
#7 0xffffffff80b41ab4 at kern_ioctl+0x2d4
#8 0xffffffff80b41771 at sys_ioctl+0x171
#9 0xffffffff80fa168e at amd64_syscall+0x4ce
#10 0xffffffff80f8442b at Xfast_syscall+0xfb
panic: sleeping thread
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff80b24077 at kdb_backtrace+0x67
#1 0xffffffff80ad93e2 at vpanic+0x182
#2 0xffffffff80ad9253 at panic+0x43
#3 0xffffffff80b39a99 at propagate_priority+0x299
#4 0xffffffff80b3a59f at turnstile_wait+0x3ef
#5 0xffffffff80ab493d at __mtx_lock_sleep+0x13d
#6 0xffffffff8221d4c5 at bridge_output+0x75
#7 0xffffffff80be286e at ether_output+0x68e
#8 0xffffffff80c62fe7 at ip_output+0x16c7
#9 0xffffffff80cf593e at tcp_output+0x191e
#10 0xffffffff80d01396 at tcp_timer_rexmt+0x526
#11 0xffffffff80af325a at softclock_call_cc+0x18a
#12 0xffffffff80af37d4 at softclock+0x94
#13 0xffffffff80a9340f at intr_event_execute_handlers+0x20f
#14 0xffffffff80a93676 at ithread_loop+0xc6
#15 0xffffffff80a90055 at fork_exit+0x85
#16 0xffffffff80f8467e at fork_trampoline+0xe
Uptime: 2m27s
Dumping 85 out of 479 MB:..19%..38%..57%..76%..94%

Reading symbols from /boot/kernel/if_bridge.ko...done.
Loaded symbols for /boot/kernel/if_bridge.ko
Reading symbols from /boot/kernel/bridgestp.ko...done.
Loaded symbols for /boot/kernel/bridgestp.ko
#0  doadump (textdump=<value optimized out>) at pcpu.h:221
221		__asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) backtrace
#0  doadump (textdump=<value optimized out>) at pcpu.h:221
#1  0xffffffff80ad8e69 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff80ad941b in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff80ad9253 in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:690
#4  0xffffffff80b39a99 in propagate_priority (td=<value optimized out>) at /usr/src/sys/kern/subr_turnstile.c:226
#5  0xffffffff80b3a59f in turnstile_wait (ts=<value optimized out>, owner=<value optimized out>, queue=<value optimized out>)
    at /usr/src/sys/kern/subr_turnstile.c:742
#6  0xffffffff80ab493d in __mtx_lock_sleep (c=<value optimized out>, tid=18446735277668753408, opts=<value optimized out>, 
    file=<value optimized out>, line=<value optimized out>) at /usr/src/sys/kern/kern_mutex.c:583
#7  0xffffffff8221d4c5 in bridge_output () from /boot/kernel/if_bridge.ko
#8  0xffffffff80be286e in ether_output (ifp=<value optimized out>, m=<value optimized out>, dst=0xfffff800033bd9b0, ro=<value optimized out>)
    at /usr/src/sys/net/if_ethersubr.c:407
#9  0xffffffff80c62fe7 in ip_output (m=0x0, opt=<value optimized out>, ro=<value optimized out>, flags=<value optimized out>, 
    imo=<value optimized out>, inp=<value optimized out>) at /usr/src/sys/netinet/ip_output.c:661
#10 0xffffffff80cf593e in tcp_output (tp=<value optimized out>) at /usr/src/sys/netinet/tcp_output.c:1422
#11 0xffffffff80d01396 in tcp_timer_rexmt (xtp=<value optimized out>) at /usr/src/sys/netinet/tcp_timer.c:812
#12 0xffffffff80af325a in softclock_call_cc (c=<value optimized out>, cc=<value optimized out>, direct=<value optimized out>)
    at /usr/src/sys/kern/kern_timeout.c:729
#13 0xffffffff80af37d4 in softclock (arg=<value optimized out>) at /usr/src/sys/kern/kern_timeout.c:867
#14 0xffffffff80a9340f in intr_event_execute_handlers (p=<value optimized out>, ie=<value optimized out>)
    at /usr/src/sys/kern/kern_intr.c:1262
#15 0xffffffff80a93676 in ithread_loop (arg=<value optimized out>) at /usr/src/sys/kern/kern_intr.c:1275
#16 0xffffffff80a90055 in fork_exit (callout=0xffffffff80a935b0 <ithread_loop>, arg=0xfffff800031c8be0, frame=0xfffffe002b696c00)
    at /usr/src/sys/kern/kern_fork.c:1038
#17 0xffffffff80f8467e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:611
#18 0x0000000000000000 in ?? ()
Current language:  auto; currently minimal
Comment 1 Kristof Provost freebsd_committer freebsd_triage 2017-01-20 21:47:44 UTC
Tests with WITNESS show:

xn0: performing interface reset due to feature change
uma_zalloc_arg: zone "64" with the following non-sleepable locks held:
exclusive sleep mutex if_bridge (if_bridge) r = 0 (0xfffff80003f0aa18) locked @ /usr/home/rootbsd/freebsd/sys/modules/if_bridge/../../net/if_bridge.c:826
stack backtrace:
#0 0xffffffff80a66720 at witness_debugger+0x70
#1 0xffffffff80a679c7 at witness_warn+0x3d7
#2 0xffffffff80cb00bb at uma_zalloc_arg+0x3b
#3 0xffffffff809f328c at malloc+0xfc
#4 0xffffffff80a57c45 at sbuf_new+0x95
#5 0xffffffff808eea18 at xs_rm+0x28
#6 0xffffffff808eb411 at xn_ioctl+0x211
#7 0xffffffff8221d4b3 at bridge_ioctl_add+0x4b3
#8 0xffffffff8221ef8f at bridge_ioctl+0x29f
#9 0xffffffff80ae577a at ifioctl+0x104a
#10 0xffffffff80a6ab64 at kern_ioctl+0x214
#11 0xffffffff80a6a8e1 at sys_ioctl+0x171
#12 0xffffffff80e35fd4 at amd64_syscall+0x314
#13 0xffffffff80e1c1ab at Xfast_syscall+0xfb
Comment 2 Kristof Provost freebsd_committer freebsd_triage 2017-01-20 22:18:05 UTC
I think what happens here is that the bridge code (through bridge_ioctl_add() calls the xen driver's ioctl() handler for SIOCSIFCAP, which through xn_ioctl() calls xs_rm(xenbus_get_node(dev), "feature-gso-tcp4"), which tries to compose a string with the sbuf functions, which use a M_WAITOK allocation.

That means that we can end up sleeping (because malloc(M_WAITOK)) with the bridge lock (a standard mutex) held.
That violates locking rules, by sleeping with a mutex held, so WITNESS warns us about this.

If we're unlucky enough to actually try to acquire the bridge lock from another thread (say because we want to transmit a packet) we can end up panic()ing.

It's not obvious to me how this can be fixed however. I'm cc-ing royger because he touched the xen-netfront code at some point.

Perhaps we can allocate the strings the xs_rm() needs at device initialisation time, but that would require the result of xenbus_get_node(dev) to be constant, and I don't know if that's a valid assumption.
Comment 3 Kristof Provost freebsd_committer freebsd_triage 2017-01-22 19:25:33 UTC
Proposed patch: https://reviews.freebsd.org/D9290
Comment 4 commit-hook freebsd_committer freebsd_triage 2017-01-25 21:26:27 UTC
A commit references this bug:

Author: kp
Date: Wed Jan 25 21:25:26 UTC 2017
New revision: 312782
URL: https://svnweb.freebsd.org/changeset/base/312782

Log:
  bridge: Release the bridge lock when calling bridge_set_ifcap()

  This calls ioctl() handlers for the different interfaces in the bridge.
  These handlers expect to get called in an ioctl context where it's safe
  for them to sleep. We may not sleep with the bridge lock held.

  However, we still need to protect the interface list, to ensure it
  doesn't get changed while we iterate over it.
  Use BRIDGE_XLOCK(), which prevents bridge members from being removed.
  Adding bridge members is safe, because it uses LIST_INSERT_HEAD().

  This caused panics when adding xen interfaces to a bridge.

  PR:		216304
  Reviewed by:	ae
  MFC after:	1 week
  Sponsored by:	RootBSD
  Differential Revision:	https://reviews.freebsd.org/D9290

Changes:
  head/sys/net/if_bridge.c
  head/sys/net/if_bridgevar.h
Comment 5 nvass 2017-01-26 21:06:16 UTC
Hi,

I am getting something similar and I am not sure whether it's related.

> panic: _mtx_lock_sleep: recursed on non-recursive mutex if_bridge @ /usr/src/sys/modules/if_bridge/../../net/if_bridge.c:2117
> 
> cpuid = 1
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00954a3260
> vpanic() at vpanic+0x186/frame 0xfffffe00954a32e0
> kassert_panic() at kassert_panic+0x126/frame 0xfffffe00954a3350
> __mtx_lock_sleep() at __mtx_lock_sleep+0x3e2/frame 0xfffffe00954a33d0
> __mtx_lock_flags() at __mtx_lock_flags+0x116/frame 0xfffffe00954a3430
> bridge_transmit() at bridge_transmit+0x61/frame 0xfffffe00954a3470
> ether_output() at ether_output+0x703/frame 0xfffffe00954a3510
> arprequest() at arprequest+0x413/frame 0xfffffe00954a3620
> arp_ifinit() at arp_ifinit+0x56/frame 0xfffffe00954a3660
> arp_handle_ifllchange() at arp_handle_ifllchange+0x3d/frame 0xfffffe00954a3680
> bridge_ioctl_add() at bridge_ioctl_add+0x3b4/frame 0xfffffe00954a36d0
> bridge_ioctl() at bridge_ioctl+0x29f/frame 0xfffffe00954a3770
> ifioctl() at ifioctl+0xbc8/frame 0xfffffe00954a37f0
> kern_ioctl() at kern_ioctl+0x2b0/frame 0xfffffe00954a3850
> sys_ioctl() at sys_ioctl+0x13f/frame 0xfffffe00954a3930
> amd64_syscall() at amd64_syscall+0x2f9/frame 0xfffffe00954a3ab0
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe00954a3ab0
> --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800fd5aaa, rsp = 0x7fffffffe1f8, rbp = 0x7fffffffe2a0 ---
> KDB: enter: panic
> [ thread pid 770 tid 100560 ]
> Stopped at      kdb_enter+0x3b: movq    $0,kdb_why
Comment 6 Kristof Provost freebsd_committer freebsd_triage 2017-01-26 21:10:52 UTC
(In reply to nvass from comment #5)
That looks like a different problem. Can you file a new bug with the backtrace and a description of how you reproduce this? Please cc me on it.
Comment 7 nvass 2017-01-26 21:21:52 UTC
Thanks for the quick reply!

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=216510
Comment 8 commit-hook freebsd_committer freebsd_triage 2017-02-01 20:27:45 UTC
A commit references this bug:

Author: kp
Date: Wed Feb  1 20:27:38 UTC 2017
New revision: 313050
URL: https://svnweb.freebsd.org/changeset/base/313050

Log:
  MFC 312782

  bridge: Release the bridge lock when calling bridge_set_ifcap()

  This calls ioctl() handlers for the different interfaces in the bridge.
  These handlers expect to get called in an ioctl context where it's safe
  for them to sleep. We may not sleep with the bridge lock held.

  However, we still need to protect the interface list, to ensure it
  doesn't get changed while we iterate over it.
  Use BRIDGE_XLOCK(), which prevents bridge members from being removed.
  Adding bridge members is safe, because it uses LIST_INSERT_HEAD().

  This caused panics when adding xen interfaces to a bridge.

  PR:		216304
  Reviewed by:	ae
  Sponsored by:	RootBSD
  Differential Revision:	https://reviews.freebsd.org/D9290

Changes:
_U  stable/11/
  stable/11/sys/net/if_bridge.c
  stable/11/sys/net/if_bridgevar.h
Comment 9 commit-hook freebsd_committer freebsd_triage 2017-02-01 21:45:58 UTC
A commit references this bug:

Author: kp
Date: Wed Feb  1 21:44:50 UTC 2017
New revision: 313066
URL: https://svnweb.freebsd.org/changeset/base/313066

Log:
  MFC 312782

  bridge: Release the bridge lock when calling bridge_set_ifcap()

  This calls ioctl() handlers for the different interfaces in the bridge.
  These handlers expect to get called in an ioctl context where it's safe
  for them to sleep. We may not sleep with the bridge lock held.

  However, we still need to protect the interface list, to ensure it
  doesn't get changed while we iterate over it.
  Use BRIDGE_XLOCK(), which prevents bridge members from being removed.
  Adding bridge members is safe, because it uses LIST_INSERT_HEAD().

  This caused panics when adding xen interfaces to a bridge.

  PR:		216304
  Reviewed by:	ae
  Sponsored by:	RootBSD
  Differential Revision:	https://reviews.freebsd.org/D9290

Changes:
_U  stable/10/
  stable/10/sys/net/if_bridge.c
  stable/10/sys/net/if_bridgevar.h