Bug 213015 - openvswitch and vnet jails - panic when bridge is destroyed and recreated
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 11.0-STABLE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: Andrey V. Elsukov
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-09-27 04:52 UTC by akoshibe
Modified: 2017-03-20 08:17 UTC

See Also:


Attachments
test case, two jails and bridge (962 bytes, application/x-shellscript)
2016-09-27 04:52 UTC, akoshibe
test - ignore interfaces being renamed (330 bytes, patch)
2017-02-28 08:23 UTC, akoshibe

Description akoshibe 2016-09-27 04:52:04 UTC
Created attachment 175191
test case, two jails and bridge

When I create a few jails and connect them together with an openvswitch bridge, I can fairly reliably cause a panic by tearing that bridge down and recreating another immediately after, if the previous bridge had seen traffic.

Unread portion of the kernel message buffer:
instruction pointer     = 0x20:0xffffffff80be7b9c
stack pointer           = 0x28:0xfffffe00002a8700
frame pointer           = 0x28:0xfffffe00002a8770
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 799 (handler52)
trap number             = 12
panic: page fault
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff80b26377 at kdb_backtrace+0x67
#1 0xffffffff80adae02 at vpanic+0x182
#2 0xffffffff80adac73 at panic+0x43
#3 0xffffffff80fc8d51 at trap_fatal+0x351
#4 0xffffffff80fc8f43 at trap_pfault+0x1e3
#5 0xffffffff80fc84cc at trap+0x26c
#6 0xffffffff80fab5f1 at calltrap+0x8
#7 0xffffffff80bfefff at netisr_dispatch_src+0xff
#8 0xffffffff80be7384 at ether_input+0x54
#9 0xffffffff82419f69 at tapwrite+0x139
#10 0xffffffff809873f7 at devfs_write_f+0xe7
#11 0xffffffff80b435a7 at dofilewrite+0x87
#12 0xffffffff80b43288 at kern_writev+0x68
#13 0xffffffff80b43214 at sys_write+0x84
#14 0xffffffff80fc96b8 at amd64_syscall+0x4d8
#15 0xffffffff80fab8db at Xfast_syscall+0xfb
Uptime: 2m20s
Dumping 112 out of 991 MB:..15%..29%..43%..57%..72%..86%..100%

Reading symbols from /boot/kernel/if_tap.ko...Reading symbols from /usr/lib/debug//boot/kernel/if_tap.ko.debug...done.
done.
Loaded symbols for /boot/kernel/if_tap.ko
Reading symbols from /boot/kernel/if_epair.ko...Reading symbols from /usr/lib/debug//boot/kernel/if_epair.ko.debug...done.
done.
Loaded symbols for /boot/kernel/if_epair.ko
#0  doadump (textdump=<value optimized out>) at pcpu.h:221
221             __asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) bt
#0  doadump (textdump=<value optimized out>) at pcpu.h:221
#1  0xffffffff80ada889 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff80adae3b in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff80adac73 in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:690
#4  0xffffffff80fc8d51 in trap_fatal (frame=0xfffffe00002a8650, eva=16) at /usr/src/sys/amd64/amd64/trap.c:841
#5  0xffffffff80fc8f43 in trap_pfault (frame=0xfffffe00002a8650, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:691
#6  0xffffffff80fc84cc in trap (frame=0xfffffe00002a8650) at /usr/src/sys/amd64/amd64/trap.c:442
#7  0xffffffff80fab5f1 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236
#8  0xffffffff80be7b9c in ether_nh_input (m=<value optimized out>) at /usr/src/sys/net/if_ethersubr.c:517
#9  0xffffffff80bfefff in netisr_dispatch_src (proto=5, source=<value optimized out>, m=0xfffff8009943d4d8) at /usr/src/sys/net/netisr.c:1120
#10 0xffffffff80be7384 in ether_input (ifp=<value optimized out>, m=0x0) at /usr/src/sys/net/if_ethersubr.c:759
#11 0xffffffff82419f69 in tapwrite (dev=<value optimized out>, uio=<value optimized out>, flag=<value optimized out>)
    at /usr/src/sys/modules/if_tap/../../net/if_tap.c:975
#12 0xffffffff809873f7 in devfs_write_f (fp=<value optimized out>, uio=<value optimized out>, cred=<value optimized out>, 
    flags=<value optimized out>, td=0xfffff8001b210000) at /usr/src/sys/fs/devfs/devfs_vnops.c:1759
#13 0xffffffff80b435a7 in dofilewrite (td=0xfffff8001b210000, fd=27, fp=0xfffff80003920e10, auio=0xfffffe00002a8960, 
    offset=<value optimized out>, flags=0) at file.h:311
#14 0xffffffff80b43288 in kern_writev (td=0xfffff8001b210000, fd=27, auio=0xfffffe00002a8960) at /usr/src/sys/kern/sys_generic.c:506
#15 0xffffffff80b43214 in sys_write (td=0xfffff8001b1c1800, uap=<value optimized out>) at /usr/src/sys/kern/sys_generic.c:419
#16 0xffffffff80fc96b8 in amd64_syscall (td=<value optimized out>, traced=0) at subr_syscall.c:135
#17 0xffffffff80fab8db in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396
#18 0x0000000801c1371a in ?? ()
Previous frame inner to this frame (corrupt stack?)

The kernel configuration:

include GENERIC
ident VIMAGEMOD

options VIMAGE
options DUMMYNET
options HZ=1000


Attaching a script that triggers the panic for me in about three or so runs.
Comment 1 Palle Girgensohn freebsd_committer 2016-09-27 07:22:13 UTC
I've had exactly the same problems with epair and switched to netgraph instead, which has proven rock solid.

I know there is a project to improve this area as well, but I can't remember off the top of my head who is working on it. I don't think it has hit the source tree yet, though.

For epair problems, see for example this thread:
https://forums.freebsd.org/threads/31765/

If all you want is networking running in jails, I can document the procedure we use, using netgraph (not epair). This is very solid.

Epair+vimage has always had this problem when tearing down the jail, sadly.

Palle
Comment 2 akoshibe 2016-09-27 23:11:15 UTC
(In reply to Palle Girgensohn from comment #1)

I've noticed that I won't trigger a panic if, keeping everything else the same, I omit sending traffic (e.g. the one ping in the test script) or replace openvswitch with if_bridge. Hence, I'm also wondering if it's something about what openvswitch does with its tap interface.

If there is a solid way to do jails with networking and netgraph, that is something that I would like to take a look at, and would appreciate pointers for...
Comment 3 Bjoern A. Zeeb freebsd_committer 2016-09-27 23:28:15 UTC
(In reply to Palle Girgensohn from comment #1)

That forums thread is quite old;  things should have improved for 11.
Comment 4 Bjoern A. Zeeb freebsd_committer 2016-09-27 23:30:10 UTC
(In reply to akoshibe from comment #2)

When in your shell script does the panic happen?  Do you know?   I wonder if it's before the ifconfig commands.

In general, I wonder whether OVS buffers packets and does a deferred transmit, e.g. like the netisr; in that case, on del one would have to clean up the queue.
Comment 5 akoshibe 2016-09-28 00:02:45 UTC
(In reply to Bjoern A. Zeeb from comment #4)

I suspect that the panic occurs when I create the bridge for the second time. I'm going to try to check if that's the case later today.
Comment 6 Palle Girgensohn freebsd_committer 2016-09-28 00:08:26 UTC
(In reply to Bjoern A. Zeeb from comment #3)

Mmm, indeed it is. But I haven't seen that much action around epair lately; has it really been improved enough? The described problem is identical to what we experienced back then.
Comment 7 akoshibe 2016-09-28 15:40:30 UTC
(In reply to akoshibe from comment #5)
Looking more closely, the panic occurs the first time ovs-vsctl is called in the script (after a previous uneventful run). The last lines I see in dmesg prior to a panic are:

epair0a: Ethernet address: 02:ff:50:00:03:0a
epair0b: Ethernet address: 02:ff:a0:00:05:0b
<5>epair0a: link state changed to UP
<5>epair0b: link state changed to UP
epair1a: Ethernet address: 02:ff:50:00:06:0a
epair1b: Ethernet address: 02:ff:a0:00:07:0b
<5>epair1a: link state changed to UP
<5>epair1b: link state changed to UP
<6>epair1a: permanently promiscuous mode enabled
<6>epair0a: permanently promiscuous mode enabled
tap1: Ethernet address: 00:bd:4e:02:f9:01
<5>tap1: link state changed to UP
<6>tap1: changing name to 'vbr0'
<6>vbr0: permanently promiscuous mode enabled

and current process points to ovs-vswitchd.
Comment 8 akoshibe 2017-02-28 08:23:25 UTC
Created attachment 180356
test - ignore interfaces being renamed
Comment 9 akoshibe 2017-02-28 08:24:43 UTC
I finally had the chance to dig around more, myself.

Open vSwitch seems to try to write something when a new switch is being created with ports. It sometimes does so while the tap device is in the process of being renamed, at which point ifp->if_bpf is NULL.

Preventing the departure handler from setting if_bpf to null during renames stopped the panic:


Index: net/bpf.c
===================================================================
--- net/bpf.c   (revision 313973)
+++ net/bpf.c   (working copy)
@@ -2678,6 +2678,9 @@
        struct bpf_if *bp, *bp_temp;
        int nmatched = 0;
 
+       if (ifp->if_flags & IFF_RENAMING)
+               return;
+
        BPF_LOCK();
        /*
         * Find matching entries in free list.
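
For illustration only, here is a minimal userspace sketch of the idea behind the patch above. It is not kernel code: struct model_ifnet, MODEL_IFF_RENAMING and the model_* functions are hypothetical stand-ins for struct ifnet, IFF_RENAMING and the bpf departure handler, and the real handler does more than clear one pointer. It assumes, as described above, that the input path ends up dereferencing if_bpf.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's IFF_RENAMING flag. */
#define MODEL_IFF_RENAMING 0x1

/* Hypothetical stand-in for struct bpf_if; contents do not matter here. */
struct model_bpf_if {
	int unused;
};

/* Hypothetical, heavily reduced stand-in for struct ifnet. */
struct model_ifnet {
	int			 if_flags;
	struct model_bpf_if	*if_bpf;
};

/* Departure handler without the guard: always drops the bpf attachment. */
void
model_detach_unguarded(struct model_ifnet *ifp)
{
	assert(ifp != NULL);
	ifp->if_bpf = NULL;
}

/* Departure handler with the guard: a rename is not a real departure. */
void
model_detach_guarded(struct model_ifnet *ifp)
{
	assert(ifp != NULL);
	if (ifp->if_flags & MODEL_IFF_RENAMING)
		return;
	ifp->if_bpf = NULL;
}

/*
 * Models the input path, which is assumed to dereference if_bpf.
 * Returns 1 if that dereference would be safe, 0 if it would fault.
 */
int
model_input_is_safe(const struct model_ifnet *ifp)
{
	return (ifp->if_bpf != NULL);
}
```

In this model, an interface with MODEL_IFF_RENAMING set keeps its if_bpf pointer only when it goes through model_detach_guarded(), so a write arriving in the middle of a rename still finds a valid pointer. An interface that is really departing (flag clear) loses the pointer in both variants.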
Comment 10 commit-hook freebsd_committer 2017-03-13 09:04:35 UTC
A commit references this bug:

Author: ae
Date: Mon Mar 13 09:04:10 UTC 2017
New revision: 315192
URL: https://svnweb.freebsd.org/changeset/base/315192

Log:
  Ignore ifnet renaming in the bpf ifnet departure handler.

  PR:		213015
  MFC after:	1 week

Changes:
  head/sys/net/bpf.c
Comment 11 commit-hook freebsd_committer 2017-03-20 08:11:58 UTC
A commit references this bug:

Author: ae
Date: Mon Mar 20 08:10:58 UTC 2017
New revision: 315624
URL: https://svnweb.freebsd.org/changeset/base/315624

Log:
  MFC r315192:
    Ignore ifnet renaming in the bpf ifnet departure handler.

    PR:		213015

Changes:
_U  stable/11/
  stable/11/sys/net/bpf.c
Comment 12 commit-hook freebsd_committer 2017-03-20 08:17:04 UTC
A commit references this bug:

Author: ae
Date: Mon Mar 20 08:16:05 UTC 2017
New revision: 315625
URL: https://svnweb.freebsd.org/changeset/base/315625

Log:
  MFC r315192:
    Ignore ifnet renaming in the bpf ifnet departure handler.

    PR:		213015

Changes:
_U  stable/10/
  stable/10/sys/net/bpf.c