Bug 253960

Summary: panic when destroying a vnet with a wg interface
Product: Base System
Component: kern
Reporter: Mark Johnston <markj>
Assignee: Mark Johnston <markj>
Status: Closed FIXED
Severity: Affects Only Me
CC: jason, kevans
Priority: ---
Version: Unspecified
Hardware: Any
OS: Any

Description Mark Johnston freebsd_committer 2021-03-02 17:36:55 UTC
# kldload if_wg
# jail -c name=test persist vnet path=/
# jexec test ifconfig wg0 create listen-port 54321 private-key `openssl rand -base64 32`
# jail -r test
panic: vnet_if_uninit:468 tailq &V_ifnet=0xfffffe01ae303070 not empty
cpuid = 20
time = 1614706375
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe01aa80e930
vpanic() at vpanic+0x181/frame 0xfffffe01aa80e980
panic() at panic+0x43/frame 0xfffffe01aa80e9e0
vnet_if_uninit() at vnet_if_uninit+0x7b/frame 0xfffffe01aa80e9f0
vnet_destroy() at vnet_destroy+0x160/frame 0xfffffe01aa80ea20
prison_deref() at prison_deref+0x96c/frame 0xfffffe01aa80ea90
sys_jail_remove() at sys_jail_remove+0x119/frame 0xfffffe01aa80eac0
amd64_syscall() at amd64_syscall+0x12e/frame 0xfffffe01aa80ebf0
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe01aa80ebf0
--- syscall (508, FreeBSD ELF64, sys_jail_remove), rip = 0x8003c507a, rsp = 0x7fffffffe858, rbp = 0x7fffffffe8e0 ---
Comment 1 Mark Johnston freebsd_committer 2021-03-05 21:10:09 UTC
The iflib pseudo cloner subsystem is supposed to tear down interfaces when a vnet is destroyed, but it failed to do so.  I have a working patch but right now it's asymmetric in that iflib consumers have to virtualize the pseudo cloner while iflib handles destruction.  It'd be nice to have iflib handle all of it but I don't quite see how yet.
Comment 2 Mark Johnston freebsd_committer 2021-03-10 00:41:44 UTC
Urgh.  The iflib cloner creates a device for each interface and expects to be able to use the ifnet's unit number as the device's unit number, but ifnet unit numbers are virtualized while device unit numbers are not.
Comment 3 Mark Johnston freebsd_committer 2021-03-10 00:51:27 UTC
I guess there's no real reason the device unit number and ifnet unit number have to line up, right...?
Comment 4 Kyle Evans freebsd_committer 2021-03-10 01:54:13 UTC
(In reply to Mark Johnston from comment #3)

They don't have to match, no, but we'll definitely want to break out an ifname identifier under the device for correlation purposes. I think it's OK to proceed under that assumption, because it'd be more jarring to enforce unique interface numbering across vnets.
Comment 5 Mark Johnston freebsd_committer 2021-03-10 02:34:41 UTC
(In reply to Kyle Evans from comment #4)
As far as I understand, the device exists solely to satisfy iflib's internal interfaces. If that's true, the longer-term solution is to refactor iflib so that the device is no longer necessary for software interfaces.
Comment 6 Mark Johnston freebsd_committer 2021-03-10 16:20:52 UTC
iflib expects the device and ifnet unit numbers to line up, so I think the only option for 13.0 is to add a global unit number allocator to iflib and just live with the fact that unit numbers have to be unique across all vnets.
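For illustration only, a global allocator of the kind described above could be sketched as follows. This is a minimal user-space sketch, not iflib's code and not the committed fix: the names `pseudo_unit_alloc`, `pseudo_unit_free`, `pseudo_unit_map` and the 64-unit limit are all hypothetical, and a kernel version would also need locking around the shared map.

```c
#include <assert.h>

/* Hypothetical limit chosen so the map fits in one 64-bit word. */
#define PSEUDO_MAX_UNITS 64

/*
 * One bit per unit number. The map is a single global rather than a
 * per-vnet variable, so a unit handed out in one vnet cannot be handed
 * out again in another until it is freed.
 */
static unsigned long long pseudo_unit_map;

/* Return the lowest free unit number, or -1 if all units are in use. */
static int
pseudo_unit_alloc(void)
{
	for (int unit = 0; unit < PSEUDO_MAX_UNITS; unit++) {
		unsigned long long bit = 1ULL << unit;

		if ((pseudo_unit_map & bit) == 0) {
			pseudo_unit_map |= bit;
			return (unit);
		}
	}
	return (-1);
}

/* Release a unit so that a later create, in any vnet, can reuse it. */
static void
pseudo_unit_free(int unit)
{
	assert(unit >= 0 && unit < PSEUDO_MAX_UNITS);
	pseudo_unit_map &= ~(1ULL << unit);
}
```

With this scheme, creating an interface in the host and then another in a jail would yield units 0 and 1 rather than two interfaces numbered 0, which is the visible cost of keeping device and ifnet unit numbers aligned.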
Comment 7 Kyle Evans freebsd_committer 2021-03-10 16:26:54 UTC
(In reply to Mark Johnston from comment #6)

OK, that's a reasonable compromise; we should slip a note to that effect into the BUGS section of wg(4).
Comment 8 Mark Johnston freebsd_committer 2021-03-15 05:24:08 UTC
Should be fixed in head by https://cgit.freebsd.org/src/commit/?id=74ae3f3e33b810248da19004c58b3581cd367843. Thanks, Kyle!
Comment 9 Jason A. Donenfeld 2021-04-17 23:51:23 UTC
This should be fixed in net/wireguard-kmod, so this bug report can be closed.