When using if_vtnet(4) as a module and unloading it using "kldunload if_vtnet" it immediately causes a panic: Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0x80 fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff80cff56d stack pointer = 0x28:0xfffffe0096eb28a0 frame pointer = 0x28:0xfffffe0096eb28d0 code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 779 (kldunload) [ thread pid 779 tid 100058 ] Stopped at uma_zone_get_cur+0xd: movq 0x80(%r14),%rdi db> bt Tracing pid 779 tid 100058 td 0xfffff800064219a0 uma_zone_get_cur() at uma_zone_get_cur+0xd/frame 0xfffffe0096eb28d0 vtnet_modevent() at vtnet_modevent+0x51/frame 0xfffffe0096eb2900 module_unload() at module_unload+0x32/frame 0xfffffe0096eb2920 linker_file_unload() at linker_file_unload+0x24b/frame 0xfffffe0096eb2970 kern_kldunload() at kern_kldunload+0xbc/frame 0xfffffe0096eb29a0 amd64_syscall() at amd64_syscall+0x2db/frame 0xfffffe0096eb2ab0 Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0096eb2ab0 --- syscall (444, FreeBSD ELF64, sys_kldunloadf), rip = 0x80086b15a, rsp = 0x7fffffffe348, rbp = 0x7fffffffebb0 --- db>
Created attachment 170182 [details] possible fix Attaching a patch that fixes this issue for me.
That fix looked odd to me, because I did not expect vtnet_modevent() to be called twice with MOD_UNLOAD. That is the cause though, because there are two DRIVER_MODULE()s for vtnet. I think this problem was introduced with r276367. Cc-ing Andrew so he can take a look too.
It's just occurred to me that if vtnet_modevent() gets called twice when it's unloaded it probably also gets called twice when the module is loaded. That means that it leaks the first vtnet_tx_header_zone allocated with uma_zcreate().
Created attachment 170217 [details] updated patch Yeah, you're right, vtnet_modevent() is called twice, so we don't need to duplicate uma_zcreate() calls indeed.
Oh, and actually it appears to work fine if I just do: DRIVER_MODULE(vtnet, virtio_mmio, vtnet_driver, vtnet_devclass, vtnet_modevent, 0); DRIVER_MODULE(vtnet, virtio_pci, vtnet_driver, vtnet_devclass, - vtnet_modevent, 0); + 0, 0);
(In reply to Roman Bogorodskiy from comment #5) Yeah, but that's probably risky. I'm not sure the load order is guaranteed, and I'm also not sure that there's any guarantee that the probe function won't be executed until both are loaded. There's a couple other drivers with a similar setup. For example sys/dev/cxgbe/t4_main.c They've solved it with a simple counter in their mod_event function. I'd be inclined to do something similar.
(In reply to Kristof Provost from comment #6) Thanks for the pointer, I'll try to test the same approach for vtnet and upload a new patch if things go well.
Created attachment 170227 [details] patch v3 Updated the patch to use an approach similar to the one used in cxgbe. Basic testing didn't reveal any issues.
That looks correct to me, although I'm not sure if the mutex is actually required. The code calling the module load/unload function seems to do so with Giant held.
(In reply to Kristof Provost from comment #9) Ah, I think that's right: https://svnweb.freebsd.org/base/head/sys/kern/kern_module.c?revision=289111&view=markup#l260 It's not obvious why cxgbe uses it then.
I *think* cxgbe is wrong (or overly cautious). I'd propose doing it without the mutex in vtnet. You're not going to make things any worse than the currently are by not adding it.
Created attachment 170229 [details] patch v4 Uploaded v4 of the patch, without mutex. Again, basic testing didn't show any issues.
Hmm, can you verify that uma_zdestroy() still gets called? I think the order is MOD_LOAD on load, but an unload first gets MOD_QUIESCE and then MOD_UNLOAD. That'd mean we do loaded++ twice, but --loaded four times.
Created attachment 170230 [details] patch v5 Good point, I've moved out EBUSY check out of the --loader == 0 condition, as all the MOD_QUIESCE calls will be _before_ the last MOD_UNLOAD where we clean things up.
(In reply to Roman Bogorodskiy from comment #14) With some debug logging it looks like this (printf's + dmesg): load: loaded = 0 load: loaded = 1 vtnet0: <VirtIO Networking Adapter> on virtio_pci0 vtnet0: Ethernet address: 52:54:00:f5:03:43 vtnet0: link state changed to UP quiesce: loaded = 2 quiesce: loaded = 2 quiesce: loaded = 2 quiesce: loaded = 2 vtnet0: default vnet vtnet0: link state changed to DOWN vtnet0: detached unload: loaded = 1 unload: uma_zdestroy unload: loaded = 0 ("load" logs are printed _before_ changing value of 'loaded', "unload" are printed _after_). There were a couple "Device busy" statues on unload before it succeeded.
A commit references this bug: Author: kp Date: Sat May 14 06:07:15 UTC 2016 New revision: 299725 URL: https://svnweb.freebsd.org/changeset/base/299725 Log: vtnet: fix panic on unload Since r276367 added the virtio_mmio support vtnet_modevent() gets called twice. This resulted in a memory leak during load and a panic on unload. Count the loads so we only initialise once (just like cxgbe(4)), and only clean up in the final unload. PR: 209428 Submitted by: novel@FreeBSD.org MFC after: 1 week Changes: head/sys/dev/virtio/network/if_vtnet.c
Kristof, thanks for reviewing and committing this!
Thanks for the fix Roman. I was looking to MFC this, but in stable/10 there's only one DRIVER_MODULE(), so there's no problem there.
(In reply to Kristof Provost from comment #18) Thanks for helping to create a *proper* fix. If you're interested, I've noticed one more issue with if_vtnet(4): https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209427