Summary: | if_vtnet(4) panics when doing kldunload if_vtnet | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | Base System | Reporter: | Roman Bogorodskiy <novel> | ||||||||||||
Component: | kern | Assignee: | freebsd-virtualization (Nobody) <virtualization> | ||||||||||||
Status: | Closed FIXED | ||||||||||||||
Severity: | Affects Only Me | CC: | Andrew, bryanv, kp, olevole, olevole | ||||||||||||
Priority: | --- | ||||||||||||||
Version: | CURRENT | ||||||||||||||
Hardware: | Any | ||||||||||||||
OS: | Any | ||||||||||||||
Attachments: |
|
Description
Roman Bogorodskiy
2016-05-10 15:14:13 UTC
Created attachment 170182 [details]
possible fix
Attaching a patch that fixes this issue for me.
That fix looked odd to me, because I did not expect vtnet_modevent() to be called twice with MOD_UNLOAD. That is the cause though, because there are two DRIVER_MODULE()s for vtnet. I think this problem was introduced with r276367. Cc-ing Andrew so he can take a look too. It's just occurred to me that if vtnet_modevent() gets called twice when it's unloaded it probably also gets called twice when the module is loaded. That means that it leaks the first vtnet_tx_header_zone allocated with uma_zcreate(). Created attachment 170217 [details]
updated patch
Yeah, you're right, vtnet_modevent() is called twice, so we don't need to duplicate uma_zcreate() calls indeed.
Oh, and actually it appears to work fine if I just do: DRIVER_MODULE(vtnet, virtio_mmio, vtnet_driver, vtnet_devclass, vtnet_modevent, 0); DRIVER_MODULE(vtnet, virtio_pci, vtnet_driver, vtnet_devclass, - vtnet_modevent, 0); + 0, 0); (In reply to Roman Bogorodskiy from comment #5) Yeah, but that's probably risky. I'm not sure the load order is guaranteed, and I'm also not sure that there's any guarantee that the probe function won't be executed until both are loaded. There's a couple other drivers with a similar setup. For example sys/dev/cxgbe/t4_main.c They've solved it with a simple counter in their mod_event function. I'd be inclined to do something similar. (In reply to Kristof Provost from comment #6) Thanks for the pointer, I'll try to test the same approach for vtnet and upload a new patch if things go well. Created attachment 170227 [details]
patch v3
Updated the patch to use an approach similar to the one used in cxgbe. Basic testing didn't reveal any issues.
That looks correct to me, although I'm not sure if the mutex is actually required. The code calling the module load/unload function seems to do so with Giant held. (In reply to Kristof Provost from comment #9) Ah, I think that's right: https://svnweb.freebsd.org/base/head/sys/kern/kern_module.c?revision=289111&view=markup#l260 It's not obvious why cxgbe uses it then. I *think* cxgbe is wrong (or overly cautious). I'd propose doing it without the mutex in vtnet. You're not going to make things any worse than the currently are by not adding it. Created attachment 170229 [details]
patch v4
Uploaded v4 of the patch, without mutex. Again, basic testing didn't show any issues.
Hmm, can you verify that uma_zdestroy() still gets called? I think the order is MOD_LOAD on load, but an unload first gets MOD_QUIESCE and then MOD_UNLOAD. That'd mean we do loaded++ twice, but --loaded four times. Created attachment 170230 [details]
patch v5
Good point, I've moved out EBUSY check out of the --loader == 0 condition, as all the MOD_QUIESCE calls will be _before_ the last MOD_UNLOAD where we clean things up.
(In reply to Roman Bogorodskiy from comment #14) With some debug logging it looks like this (printf's + dmesg): load: loaded = 0 load: loaded = 1 vtnet0: <VirtIO Networking Adapter> on virtio_pci0 vtnet0: Ethernet address: 52:54:00:f5:03:43 vtnet0: link state changed to UP quiesce: loaded = 2 quiesce: loaded = 2 quiesce: loaded = 2 quiesce: loaded = 2 vtnet0: default vnet vtnet0: link state changed to DOWN vtnet0: detached unload: loaded = 1 unload: uma_zdestroy unload: loaded = 0 ("load" logs are printed _before_ changing value of 'loaded', "unload" are printed _after_). There were a couple "Device busy" statues on unload before it succeeded. A commit references this bug: Author: kp Date: Sat May 14 06:07:15 UTC 2016 New revision: 299725 URL: https://svnweb.freebsd.org/changeset/base/299725 Log: vtnet: fix panic on unload Since r276367 added the virtio_mmio support vtnet_modevent() gets called twice. This resulted in a memory leak during load and a panic on unload. Count the loads so we only initialise once (just like cxgbe(4)), and only clean up in the final unload. PR: 209428 Submitted by: novel@FreeBSD.org MFC after: 1 week Changes: head/sys/dev/virtio/network/if_vtnet.c Kristof, thanks for reviewing and committing this! Thanks for the fix Roman. I was looking to MFC this, but in stable/10 there's only one DRIVER_MODULE(), so there's no problem there. (In reply to Kristof Provost from comment #18) Thanks for helping to create a *proper* fix. If you're interested, I've noticed one more issue with if_vtnet(4): https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209427 |