Summary: | iflib doesn't initialize multi-queue netmap properly | ||
---|---|---|---|
Product: | Base System | Reporter: | Brian Poole <brian90013> |
Component: | kern | Assignee: | Vincenzo Maffione <vmaffione> |
Status: | Closed FIXED | ||
Severity: | Affects Only Me | CC: | vmaffione |
Priority: | --- | Flags: | vmaffione:
maintainer-feedback+
vmaffione: mfc-stable12+ vmaffione: mfc-stable11- |
Version: | 12.2-RELEASE | ||
Hardware: | amd64 | ||
OS: | Any | ||
Bug Depends on: | 252518 | ||
Bug Blocks: |
Description
Brian Poole
2021-01-05 22:25:02 UTC
Hi, The behaviour you observed (resets on any netmap rings already opened or being opened) is the expected one considering how netmap works in general right now. Any time you open or close any rings of a given NIC, the netmap register routine is called. The purpose of this routine is to move the rings to netmap mode or back to normal mode. To do so, the interface is brought down, reinitialized and brought up again, similarly to what happens in case of interface ioctl. Now, since the interface is reinitialized, all the hardware rings are deallocated, and allocated/initialized again. This is why netmap_reset is called on both queues: initialization is part of the allocation phase (in the down-up cycle), which affects all the hardware rings. Now, the possibility to open only a subset of the available rings is something that was introduced to allow for a different use-case than yours: that is, for cases where you want to process only some queues in netmap mode (e.g. queues where you direct a DDoS attack by means of the hardware flow director), and leave the others attached to the host stack (normal mode). In your use case, on the other hand, you want two non-cooperating processes to open two different (non-overlapping) subsets of the available rings of a same NIC, and use both of them in netmap mode. Because of the netmap register behavior described above, this use case is not actually supported right now. You can have more processes using different subsets of the available rings, but they must be cooperating, and thus aware of each other. For example, you could have a single process to open many different subsets of the rings of a same NIC, without starting to operate on them (thus no poll, no *xsync); once all the nm_open (or equivalent) are done, you could pass the corresponding file descriptors to other processes that will do the actual processing. This is roughly what pkt-gen with -p N (N>1) does, except the processing is done by threads and not by separate processes. This said: - If you get a panic, that's a bug that we must fix. If you have a stack trace, it would be useful to attach it. - There is a protection against this situation (basically, your examples), that is already implemented on Linux, that at least generates an error for the first process when the second process tries to open some rings on the same NIC while the first is using poll() or *xsync(). This is not in FreeBSD yet, so I will try to add it. - There may be ways to actually support non-cooperating processes opening different subsets of the available rings of the same NIC, but we need to see if that's actually possible. In the meanwhile I would suggest sticking with the pkt-gen -p approach. cxgbe implements netmap in a different way (different from any other FreeBSD driver and Linux driver), but I don't know the details so I should also look into that. Cheers Vincenzo Apparently, cxgbe uses a simpler and very effective approach, where netmap rings are mapped to hw queues dedicated to netmap, and thus never used by the host stack. So there is no need to "switch" hw queues from netmap mode and normal mode or the other way around. For example, if the user asks for cc0-3, a new hardware queue (for both TX and RX) is allocated on the fly and mapped to the fourth netmap ring. And only that queue gets reset, because the netmap register callback does not need a full down-up cycle. Unfortunately, this approach requires proper support on the NIC side, or at least a static partition between the queues that are dedicated to the normal network stack operation, and the queues that are dedicated to netmap. Also, it can only work for NICs with multiple queues. Thank you for your response! My initial testing with multiple queues was done with the cxgbe driver so I didn't realize it was a special case. My use case is normally multiple threads in one process so I will try your approach to open all queues at the beginning then dole out file descriptors. I read this suggestion in the netmap GitHub issue 730 but realized it wasn't necessary on the cxgbe. I have documented the two panics I see in iflib/ice when shutting down pkt-gen with multiple threads as bug #252518. A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=1d238b07d5d4d9660ae0e08daede6da7e91c7853 commit 1d238b07d5d4d9660ae0e08daede6da7e91c7853 Author: Vincenzo Maffione <vmaffione@FreeBSD.org> AuthorDate: 2021-01-09 20:54:11 +0000 Commit: Vincenzo Maffione <vmaffione@FreeBSD.org> CommitDate: 2021-01-09 21:01:46 +0000 netmap: iflib: stop krings during interface reset When different processes open separate subsets of the available rings of a same netmap interface, a device reset may be performed while one of the processes is actively using some rings (e.g., caused by another process executing a nmport_open()). With this patch, such situation will cause the active process to get a POLLERR, so that it can have a chance to detect the situation. We also guarantee that no process is running a txsync or rxsync (ioctl or poll) while an iflib device reset is in progress. PR: 252453 MFC after: 1 week sys/dev/netmap/netmap.c | 2 +- sys/net/iflib.c | 9 +++++++++ 2 files changed, 10 insertions(+), 1 deletion(-) A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=3d65fd97e85ab807f3baa62aa979ae527914a3ea commit 3d65fd97e85ab807f3baa62aa979ae527914a3ea Author: Vincenzo Maffione <vmaffione@FreeBSD.org> AuthorDate: 2021-01-10 12:00:30 +0000 Commit: Vincenzo Maffione <vmaffione@FreeBSD.org> CommitDate: 2021-01-10 12:04:08 +0000 netmap: iflib: enable/disable krings on any interface reinit Since 1d238b07d5d4d9660ae0e0, krings are disabled before a reinit cycle triggered by iflib_netmap_register. However, this operation is actually necessary also for any interface reinit triggered by other causes (i.e., ifconfig commands). We achieve this goal by moving the krings enable/disable operation inside iflib_stop() and iflib_init_locked(). Once here, this change also removes some redundant operations from iflib_netmap_register(), that are already performed by iflib_stop(). PR: 252453 MFC after: 1 week sys/net/iflib.c | 35 ++++++++++++++++++++--------------- 1 file changed, 20 insertions(+), 15 deletions(-) A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=bb714db6d39583a9fbf5d11849c5e2365e7c0d80 commit bb714db6d39583a9fbf5d11849c5e2365e7c0d80 Author: Vincenzo Maffione <vmaffione@FreeBSD.org> AuthorDate: 2021-01-10 14:09:00 +0000 Commit: Vincenzo Maffione <vmaffione@FreeBSD.org> CommitDate: 2021-01-10 14:10:09 +0000 netmap: vtnet: enable/disable krings on any interface reinit See 3d65fd97e85ab807f3b for a detailed explanation. PR: 252453 MFC after: 1 week sys/dev/netmap/if_vtnet_netmap.h | 6 ------ sys/dev/netmap/netmap.c | 4 ++++ sys/dev/virtio/network/if_vtnet.c | 10 ++++++++++ 3 files changed, 14 insertions(+), 6 deletions(-) A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=55f0ad5fdee1a779d6889481ba591a819081b9ca commit 55f0ad5fdee1a779d6889481ba591a819081b9ca Author: Vincenzo Maffione <vmaffione@FreeBSD.org> AuthorDate: 2021-01-10 22:49:37 +0000 Commit: Vincenzo Maffione <vmaffione@FreeBSD.org> CommitDate: 2021-01-10 22:51:15 +0000 netmap: restore hwofs and support it in iflib Restore the hwofs functionality temporarily disabled by 7ba6ecf216fb15e8b147db2 to prevent issues with iflib. This patch brings the necessary changes to iflib to enable howfs to allow interface restarts without disrupting netmap applications actively using its rings. After this change, it becomes possible for multiple non-cooperating netmap applications to use non-overlapping subsets of the available netmap rings without clashing with each other. PR: 252453 MFC after: 1 week sys/dev/netmap/netmap.c | 47 ++++++++++++++++++++++++++++++++++------------- sys/net/iflib.c | 40 +++++++++++++++++++++++++++------------- 2 files changed, 61 insertions(+), 26 deletions(-) With the latest changes to HEAD, the non-overlapping non-cooperating processes use-case should be possible with iflib. Could you please have a try? Thank you very much for your help! I spent parts of my weekend thinking how I would adjust my code to create all the rings in one place then distribute the handles. I was surprised this morning to see you had solved the issue on the driver side. Excellent work! I ran multiple threads and processes, each connecting to a distinct ring of the same interface, and observed no unusual behavior. It may be still cleaner to have the nmport_open() done by the same thread/process, before processing starts, also because the link may flap during the opening phase. But that's your call. A commit in branch stable/12 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=9882b83132007acc0e6cef6d248fb12cdebba278 commit 9882b83132007acc0e6cef6d248fb12cdebba278 Author: Vincenzo Maffione <vmaffione@FreeBSD.org> AuthorDate: 2021-01-09 20:54:11 +0000 Commit: Vincenzo Maffione <vmaffione@FreeBSD.org> CommitDate: 2021-01-17 12:02:44 +0000 netmap: iflib: stop krings during interface reset When different processes open separate subsets of the available rings of a same netmap interface, a device reset may be performed while one of the processes is actively using some rings (e.g., caused by another process executing a nmport_open()). With this patch, such situation will cause the active process to get a POLLERR, so that it can have a chance to detect the situation. We also guarantee that no process is running a txsync or rxsync (ioctl or poll) while an iflib device reset is in progress. PR: 252453 MFC after: 1 week (cherry picked from commit 1d238b07d5d4d9660ae0e08daede6da7e91c7853) netmap: iflib: fix asserts in netmap_fl_refill() When netmap_fl_refill() is called at initialization time (e.g., during netmap_iflib_register()), nic_i must be 0, since the free list is reinitialized. At the end of the refill cycle, nic_i must still be zero, because exactly N descriptors (N is the ring size) are refilled. This patch therefore fixes the assertions to check on nic_i rather than on nm_i. The current netmap_reset() may in fact cause nm_i to be != 0 while the device is resetting: this may happen when multiple non-cooperating processes open different subsets of the available netmap rings. PR: 252518 MFC after: 1 week (cherry picked from commit 3189ba61673af5bd3ed2ee9e33be92bc0b9995ba) netmap: refactor netmap_reset The netmap_reset() function is meant to be called by the driver when they initialize (or re-initialize) a hardware ring. However, since the introduction of support for opening (in netmap mode) a subset of the available rings, netmap_reset() may be called multiple times on actively used rings, causing both kring and netmap ring to transition to an inconsistent state. This changes improves the situation by resetting all the indices fields of the kring to 0, as expected after the reinitialization of a hardware ring. PR: 252518 MFC after: 1 week (cherry picked from commit 7ba6ecf216fb15e8b147db268b91d9b82c5a0682) netmap: iflib: enable/disable krings on any interface reinit Since 1d238b07d5d4d9660ae0e0, krings are disabled before a reinit cycle triggered by iflib_netmap_register. However, this operation is actually necessary also for any interface reinit triggered by other causes (i.e., ifconfig commands). We achieve this goal by moving the krings enable/disable operation inside iflib_stop() and iflib_init_locked(). Once here, this change also removes some redundant operations from iflib_netmap_register(), that are already performed by iflib_stop(). PR: 252453 MFC after: 1 week (cherry picked from commit 3d65fd97e85ab807f3baa62aa979ae527914a3ea) iflib: fix build failure in case DEV_NETMAP is not defined This addresses the build failure introduced by 3d65fd97e85ab807f3baa62. MFC with: 3d65fd97e85ab807f3baa62 (cherry picked from commit 8aa8484cbf06bb26ad87177fe4aebcaf1519f138) netmap: restore hwofs and support it in iflib Restore the hwofs functionality temporarily disabled by 7ba6ecf216fb15e8b147db2 to prevent issues with iflib. This patch brings the necessary changes to iflib to enable howfs to allow interface restarts without disrupting netmap applications actively using its rings. After this change, it becomes possible for multiple non-cooperating netmap applications to use non-overlapping subsets of the available netmap rings without clashing with each other. PR: 252453 MFC after: 1 week (cherry picked from commit 55f0ad5fdee1a779d6889481ba591a819081b9ca) sys/dev/netmap/netmap.c | 94 +++++++++++++++++++++++-------------------------- sys/net/iflib.c | 67 +++++++++++++++++++++++++---------- 2 files changed, 94 insertions(+), 67 deletions(-) A commit in branch stable/12 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=9882b83132007acc0e6cef6d248fb12cdebba278 commit 9882b83132007acc0e6cef6d248fb12cdebba278 Author: Vincenzo Maffione <vmaffione@FreeBSD.org> AuthorDate: 2021-01-09 20:54:11 +0000 Commit: Vincenzo Maffione <vmaffione@FreeBSD.org> CommitDate: 2021-01-17 12:02:44 +0000 netmap: iflib: stop krings during interface reset When different processes open separate subsets of the available rings of a same netmap interface, a device reset may be performed while one of the processes is actively using some rings (e.g., caused by another process executing a nmport_open()). With this patch, such situation will cause the active process to get a POLLERR, so that it can have a chance to detect the situation. We also guarantee that no process is running a txsync or rxsync (ioctl or poll) while an iflib device reset is in progress. PR: 252453 MFC after: 1 week (cherry picked from commit 1d238b07d5d4d9660ae0e08daede6da7e91c7853) netmap: iflib: fix asserts in netmap_fl_refill() When netmap_fl_refill() is called at initialization time (e.g., during netmap_iflib_register()), nic_i must be 0, since the free list is reinitialized. At the end of the refill cycle, nic_i must still be zero, because exactly N descriptors (N is the ring size) are refilled. This patch therefore fixes the assertions to check on nic_i rather than on nm_i. The current netmap_reset() may in fact cause nm_i to be != 0 while the device is resetting: this may happen when multiple non-cooperating processes open different subsets of the available netmap rings. PR: 252518 MFC after: 1 week (cherry picked from commit 3189ba61673af5bd3ed2ee9e33be92bc0b9995ba) netmap: refactor netmap_reset The netmap_reset() function is meant to be called by the driver when they initialize (or re-initialize) a hardware ring. However, since the introduction of support for opening (in netmap mode) a subset of the available rings, netmap_reset() may be called multiple times on actively used rings, causing both kring and netmap ring to transition to an inconsistent state. This changes improves the situation by resetting all the indices fields of the kring to 0, as expected after the reinitialization of a hardware ring. PR: 252518 MFC after: 1 week (cherry picked from commit 7ba6ecf216fb15e8b147db268b91d9b82c5a0682) netmap: iflib: enable/disable krings on any interface reinit Since 1d238b07d5d4d9660ae0e0, krings are disabled before a reinit cycle triggered by iflib_netmap_register. However, this operation is actually necessary also for any interface reinit triggered by other causes (i.e., ifconfig commands). We achieve this goal by moving the krings enable/disable operation inside iflib_stop() and iflib_init_locked(). Once here, this change also removes some redundant operations from iflib_netmap_register(), that are already performed by iflib_stop(). PR: 252453 MFC after: 1 week (cherry picked from commit 3d65fd97e85ab807f3baa62aa979ae527914a3ea) iflib: fix build failure in case DEV_NETMAP is not defined This addresses the build failure introduced by 3d65fd97e85ab807f3baa62. MFC with: 3d65fd97e85ab807f3baa62 (cherry picked from commit 8aa8484cbf06bb26ad87177fe4aebcaf1519f138) netmap: restore hwofs and support it in iflib Restore the hwofs functionality temporarily disabled by 7ba6ecf216fb15e8b147db2 to prevent issues with iflib. This patch brings the necessary changes to iflib to enable howfs to allow interface restarts without disrupting netmap applications actively using its rings. After this change, it becomes possible for multiple non-cooperating netmap applications to use non-overlapping subsets of the available netmap rings without clashing with each other. PR: 252453 MFC after: 1 week (cherry picked from commit 55f0ad5fdee1a779d6889481ba591a819081b9ca) sys/dev/netmap/netmap.c | 94 +++++++++++++++++++++++-------------------------- sys/net/iflib.c | 67 +++++++++++++++++++++++++---------- 2 files changed, 94 insertions(+), 67 deletions(-) A commit in branch stable/12 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=9882b83132007acc0e6cef6d248fb12cdebba278 commit 9882b83132007acc0e6cef6d248fb12cdebba278 Author: Vincenzo Maffione <vmaffione@FreeBSD.org> AuthorDate: 2021-01-09 20:54:11 +0000 Commit: Vincenzo Maffione <vmaffione@FreeBSD.org> CommitDate: 2021-01-17 12:02:44 +0000 netmap: iflib: stop krings during interface reset When different processes open separate subsets of the available rings of a same netmap interface, a device reset may be performed while one of the processes is actively using some rings (e.g., caused by another process executing a nmport_open()). With this patch, such situation will cause the active process to get a POLLERR, so that it can have a chance to detect the situation. We also guarantee that no process is running a txsync or rxsync (ioctl or poll) while an iflib device reset is in progress. PR: 252453 MFC after: 1 week (cherry picked from commit 1d238b07d5d4d9660ae0e08daede6da7e91c7853) netmap: iflib: fix asserts in netmap_fl_refill() When netmap_fl_refill() is called at initialization time (e.g., during netmap_iflib_register()), nic_i must be 0, since the free list is reinitialized. At the end of the refill cycle, nic_i must still be zero, because exactly N descriptors (N is the ring size) are refilled. This patch therefore fixes the assertions to check on nic_i rather than on nm_i. The current netmap_reset() may in fact cause nm_i to be != 0 while the device is resetting: this may happen when multiple non-cooperating processes open different subsets of the available netmap rings. PR: 252518 MFC after: 1 week (cherry picked from commit 3189ba61673af5bd3ed2ee9e33be92bc0b9995ba) netmap: refactor netmap_reset The netmap_reset() function is meant to be called by the driver when they initialize (or re-initialize) a hardware ring. However, since the introduction of support for opening (in netmap mode) a subset of the available rings, netmap_reset() may be called multiple times on actively used rings, causing both kring and netmap ring to transition to an inconsistent state. This changes improves the situation by resetting all the indices fields of the kring to 0, as expected after the reinitialization of a hardware ring. PR: 252518 MFC after: 1 week (cherry picked from commit 7ba6ecf216fb15e8b147db268b91d9b82c5a0682) netmap: iflib: enable/disable krings on any interface reinit Since 1d238b07d5d4d9660ae0e0, krings are disabled before a reinit cycle triggered by iflib_netmap_register. However, this operation is actually necessary also for any interface reinit triggered by other causes (i.e., ifconfig commands). We achieve this goal by moving the krings enable/disable operation inside iflib_stop() and iflib_init_locked(). Once here, this change also removes some redundant operations from iflib_netmap_register(), that are already performed by iflib_stop(). PR: 252453 MFC after: 1 week (cherry picked from commit 3d65fd97e85ab807f3baa62aa979ae527914a3ea) iflib: fix build failure in case DEV_NETMAP is not defined This addresses the build failure introduced by 3d65fd97e85ab807f3baa62. MFC with: 3d65fd97e85ab807f3baa62 (cherry picked from commit 8aa8484cbf06bb26ad87177fe4aebcaf1519f138) netmap: restore hwofs and support it in iflib Restore the hwofs functionality temporarily disabled by 7ba6ecf216fb15e8b147db2 to prevent issues with iflib. This patch brings the necessary changes to iflib to enable howfs to allow interface restarts without disrupting netmap applications actively using its rings. After this change, it becomes possible for multiple non-cooperating netmap applications to use non-overlapping subsets of the available netmap rings without clashing with each other. PR: 252453 MFC after: 1 week (cherry picked from commit 55f0ad5fdee1a779d6889481ba591a819081b9ca) sys/dev/netmap/netmap.c | 94 +++++++++++++++++++++++-------------------------- sys/net/iflib.c | 67 +++++++++++++++++++++++++---------- 2 files changed, 94 insertions(+), 67 deletions(-) A commit in branch stable/12 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=fa1a4ff944f4b3e841df0cdd67c6373a589ffdd5 commit fa1a4ff944f4b3e841df0cdd67c6373a589ffdd5 Author: Vincenzo Maffione <vmaffione@FreeBSD.org> AuthorDate: 2021-01-09 22:34:10 +0000 Commit: Vincenzo Maffione <vmaffione@FreeBSD.org> CommitDate: 2021-01-17 13:38:09 +0000 netmap: vtnet: stop krings during interface reset Similarly to what done for iflib in 1d238b07d5d4d9660ae0e, this patch prevents access to the krings during the interface reset triggered by netmap_register(). MFC after: 1 week (cherry picked from commit 9ac59d42c0b4b6cd9c36a5dace7f49753c2e175a) netmap: vtnet: enable/disable krings on any interface reinit See 3d65fd97e85ab807f3b for a detailed explanation. PR: 252453 MFC after: 1 week (cherry picked from commit bb714db6d39583a9fbf5d11849c5e2365e7c0d80) netmap: vtnet: fix RX initialization after netmap_reset() At device reset, we must not publish those netmap receive buffers that are owned by userspace (nm_kr_rxspace). MFC after: 1 week (cherry picked from commit 3005e10ddbfbec3ecf46a080607bb0d85986eee5) sys/dev/netmap/if_vtnet_netmap.h | 21 +++++++++++++-------- sys/dev/netmap/netmap.c | 4 ++++ sys/dev/virtio/network/if_vtnet.c | 10 ++++++++++ 3 files changed, 27 insertions(+), 8 deletions(-) |