Summary: | iflib/ice panics when netmap/pkt-gen exits | ||
---|---|---|---|
Product: | Base System | Reporter: | Brian Poole <brian90013> |
Component: | kern | Assignee: | Vincenzo Maffione <vmaffione> |
Status: | Closed FIXED | ||
Severity: | Affects Only Me | CC: | emaste, vmaffione |
Priority: | --- | Flags: | vmaffione:
maintainer-feedback+
vmaffione: mfc-stable12+ |
Version: | 12.2-RELEASE | ||
Hardware: | amd64 | ||
OS: | Any | ||
Bug Depends on: | |||
Bug Blocks: | 252453 |
Description
Brian Poole
2021-01-08 14:33:27 UTC
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=3189ba61673af5bd3ed2ee9e33be92bc0b9995ba commit 3189ba61673af5bd3ed2ee9e33be92bc0b9995ba Author: Vincenzo Maffione <vmaffione@FreeBSD.org> AuthorDate: 2021-01-09 21:35:07 +0000 Commit: Vincenzo Maffione <vmaffione@FreeBSD.org> CommitDate: 2021-01-09 21:35:07 +0000 netmap: iflib: fix asserts in netmap_fl_refill() When netmap_fl_refill() is called at initialization time (e.g., during netmap_iflib_register()), nic_i must be 0, since the free list is reinitialized. At the end of the refill cycle, nic_i must still be zero, because exactly N descriptors (N is the ring size) are refilled. This patch therefore fixes the assertions to check on nic_i rather than on nm_i. The current netmap_reset() may in fact cause nm_i to be != 0 while the device is resetting: this may happen when multiple non-cooperating processes open different subsets of the available netmap rings. PR: 252518 MFC after: 1 week sys/net/iflib.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=7ba6ecf216fb15e8b147db268b91d9b82c5a0682 commit 7ba6ecf216fb15e8b147db268b91d9b82c5a0682 Author: Vincenzo Maffione <vmaffione@FreeBSD.org> AuthorDate: 2021-01-09 22:07:24 +0000 Commit: Vincenzo Maffione <vmaffione@FreeBSD.org> CommitDate: 2021-01-09 22:07:24 +0000 netmap: refactor netmap_reset The netmap_reset() function is meant to be called by the driver when they initialize (or re-initialize) a hardware ring. However, since the introduction of support for opening (in netmap mode) a subset of the available rings, netmap_reset() may be called multiple times on actively used rings, causing both kring and netmap ring to transition to an inconsistent state. This changes improves the situation by resetting all the indices fields of the kring to 0, as expected after the reinitialization of a hardware ring. PR: 252518 MFC after: 1 week sys/dev/netmap/netmap.c | 65 +++++++++++++++---------------------------------- 1 file changed, 20 insertions(+), 45 deletions(-) Thanks for reporting. Could you please retry with the current HEAD? (or maybe on 12.x after applying 1d238b07d5d4d9660ae0e08daede6da7e91c7853, 3189ba61673af5bd3ed2ee9e33be92bc0b9995ba and 7ba6ecf216fb15e8b147db268b91d9b82c5a0682)? Thank you for the commits this weekend! After applying the three patches above, I can verify that running pkt-gen with multiple queues no longer panics in iflib on exit. I tested with 2-16 netmap threads and observed successful processing and termination in each case. I believe this issue is now fixed. Thanks for the confirmation. MFC needed A commit in branch stable/12 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=9882b83132007acc0e6cef6d248fb12cdebba278 commit 9882b83132007acc0e6cef6d248fb12cdebba278 Author: Vincenzo Maffione <vmaffione@FreeBSD.org> AuthorDate: 2021-01-09 20:54:11 +0000 Commit: Vincenzo Maffione <vmaffione@FreeBSD.org> CommitDate: 2021-01-17 12:02:44 +0000 netmap: iflib: stop krings during interface reset When different processes open separate subsets of the available rings of a same netmap interface, a device reset may be performed while one of the processes is actively using some rings (e.g., caused by another process executing a nmport_open()). With this patch, such situation will cause the active process to get a POLLERR, so that it can have a chance to detect the situation. We also guarantee that no process is running a txsync or rxsync (ioctl or poll) while an iflib device reset is in progress. PR: 252453 MFC after: 1 week (cherry picked from commit 1d238b07d5d4d9660ae0e08daede6da7e91c7853) netmap: iflib: fix asserts in netmap_fl_refill() When netmap_fl_refill() is called at initialization time (e.g., during netmap_iflib_register()), nic_i must be 0, since the free list is reinitialized. At the end of the refill cycle, nic_i must still be zero, because exactly N descriptors (N is the ring size) are refilled. This patch therefore fixes the assertions to check on nic_i rather than on nm_i. The current netmap_reset() may in fact cause nm_i to be != 0 while the device is resetting: this may happen when multiple non-cooperating processes open different subsets of the available netmap rings. PR: 252518 MFC after: 1 week (cherry picked from commit 3189ba61673af5bd3ed2ee9e33be92bc0b9995ba) netmap: refactor netmap_reset The netmap_reset() function is meant to be called by the driver when they initialize (or re-initialize) a hardware ring. However, since the introduction of support for opening (in netmap mode) a subset of the available rings, netmap_reset() may be called multiple times on actively used rings, causing both kring and netmap ring to transition to an inconsistent state. This changes improves the situation by resetting all the indices fields of the kring to 0, as expected after the reinitialization of a hardware ring. PR: 252518 MFC after: 1 week (cherry picked from commit 7ba6ecf216fb15e8b147db268b91d9b82c5a0682) netmap: iflib: enable/disable krings on any interface reinit Since 1d238b07d5d4d9660ae0e0, krings are disabled before a reinit cycle triggered by iflib_netmap_register. However, this operation is actually necessary also for any interface reinit triggered by other causes (i.e., ifconfig commands). We achieve this goal by moving the krings enable/disable operation inside iflib_stop() and iflib_init_locked(). Once here, this change also removes some redundant operations from iflib_netmap_register(), that are already performed by iflib_stop(). PR: 252453 MFC after: 1 week (cherry picked from commit 3d65fd97e85ab807f3baa62aa979ae527914a3ea) iflib: fix build failure in case DEV_NETMAP is not defined This addresses the build failure introduced by 3d65fd97e85ab807f3baa62. MFC with: 3d65fd97e85ab807f3baa62 (cherry picked from commit 8aa8484cbf06bb26ad87177fe4aebcaf1519f138) netmap: restore hwofs and support it in iflib Restore the hwofs functionality temporarily disabled by 7ba6ecf216fb15e8b147db2 to prevent issues with iflib. This patch brings the necessary changes to iflib to enable howfs to allow interface restarts without disrupting netmap applications actively using its rings. After this change, it becomes possible for multiple non-cooperating netmap applications to use non-overlapping subsets of the available netmap rings without clashing with each other. PR: 252453 MFC after: 1 week (cherry picked from commit 55f0ad5fdee1a779d6889481ba591a819081b9ca) sys/dev/netmap/netmap.c | 94 +++++++++++++++++++++++-------------------------- sys/net/iflib.c | 67 +++++++++++++++++++++++++---------- 2 files changed, 94 insertions(+), 67 deletions(-) A commit in branch stable/12 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=9882b83132007acc0e6cef6d248fb12cdebba278 commit 9882b83132007acc0e6cef6d248fb12cdebba278 Author: Vincenzo Maffione <vmaffione@FreeBSD.org> AuthorDate: 2021-01-09 20:54:11 +0000 Commit: Vincenzo Maffione <vmaffione@FreeBSD.org> CommitDate: 2021-01-17 12:02:44 +0000 netmap: iflib: stop krings during interface reset When different processes open separate subsets of the available rings of a same netmap interface, a device reset may be performed while one of the processes is actively using some rings (e.g., caused by another process executing a nmport_open()). With this patch, such situation will cause the active process to get a POLLERR, so that it can have a chance to detect the situation. We also guarantee that no process is running a txsync or rxsync (ioctl or poll) while an iflib device reset is in progress. PR: 252453 MFC after: 1 week (cherry picked from commit 1d238b07d5d4d9660ae0e08daede6da7e91c7853) netmap: iflib: fix asserts in netmap_fl_refill() When netmap_fl_refill() is called at initialization time (e.g., during netmap_iflib_register()), nic_i must be 0, since the free list is reinitialized. At the end of the refill cycle, nic_i must still be zero, because exactly N descriptors (N is the ring size) are refilled. This patch therefore fixes the assertions to check on nic_i rather than on nm_i. The current netmap_reset() may in fact cause nm_i to be != 0 while the device is resetting: this may happen when multiple non-cooperating processes open different subsets of the available netmap rings. PR: 252518 MFC after: 1 week (cherry picked from commit 3189ba61673af5bd3ed2ee9e33be92bc0b9995ba) netmap: refactor netmap_reset The netmap_reset() function is meant to be called by the driver when they initialize (or re-initialize) a hardware ring. However, since the introduction of support for opening (in netmap mode) a subset of the available rings, netmap_reset() may be called multiple times on actively used rings, causing both kring and netmap ring to transition to an inconsistent state. This changes improves the situation by resetting all the indices fields of the kring to 0, as expected after the reinitialization of a hardware ring. PR: 252518 MFC after: 1 week (cherry picked from commit 7ba6ecf216fb15e8b147db268b91d9b82c5a0682) netmap: iflib: enable/disable krings on any interface reinit Since 1d238b07d5d4d9660ae0e0, krings are disabled before a reinit cycle triggered by iflib_netmap_register. However, this operation is actually necessary also for any interface reinit triggered by other causes (i.e., ifconfig commands). We achieve this goal by moving the krings enable/disable operation inside iflib_stop() and iflib_init_locked(). Once here, this change also removes some redundant operations from iflib_netmap_register(), that are already performed by iflib_stop(). PR: 252453 MFC after: 1 week (cherry picked from commit 3d65fd97e85ab807f3baa62aa979ae527914a3ea) iflib: fix build failure in case DEV_NETMAP is not defined This addresses the build failure introduced by 3d65fd97e85ab807f3baa62. MFC with: 3d65fd97e85ab807f3baa62 (cherry picked from commit 8aa8484cbf06bb26ad87177fe4aebcaf1519f138) netmap: restore hwofs and support it in iflib Restore the hwofs functionality temporarily disabled by 7ba6ecf216fb15e8b147db2 to prevent issues with iflib. This patch brings the necessary changes to iflib to enable howfs to allow interface restarts without disrupting netmap applications actively using its rings. After this change, it becomes possible for multiple non-cooperating netmap applications to use non-overlapping subsets of the available netmap rings without clashing with each other. PR: 252453 MFC after: 1 week (cherry picked from commit 55f0ad5fdee1a779d6889481ba591a819081b9ca) sys/dev/netmap/netmap.c | 94 +++++++++++++++++++++++-------------------------- sys/net/iflib.c | 67 +++++++++++++++++++++++++---------- 2 files changed, 94 insertions(+), 67 deletions(-) |