FreeBSD-head-amd64-test job starts randomly failing after:
https://ci.freebsd.org/job/FreeBSD-head-amd64-test/14511/

and following builds:
https://ci.freebsd.org/job/FreeBSD-head-amd64-test/14512/
https://ci.freebsd.org/job/FreeBSD-head-amd64-test/14515/
https://ci.freebsd.org/job/FreeBSD-head-amd64-test/14516/
https://ci.freebsd.org/job/FreeBSD-head-amd64-test/14517/
...
https://ci.freebsd.org/job/FreeBSD-head-amd64-test/14528/

Console log:

sys/netpfil/pf/nat:exhaust  ->  panic: epair_qflush: ifp=0xfffff800ae6d5800, epair_softc gone? sc=0
cpuid = 0
time = 1583444839
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe002c6fe880
vpanic() at vpanic+0x185/frame 0xfffffe002c6fe8e0
panic() at panic+0x43/frame 0xfffffe002c6fe940
epair_qflush() at epair_qflush+0x1a8/frame 0xfffffe002c6fe990
if_down() at if_down+0x12d/frame 0xfffffe002c6fe9c0
if_detach_internal() at if_detach_internal+0x2de/frame 0xfffffe002c6fea20
if_vmove() at if_vmove+0x3c/frame 0xfffffe002c6fea70
vnet_if_return() at vnet_if_return+0x50/frame 0xfffffe002c6fea90
vnet_destroy() at vnet_destroy+0x130/frame 0xfffffe002c6feac0
prison_deref() at prison_deref+0x29d/frame 0xfffffe002c6feb00
taskqueue_run_locked() at taskqueue_run_locked+0xaa/frame 0xfffffe002c6feb80
taskqueue_thread_loop() at taskqueue_thread_loop+0xc2/frame 0xfffffe002c6febb0
fork_exit() at fork_exit+0x80/frame 0xfffffe002c6febf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe002c6febf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 0 tid 100011 ]
Stopped at      kdb_enter+0x37: movq    $0,0x10874b6(%rip)
db:0:kdb.enter.panic>  show pcpu
cpuid        = 0
dynamic pcpu = 0x788d80
curthread    = 0xfffffe0008688c00: pid 0 tid 100011 critnest 1 "thread taskq"
curpcb       = 0xfffffe0008689110
fpcurthread  = none
idlethread   = 0xfffffe000a1f6300: tid 100003 "idle: cpu0"
self         = 0xffffffff82210000
curpmap      = 0xffffffff81d9ea50
tssp         = 0xffffffff82210384
rsp0         = 0xfffffe002c6fecc0
kcr3         = 0x8000000002119002
ucr3         = 0xffffffffffffffff
scr3         = 0x1012c8f99
gs32p        = 0xffffffff82210404
ldt          = 0xffffffff82210444
tss          = 0xffffffff82210434
tlb gen      = 307136
curvnet      = 0xfffff80003438380
spin locks held:
db:0:kdb.enter.panic>

It can be reproduced by running `kyua test sys.netpfil.pf.nat.exhaust` in a loop in the VM:
https://artifact.ci.freebsd.org/snapshot/head/r358683/amd64/amd64/disk-test.img.xz
A commit references this bug:

Author: lwhsu
Date: Tue Mar 10 19:18:25 UTC 2020
New revision: 358852
URL: https://svnweb.freebsd.org/changeset/base/358852

Log:
  Skip sys.netpfil.pf.nat.exhaust on amd64 in CI as it sometimes panics kernel

  PR: 244703
  Sponsored by: The FreeBSD Foundation

Changes:
  head/tests/sys/netpfil/pf/nat.sh
This looks like it's the same issue as #238870
A commit references this bug:

Author: lwhsu
Date: Thu Mar 12 19:10:54 UTC 2020
New revision: 358918
URL: https://svnweb.freebsd.org/changeset/base/358918

Log:
  MFC r358852:

  Skip sys.netpfil.pf.nat.exhaust on amd64 in CI as it sometimes panics kernel

  PR: 244703
  Sponsored by: The FreeBSD Foundation

Changes:
_U  stable/12/
  stable/12/tests/sys/netpfil/pf/nat.sh
A commit references this bug:

Author: lwhsu
Date: Fri Mar 13 16:44:48 UTC 2020
New revision: 358961
URL: https://svnweb.freebsd.org/changeset/base/358961

Log:
  Skip sys.netpfil.pf.nat.exhaust on all platforms as it not only fails on amd64

  PR: 244703
  Sponsored by: The FreeBSD Foundation

Changes:
  head/tests/sys/netpfil/pf/nat.sh
A commit references this bug:

Author: lwhsu
Date: Fri Mar 13 17:10:53 UTC 2020
New revision: 358964
URL: https://svnweb.freebsd.org/changeset/base/358964

Log:
  MFC r358961:

  Skip sys.netpfil.pf.nat.exhaust on all platforms as it not only fails on amd64

  PR: 244703
  Sponsored by: The FreeBSD Foundation

Changes:
_U  stable/12/
  stable/12/tests/sys/netpfil/pf/nat.sh
A commit references this bug:

Author: lwhsu
Date: Mon Apr 20 14:18:56 UTC 2020
New revision: 360120
URL: https://svnweb.freebsd.org/changeset/base/360120

Log:
  Temporarily disable sys.netinet.divert.* on i386

  PR: 244703
  Sponsored by: The FreeBSD Foundation

Changes:
  head/tests/sys/netinet/divert.sh
A commit references this bug:

Author: kp
Date: Tue Sep 8 14:54:11 UTC 2020
New revision: 365457
URL: https://svnweb.freebsd.org/changeset/base/365457

Log:
  net: mitigate vnet / epair cleanup races

  There's a race where dying vnets move their interfaces back to their original vnet, and if_epair cleanup (where deleting one interface also deletes the other end of the epair). This is commonly triggered by the pf tests, but also by cleanup of vnet jails.

  As we've not yet been able to fix the root cause of the issue work around the panic by not dereferencing a NULL softc in epair_qflush() and by not re-attaching DYING interfaces.

  This isn't a full fix, but makes a very common panic far less likely.

  PR: 244703, 238870
  Reviewed by: lutz_donnerhacke.de
  MFC after: 4 days
  Differential Revision: https://reviews.freebsd.org/D26324

Changes:
  head/sys/net/if.c
  head/sys/net/if_epair.c
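
For readers following along, the "not dereferencing a NULL softc" part of the mitigation described in r365457 can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the committed diff: `model_softc`, `model_ifnet` and `model_qflush()` are hypothetical stand-ins for the kernel structures and for epair_qflush(), and the return values are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the per-interface driver state. */
struct model_softc {
	int queued;			/* packets waiting in the model queue */
};

/* Hypothetical, simplified stand-in for struct ifnet. */
struct model_ifnet {
	struct model_softc *if_softc;	/* NULL once the other side tore it down */
};

/*
 * Model of a qflush routine that tolerates a missing softc.
 * Returns the number of packets dropped, or -1 if the softc is already gone.
 */
int
model_qflush(struct model_ifnet *ifp)
{
	struct model_softc *sc = ifp->if_softc;
	int dropped;

	/* Bail out instead of panicking (or dereferencing NULL). */
	if (sc == NULL)
		return (-1);

	dropped = sc->queued;
	sc->queued = 0;
	return (dropped);
}
```

With a live softc the model flushes the queue as usual; with `if_softc` set to NULL it returns early, which corresponds to the case that produced the "epair_softc gone?" panic in the console log above. As the commit message says, this only avoids the crash and leaves the underlying race in place.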
A commit references this bug:

Author: kp
Date: Sat Sep 12 12:45:32 UTC 2020
New revision: 365659
URL: https://svnweb.freebsd.org/changeset/base/365659

Log:
  MFC r365457:

  net: mitigate vnet / epair cleanup races

  There's a race where dying vnets move their interfaces back to their original vnet, and if_epair cleanup (where deleting one interface also deletes the other end of the epair). This is commonly triggered by the pf tests, but also by cleanup of vnet jails.

  As we've not yet been able to fix the root cause of the issue work around the panic by not dereferencing a NULL softc in epair_qflush() and by not re-attaching DYING interfaces.

  This isn't a full fix, but makes a very common panic far less likely.

  PR: 244703, 238870

Changes:
_U  stable/12/
  stable/12/sys/net/if.c
  stable/12/sys/net/if_epair.c
A commit references this bug:

Author: kp
Date: Sat Sep 12 18:58:36 UTC 2020
New revision: 365669
URL: https://svnweb.freebsd.org/changeset/base/365669

Log:
  MFC r365457:

  net: mitigate vnet / epair cleanup races

  There's a race where dying vnets move their interfaces back to their original vnet, and if_epair cleanup (where deleting one interface also deletes the other end of the epair). This is commonly triggered by the pf tests, but also by cleanup of vnet jails.

  As we've not yet been able to fix the root cause of the issue work around the panic by not dereferencing a NULL softc in epair_qflush() and by not re-attaching DYING interfaces.

  This isn't a full fix, but makes a very common panic far less likely.

  PR: 244703, 238870
  Approved by: re (gjb)

Changes:
_U  releng/12.2/
  releng/12.2/sys/net/if.c
  releng/12.2/sys/net/if_epair.c
A commit references this bug:

Author: kp
Date: Tue Dec 1 16:24:00 UTC 2020
New revision: 368237
URL: https://svnweb.freebsd.org/changeset/base/368237

Log:
  if: Fix panic when destroying vnet and epair simultaneously

  When destroying a vnet and an epair (with one end in the vnet) we often panicked. This was the result of the destruction of the epair, which destroys both ends simultaneously, happening while vnet_if_return() was moving the struct ifnet to its home vnet. This can result in a freed ifnet being re-added to the home vnet V_ifnet list. That in turn panics the next time the ifnet is used.

  Prevent this race by ensuring that vnet_if_return() cannot run at the same time as if_detach() or epair_clone_destroy().

  PR: 238870, 234985, 244703, 250870
  MFC after: 2 weeks
  Sponsored by: Modirum MDPay
  Differential Revision: https://reviews.freebsd.org/D27378

Changes:
  head/sys/net/if.c
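
The idea in r368237, that returning an interface to its home vnet must not overlap with detaching it, can be modelled in userspace with a single lock. This is a hedged sketch only: the lock type, `race_ifnet`, `model_if_detach()` and `model_vnet_if_return()` are hypothetical names chosen for illustration, and the kernel change may use a different locking primitive and different checks.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model: one lock serializes "return to home vnet" and "detach". */
static pthread_mutex_t model_detach_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical, simplified stand-in for struct ifnet. */
struct race_ifnet {
	bool detached;		/* set once the interface has been torn down */
	bool on_home_list;	/* true while linked into the home vnet's list */
};

/* Model of destroying the interface, e.g. as part of epair teardown. */
void
model_if_detach(struct race_ifnet *ifp)
{
	pthread_mutex_lock(&model_detach_lock);
	ifp->on_home_list = false;
	ifp->detached = true;
	pthread_mutex_unlock(&model_detach_lock);
}

/*
 * Model of a dying vnet handing the interface back to its home vnet.
 * Returns true if the interface was re-attached, false if it was already gone.
 */
bool
model_vnet_if_return(struct race_ifnet *ifp)
{
	bool moved = false;

	pthread_mutex_lock(&model_detach_lock);
	if (!ifp->detached) {
		/* Safe: detach cannot run until the lock is released. */
		ifp->on_home_list = true;
		moved = true;
	}
	pthread_mutex_unlock(&model_detach_lock);
	return (moved);
}
```

Because both paths take the same lock, the return path sees the interface either fully attached or fully detached, never halfway through teardown. In the model, calling `model_if_detach()` first makes a later `model_vnet_if_return()` refuse to re-add the interface, which is the "freed ifnet re-added to the home vnet list" case the commit message describes.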
A commit references this bug:

Author: kp
Date: Tue Dec 15 15:33:29 UTC 2020
New revision: 368663
URL: https://svnweb.freebsd.org/changeset/base/368663

Log:
  MFC r368237:

  if: Fix panic when destroying vnet and epair simultaneously

  When destroying a vnet and an epair (with one end in the vnet) we often panicked. This was the result of the destruction of the epair, which destroys both ends simultaneously, happening while vnet_if_return() was moving the struct ifnet to its home vnet. This can result in a freed ifnet being re-added to the home vnet V_ifnet list. That in turn panics the next time the ifnet is used.

  Prevent this race by ensuring that vnet_if_return() cannot run at the same time as if_detach() or epair_clone_destroy().

  PR: 238870, 234985, 244703, 250870
  Sponsored by: Modirum MDPay

Changes:
_U  stable/12/
  stable/12/sys/net/if.c
A commit in branch releng/12.1 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=e0c15f45abd4bd5165e11b557a8c90d0faf5cfeb

commit e0c15f45abd4bd5165e11b557a8c90d0faf5cfeb
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2021-01-18 21:55:53 +0000
Commit:     Ed Maste <emaste@FreeBSD.org>
CommitDate: 2021-01-29 00:58:55 +0000

    MFC r368237:

    if: Fix panic when destroying vnet and epair simultaneously

    When destroying a vnet and an epair (with one end in the vnet) we often panicked. This was the result of the destruction of the epair, which destroys both ends simultaneously, happening while vnet_if_return() was moving the struct ifnet to its home vnet. This can result in a freed ifnet being re-added to the home vnet V_ifnet list. That in turn panics the next time the ifnet is used.

    Prevent this race by ensuring that vnet_if_return() cannot run at the same time as if_detach() or epair_clone_destroy().

    PR: 238870, 234985, 244703, 250870
    Sponsored by: Modirum MDPay
    Approved by: so

 sys/net/if.c     | 147 +++++++++++++++++++++++++++++++++++++------------------
 sys/net/if_var.h |  24 ++-------
 2 files changed, 104 insertions(+), 67 deletions(-)
A commit in branch releng/12.2 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=e682b62c96e94c60d830e4414215032e0d4f8dad

commit e682b62c96e94c60d830e4414215032e0d4f8dad
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2020-09-12 16:33:05 +0000
Commit:     Ed Maste <emaste@FreeBSD.org>
CommitDate: 2021-01-29 01:14:24 +0000

    MFC r368237:

    if: Fix panic when destroying vnet and epair simultaneously

    When destroying a vnet and an epair (with one end in the vnet) we often panicked. This was the result of the destruction of the epair, which destroys both ends simultaneously, happening while vnet_if_return() was moving the struct ifnet to its home vnet. This can result in a freed ifnet being re-added to the home vnet V_ifnet list. That in turn panics the next time the ifnet is used.

    Prevent this race by ensuring that vnet_if_return() cannot run at the same time as if_detach() or epair_clone_destroy().

    PR: 238870, 234985, 244703, 250870
    Sponsored by: Modirum MDPay
    Approved by: so

 sys/net/if.c     | 147 +++++++++++++++++++++++++++++++++++++------------------
 sys/net/if_var.h |  24 ++-------
 2 files changed, 104 insertions(+), 67 deletions(-)