Summary: | panic: Assertion in_epoch(net_epoch_preempt) failed at /usr/src/sys/net/if_vlan.c:1185 | ||
---|---|---|---|
Product: | Base System | Reporter: | Niels Bakker <niels=freebsd> |
Component: | kern | Assignee: | Aleksandr Fedorov <afedorov> |
Status: | Closed FIXED | ||
Severity: | Affects Only Me | CC: | afedorov, donner, glebius, markj, net |
Priority: | --- | Keywords: | crash |
Version: | 13.0-STABLE | Flags: | koobs: maintainer-feedback+, koobs: mfc-stable13- |
Hardware: | arm64 | ||
OS: | Any | ||
URL: | https://reviews.freebsd.org/D34185 | ||
See Also: |
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254695 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=248958 https://reviews.freebsd.org/D34185 |
Description
Niels Bakker
2021-07-09 00:15:41 UTC
I think this is a bug in the ng_pppoe(4) code. Disconnecting hooks are called outside of NET_EPOCH, but ng_pppoe_disconnect() calls NG_SEND_DATA_ONLY(), which should be called in NET_EPOCH. Please try the following patch:

diff --git a/sys/netgraph/ng_pppoe.c b/sys/netgraph/ng_pppoe.c
index 295a136cc55..e07f77b9d54 100644
--- a/sys/netgraph/ng_pppoe.c
+++ b/sys/netgraph/ng_pppoe.c
@@ -2037,6 +2037,7 @@ ng_pppoe_disconnect(hook_p hook)
                 log(LOG_NOTICE, "ng_pppoe[%x]: session out of "
                     "mbufs\n", node->nd_ID);
             else {
+                struct epoch_tracker et;
                 struct pppoe_full_hdr *wh;
                 struct pppoe_tag *tag;
                 int msglen = strlen(SIGNOFF);
@@ -2067,8 +2068,11 @@ ng_pppoe_disconnect(hook_p hook)
                 m->m_pkthdr.len = m->m_len = sizeof(*wh) + sizeof(*tag) +
                     msglen;
                 wh->ph.length = htons(sizeof(*tag) + msglen);
+
+                NET_EPOCH_ENTER(et);
                 NG_SEND_DATA_ONLY(error,
                     privp->ethernet_hook, m);
+                NET_EPOCH_EXIT(et);
             }
         }
         if (sp->state == PPPOE_LISTENING)

Test methodology: Unplug the cable carrying the PPPoE session from the NTU, plug it into a switch so the link stays up, and wait.

Observed behaviour pre-patch: same panic as ng0 gets torn down.

Observed behaviour with the above patch applied: normal operation continues after mpd tears down the link, and the PPPoE link comes back up as soon as the cable is plugged back into the NTU.

Success! Thanks, Alexandr Fedorov!

Apologies for the typo, Aleksandr

(In reply to Aleksandr Fedorov from comment #1)
There are several other places where ng_pppoe sends data on the ethernet hook without entering an epoch section - I think they all need to be updated?

(In reply to Mark Johnston from comment #4)
I think the previous fix closed all similar problems: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=248958 and https://reviews.freebsd.org/D26226

I have not yet found other problem areas in the ng_pppoe(4) code.

The main idea behind introducing NET_EPOCH in netgraph(4) is:
- Data processing (NG_SEND_DATA_*) should always be in the NET_EPOCH section.
- Processing of service messages (NG_SEND_MSG_*) should not be in the NET_EPOCH section, because message handling (*_rcvmsg, connect/disconnect hooks, node constructor/destructor) allows sleeping and other operations that are forbidden in a NET_EPOCH section.

But on many nodes, some data processing routines generate messages, and message processing routines generate data (as in this PR). Unfortunately, there are cases that are not so easy to fix without reworking ng_base.c.

(In reply to Aleksandr Fedorov from comment #5)
Is this the same issue as bug 248958? If so, I note the changes weren't tagged for MFC. Should those changes be merged to stable/* branches? If so, please re-open bug 248958, set mfc-stable{13,12} flags to "?" there, and close this issue as a duplicate of it.

Thanks

It's not the same bug. It manifests in a completely different file. At most it's the same class of bug introduced with epoch(9).

Same file but a different line, apologies again

A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=b27e6e91d0ad1f87b296f7583d4f5d938d7a997c

commit b27e6e91d0ad1f87b296f7583d4f5d938d7a997c
Author: Aleksandr Fedorov <afedorov@FreeBSD.org>
AuthorDate: 2022-02-09 19:00:50 +0000
Commit: Aleksandr Fedorov <afedorov@FreeBSD.org>
CommitDate: 2022-02-09 19:00:50 +0000

    ng pppoe(4): Add the required NET_EPOCH section to the hook disconnection function.

    Disconnecting hooks are called outside of NET_EPOCH, but ng_pppoe_disconnect() calls
    NG_SEND_DATA_ONLY() which should be called in NET_EPOCH.

    PR: 257067
    Reported by: niels=freebsd@bakker.net
    Reviewed by: vmaffione (mentor), glebius, donner
    Approved by: vmaffione (mentor), glebius, donner
    Sponsored by: vstack.com
    Differential Revision: https://reviews.freebsd.org/D34185

 sys/netgraph/ng_pppoe.c | 4 ++++
 1 file changed, 4 insertions(+)

A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=85cd9f7e989bab02aee32a5f25d517567a0cc928

commit 85cd9f7e989bab02aee32a5f25d517567a0cc928
Author: Aleksandr Fedorov <afedorov@FreeBSD.org>
AuthorDate: 2022-02-09 19:00:50 +0000
Commit: Aleksandr Fedorov <afedorov@FreeBSD.org>
CommitDate: 2022-02-13 12:05:45 +0000

    ng pppoe(4): Add the required NET_EPOCH section to the hook disconnection function.

    Disconnecting hooks are called outside of NET_EPOCH, but ng_pppoe_disconnect() calls
    NG_SEND_DATA_ONLY() which should be called in NET_EPOCH.

    PR: 257067
    Reported by: niels=freebsd@bakker.net
    Reviewed by: vmaffione (mentor), glebius, donner
    Approved by: vmaffione (mentor), glebius, donner
    Sponsored by: vstack.com
    Differential Revision: https://reviews.freebsd.org/D34185

    (cherry picked from commit b27e6e91d0ad1f87b296f7583d4f5d938d7a997c)

 sys/netgraph/ng_pppoe.c | 4 ++++
 1 file changed, 4 insertions(+)

stable/12 doesn't support NET_EPOCH, so there is no need to do MFC. |
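
For readers unfamiliar with the rule discussed in this PR (data sends must happen inside a NET_EPOCH section, while disconnect hooks run outside one), the following is a rough userland model of the failure and the fix. It is only a sketch: every `mock_*` name, the `MOCK_EPOCH_*` macros, and the two `disconnect_*` functions are hypothetical stand-ins invented for illustration, not the kernel's epoch(9) or netgraph(4) API, and the model counts violations where the real kernel would panic on the assertion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's epoch tracker. */
struct mock_epoch_tracker {
	bool prev;	/* epoch state saved on entry, restored on exit */
};

/* Hypothetical stand-in for "is this thread inside the network epoch?" */
static bool mock_in_epoch = false;

#define MOCK_EPOCH_ENTER(et) do {		\
	(et).prev = mock_in_epoch;		\
	mock_in_epoch = true;			\
} while (0)

#define MOCK_EPOCH_EXIT(et) do {		\
	mock_in_epoch = (et).prev;		\
} while (0)

/*
 * Models a data-path send. The real code asserts in_epoch() and panics;
 * this model only counts the violation so that it can be observed.
 */
static void
mock_send_data(int *violations)
{
	if (!mock_in_epoch)
		(*violations)++;
}

/* Models the pre-patch disconnect hook: sends data outside the epoch. */
static int
disconnect_unfixed(void)
{
	int violations = 0;

	mock_send_data(&violations);
	return (violations);
}

/* Models the patched disconnect hook: wraps only the send in the epoch. */
static int
disconnect_fixed(void)
{
	struct mock_epoch_tracker et;
	int violations = 0;

	MOCK_EPOCH_ENTER(et);
	mock_send_data(&violations);
	MOCK_EPOCH_EXIT(et);
	return (violations);
}
```

In this model `disconnect_unfixed()` reports one violation and `disconnect_fixed()` reports none. As in the patch, the epoch section covers only the send itself, so the rest of the hook stays outside it and remains free to do work that is not allowed inside an epoch section.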