| Summary: | [panic] NULL pointer dereference in nd6_llinfo_timer() | ||
|---|---|---|---|
| Product: | Base System | Reporter: | Andrey V. Elsukov <ae> |
| Component: | kern | Assignee: | Andrey V. Elsukov <ae> |
| Status: | Closed FIXED | ||
| Severity: | Affects Only Me | CC: | hselasky, markj, net, rgrimes, vegeta |
| Priority: | --- | Keywords: | crash |
| Version: | CURRENT | Flags: | koobs:
mfc-stable11+
|
| Hardware: | Any | ||
| OS: | Any | ||
| See Also: | https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209682 | ||
|
Description
Andrey V. Elsukov
2018-02-15 17:09:16 UTC
Add lle_tbl
(kgdb) p *ln->lle_tbl
$4 = {llt_link = {sle_next = 0x2}, llt_af = 33770267, llt_hsize = 0, lle_head = 0x0, llt_ifp = 0x0, llt_lookup = 0x0, llt_alloc_entry = 0x0,
llt_delete_entry = 0x8013070200010001, llt_prefix_free = 0x0, llt_dump_entry = 0x0, llt_hash = 0x0, llt_match_prefix = 0xc4f59, llt_free_entry = 0x1,
llt_foreach_entry = 0x48c0a4d8d9, llt_link_entry = 0x290e5e84a44f, llt_unlink_entry = 0xdf61b3ed0ad2e, llt_fill_sa_entry = 0x388ec15688a8057, llt_free_tbl = 0x0}
Is this related to: https://reviews.freebsd.org/D4605 --HPS Can you try the patch in D4605 ? (In reply to Hans Petter Selasky from comment #3) > Can you try the patch in D4605 ? This problem is not easy to trigger and I'm not sure that the patch solves it. I think how it happens: on ifnet departure in6_domifdetach() is called, then lltable_free() unlinks lltable and acquires LLE_WLOCK for each llentry. While we are acquiring all these locks it is possible, that callout for some entry has been started and blocked on the LLE_WLOCK() since lock is held by lltable_free(). Then llentry_free() releases LLE_WLOCK(), and nd6_llinfo_timer() acquires it. Now we have invalid ifp pointer, it is strange enough that it is NULL... About your patch, it seems the code: if (callout_stop(&lle->lle_timer) > 0) LLE_REMREF(lle); can be placed in the llentry_free() only once, and also removed from lltable_free(). But it doesn't protect from described scenario. I think we need somehow handle the case when nd6_llinfo_timer() is already active, but it hasn't obtained lock yet. There is a generic problem in the LLE code (IPv4 and IPv6) that it cannot sleep to drain the timers. In case an active timer is detected after callout_stop(), then the LL entry should be put on a taskqueue and callout_drain() _must_ be used for sane operation. I agree about your suggestion to move callout stop to a central place. For now I'll try this patch with your D4605.
@@ -780,8 +778,13 @@ nd6_llinfo_timer(void *arg)
CURVNET_RESTORE();
return;
}
- ndi = ND_IFINFO(ifp);
send_ns = 0;
+ /*
+ * Check that entry is still linked to LLE table.
+ */
+ if ((ln->la_flags & LLE_LINKED) == 0)
+ goto done;
+ ndi = ND_IFINFO(ifp);
dst = &ln->r_l3addr.addr6;
pdst = dst;
Is there any progress on this issue? Does the proposed patch really help? I had a similar stacktrace (no debug information though, sorry) when destroying a VLAN interface running some badly misconfigured (See bug 229384) carp IP addresses. (In reply to Kajetan Staszkiewicz from comment #8) > Is there any progress on this issue? Does the proposed patch really help? I > had a similar stacktrace (no debug information though, sorry) when > destroying a VLAN interface running some badly misconfigured (See bug > 229384) carp IP addresses. We have not seen such panic since then. But there were many changes regarding to locking in the network stack in CURRENT, and probably this patch now needs some changes. I'll take a look at this problem. A commit references this bug: Author: ae Date: Tue Jul 17 11:33:24 UTC 2018 New revision: 336405 URL: https://svnweb.freebsd.org/changeset/base/336405 Log: Move invoking of callout_stop(&lle->lle_timer) into llentry_free(). This deduplicates the code a bit, and also implicitly adds missing callout_stop() to in[6]_lltable_delete_entry() functions. PR: 209682, 225927 Submitted by: hselasky (previous version) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D4605 Changes: head/sys/net/if_llatbl.c head/sys/netinet/in.c head/sys/netinet6/in6.c Assign to committer resolving A commit references this bug: Author: ae Date: Wed Aug 8 16:09:29 UTC 2018 New revision: 337460 URL: https://svnweb.freebsd.org/changeset/base/337460 Log: MFC r336405: Move invoking of callout_stop(&lle->lle_timer) into llentry_free(). This deduplicates the code a bit, and also implicitly adds missing callout_stop() to in[6]_lltable_delete_entry() functions. PR: 209682, 225927 Changes: _U stable/11/ stable/11/sys/net/if_llatbl.c stable/11/sys/netinet/in.c stable/11/sys/netinet6/in6.c |