Bug 275920 - Kernel crash in sys/netlink/route/iface.c:124
Summary: Kernel crash in sys/netlink/route/iface.c:124
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 14.0-RELEASE
Hardware: arm64 Any
: --- Affects Some People
Assignee: freebsd-net (Nobody)
URL:
Keywords: crash
Depends on:
Blocks:
 
Reported: 2023-12-25 04:07 UTC by Mike Cui
Modified: 2024-05-17 17:25 UTC (History)
7 users (show)

See Also:


Attachments
kernel crashdump (55.83 KB, text/plain)
2023-12-25 04:07 UTC, Mike Cui
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Mike Cui 2023-12-25 04:07:14 UTC
Created attachment 247235 [details]
kernel crashdump

I have an espressobin v7 which was running fine on 13.2-RELEASE. However, after upgrading to 14.0-RELEASE, the kernel crashes with vm_fault immediately after loading the e6000sw module.

Attached is the full kernel crash dump core.txt file.
Comment 1 Mike Cui 2023-12-25 04:26:43 UTC
Presumably the crash happened sys/netlink/route/iface.c:124, in function get_operstate_ether() which calls if_ioctl(), and ifp->if_ioctl is NULL for some reason.

It also appears that NETLINK subsystem was added in 13.2, but NETLINK is not included in the base kernel in 13.2, but it is in 14.0. Is NETLINK required in 14.0? Can just compile out of the kernel myself?
Comment 2 Kristof Provost freebsd_committer freebsd_triage 2023-12-25 09:44:27 UTC
That's interesting. The cause is fairly obvious. e6000sw creates struct ifnets that don't have an ioctl handler, and that's triggering this crash.

The following is probably the best fix, assuming that the ioctl-less e6000sw ifnet is intentional:

diff --git a/sys/net/if.c b/sys/net/if.c
index 9f44223af0dd..c3c27fbf678f 100644
--- a/sys/net/if.c
+++ b/sys/net/if.c
@@ -4871,6 +4871,9 @@ if_resolvemulti(if_t ifp, struct sockaddr **srcs, struct sockaddr *dst)
 int
 if_ioctl(if_t ifp, u_long cmd, void *data)
 {
+       if (ifp->if_ioctl == NULL)
+               return (EOPNOTSUPP);
+
        return (ifp->if_ioctl(ifp, cmd, data));
 }
Comment 3 Gleb Smirnoff freebsd_committer freebsd_triage 2023-12-25 16:54:03 UTC
On 2023-12-25 09:44:27 UTC, kp@freebsd.org wrote:
> That's interesting. The cause is fairly obvious. e6000sw creates struct ifnets
> that don't have an ioctl handler, and that's triggering this crash.

Well, without Netlink, there were always ways to call if_ioctl() and
that would crash on stable/13.

Mike, can you please provide full trace of the crash? Can you please point
me to the sources of the e6000sw?
Comment 4 Kristof Provost freebsd_committer freebsd_triage 2023-12-25 17:18:12 UTC
That code lives in sys/dev/etherswitch/e6000sw/e6000sw.c

It creates a struct ifnet for each port in e6000sw_attach() / e6000sw_init_interface(). It never actually attached that ifnet though. I believe it's only created so e6000sw can call into the mii code, which is also how I think we eventually end up in the panicing stack. There's a link state event, which calls do_link_state_change() -> rtnl_handle_ifevent() -> dump_iface() -> get_operstate() -> get_operstate_ether(). That wants to know if the link is up or down, so it tries to ioctl(SIOCGIFMEDIA). Which doesn't go well if if_ioctl is NULL.

Here's the relevant bit of backtrace: 

#7  0x0000000000000000 in ?? ()
#8  0xffff0000006f87f4 in get_operstate_ether (ifp=0xffffa00002f7d000, 
    pstate=<optimized out>) at /usr/src/sys/netlink/route/iface.c:124
#9  get_operstate (ifp=0xffffa00002f7d000, pstate=<optimized out>)
    at /usr/src/sys/netlink/route/iface.c:181
#10 dump_iface (nw=nw@entry=0xffff0000877e0780, 
    ifp=ifp@entry=0xffffa00002f7d000, hdr=hdr@entry=0xffff0000877e07c0, 
    if_flags_mask=if_flags_mask@entry=0)
    at /usr/src/sys/netlink/route/iface.c:310
#11 0xffff0000006f80cc in rtnl_handle_ifevent (ifp=0xffffa00002f7d000, 
    nlmsg_type=<optimized out>, if_flags_mask=0)
    at /usr/src/sys/netlink/route/iface.c:1411
#12 0xffff0000005f9cb8 in do_link_state_change (arg=0xffffa00002f7d000, 
    pending=1) at /usr/src/sys/net/if.c:2181
#13 0xffff000000525bf0 in taskqueue_run_locked (
    queue=queue@entry=0xffffa0000136d300)
    at /usr/src/sys/kern/subr_taskqueue.c:512
#14 0xffff00000052594c in taskqueue_run (queue=0xffffa0000136d300)
    at /usr/src/sys/kern/subr_taskqueue.c:527
Comment 5 Mike Cui 2024-05-05 02:58:16 UTC
Any chance we can get this fixed for 14.1?

It would be nice if 14.1 worked out of the box without having to cross compile my own custom kernel.
Comment 6 commit-hook freebsd_committer freebsd_triage 2024-05-06 13:25:27 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=43387b4e574043b78a58c8bcb7575161b055fce1

commit 43387b4e574043b78a58c8bcb7575161b055fce1
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2024-05-06 09:39:08 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2024-05-06 09:39:08 +0000

    if: guard against if_ioctl being NULL

    There are situations where an struct ifnet has a NULL if_ioctl pointer.

    For example, e6000sw creates such struct ifnets for each of its ports so it can
    call into the MII code.

    If there is then a link state event this calls do_link_state_change()
    -> rtnl_handle_ifevent() -> dump_iface() -> get_operstate() ->
    get_operstate_ether(). That wants to know if the link is up or down, so it tries
    to ioctl(SIOCGIFMEDIA), which doesn't go well if if_ioctl is NULL.

    Guard against this, and return EOPNOTSUPP.

    PR:             275920
    MFC ater:       3 days
    Sponsored by:   Rubicon Communications, LLC ("Netgate")

 sys/net/if.c | 3 +++
 1 file changed, 3 insertions(+)
Comment 7 commit-hook freebsd_committer freebsd_triage 2024-05-12 16:13:15 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=9a8a26aefb366ef6f497d48547a1562a1de566c1

commit 9a8a26aefb366ef6f497d48547a1562a1de566c1
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2024-05-06 09:39:08 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2024-05-12 16:12:04 +0000

    if: guard against if_ioctl being NULL

    There are situations where an struct ifnet has a NULL if_ioctl pointer.

    For example, e6000sw creates such struct ifnets for each of its ports so it can
    call into the MII code.

    If there is then a link state event this calls do_link_state_change()
    -> rtnl_handle_ifevent() -> dump_iface() -> get_operstate() ->
    get_operstate_ether(). That wants to know if the link is up or down, so it tries
    to ioctl(SIOCGIFMEDIA), which doesn't go well if if_ioctl is NULL.

    Guard against this, and return EOPNOTSUPP.

    PR:             275920
    MFC ater:       3 days
    Sponsored by:   Rubicon Communications, LLC ("Netgate")

    (cherry picked from commit 43387b4e574043b78a58c8bcb7575161b055fce1)

 sys/net/if.c | 3 +++
 1 file changed, 3 insertions(+)
Comment 8 Mike Cui 2024-05-17 17:25:45 UTC
Gentle reminder: can this be cherry-picked to releng/14.1?