Bug 274536 - panic: rt_tables_get_rnh_ptr: fam out of bounds (255 < 45)
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 15.0-CURRENT
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Gleb Smirnoff
URL:
Keywords: crash, needs-qa
Depends on:
Blocks: 247219
Reported: 2023-10-17 16:37 UTC by Edward Tomasz Napierala
Modified: 2024-03-29 23:51 UTC (History)

See Also:


Attachments
check if it is just a typo (487 bytes, application/octet-stream)
2024-03-15 18:30 UTC, Gleb Smirnoff

Description Edward Tomasz Napierala freebsd_committer freebsd_triage 2023-10-17 16:37:30 UTC
Trying to run the Linux Firefox binary from Ubuntu Focal seems to trigger a panic on amd64 FreeBSD 15:

__curthread ()
    at /usr/home/trasz/git/freebsd-src/sys/amd64/include/pcpu_aux.h:57
57		__asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread ()
    at /usr/home/trasz/git/freebsd-src/sys/amd64/include/pcpu_aux.h:57
#1  doadump (textdump=textdump@entry=1)
    at /usr/home/trasz/git/freebsd-src/sys/kern/kern_shutdown.c:405
#2  0xffffffff80b4ee90 in kern_reboot (howto=260)
    at /usr/home/trasz/git/freebsd-src/sys/kern/kern_shutdown.c:526
#3  0xffffffff80b4f38f in vpanic (
    fmt=0xffffffff811406d9 "%s: fam out of bounds (%d < %d)", 
    ap=ap@entry=0xfffffe00fc3a2b30)
    at /usr/home/trasz/git/freebsd-src/sys/kern/kern_shutdown.c:970
#4  0xffffffff80b4f133 in panic (fmt=<unavailable>)
    at /usr/home/trasz/git/freebsd-src/sys/kern/kern_shutdown.c:894
#5  0xffffffff80cbbee3 in rt_tables_get_rnh_ptr (table=<optimized out>, 
    family=<optimized out>)
    at /usr/home/trasz/git/freebsd-src/sys/net/route/route_tables.c:372
#6  rt_tables_get_rnh (table=<optimized out>, family=<optimized out>)
    at /usr/home/trasz/git/freebsd-src/sys/net/route/route_tables.c:387
#7  0xffffffff80dc626d in dump_rtable_fib (wa=0xfffffe00fc3a2bc8, fibnum=0, 
    family=255) at /usr/home/trasz/git/freebsd-src/sys/netlink/route/rt.c:599
#8  handle_rtm_dump (nlp=0xfffff803cbde4700, fibnum=0, family=255, 
    hdr=0xfffff8039edb6800, nw=<unavailable>)
    at /usr/home/trasz/git/freebsd-src/sys/netlink/route/rt.c:682
#9  rtnl_handle_getroute (hdr=0xfffff8039edb6800, nlp=0xfffff803cbde4700, 
    npt=0xfffffe00fc3a2dc0)
    at /usr/home/trasz/git/freebsd-src/sys/netlink/route/rt.c:1028
#10 0xffffffff80dbe552 in rtnl_handle_message (hdr=0xfffff8039edb6800, 
    npt=0xfffffe00fc3a2dc0)
    at /usr/home/trasz/git/freebsd-src/sys/netlink/netlink_route.c:104
#11 0xffffffff80dbbeaa in nl_receive_message (hdr=0xfffff8039edb6800, 
    remaining_length=<optimized out>, nlp=0xfffff803cbde4700, 
    npt=0xfffffe00fc3a2dc0)
    at /usr/home/trasz/git/freebsd-src/sys/netlink/netlink_io.c:506
#12 nl_process_mbuf (m=0xfffff800099fd000, nlp=0xfffff803cbde4700)
    at /usr/home/trasz/git/freebsd-src/sys/netlink/netlink_io.c:580
#13 nl_process_received_one (nlp=0xfffff803cbde4700)
    at /usr/home/trasz/git/freebsd-src/sys/netlink/netlink_io.c:293
#14 nl_process_received (nlp=0xfffff803cbde4700)
    at /usr/home/trasz/git/freebsd-src/sys/netlink/netlink_io.c:320
#15 nl_taskqueue_handler (_arg=0xfffff803cbde4700, pending=<optimized out>)
    at /usr/home/trasz/git/freebsd-src/sys/netlink/netlink_io.c:371
#16 0xffffffff80bb497b in taskqueue_run_locked (
    queue=queue@entry=0xfffff8002400cc00)
    at /usr/home/trasz/git/freebsd-src/sys/kern/subr_taskqueue.c:512
#17 0xffffffff80bb5a33 in taskqueue_thread_loop (
    arg=arg@entry=0xfffff803cbde4760)
    at /usr/home/trasz/git/freebsd-src/sys/kern/subr_taskqueue.c:824
#18 0xffffffff80b04f02 in fork_exit (
    callout=0xffffffff80bb5960 <taskqueue_thread_loop>, 
    arg=0xfffff803cbde4760, frame=0xfffffe00fc3a2f40)
    at /usr/home/trasz/git/freebsd-src/sys/kern/kern_fork.c:1160
#19 <signal handler called>
#20 0x00000008011306c6 in ?? ()
Comment 1 Graham Perrin 2023-10-17 20:50:31 UTC
(In reply to Edward Tomasz Napierala from comment #0)

> amd64 FreeBSD 15:

Which version, exactly? Reproducible with an updated OS? 

(Reading this alongside bug 274538 comment 1.)
Comment 2 Edward Tomasz Napierala freebsd_committer freebsd_triage 2023-10-18 10:45:58 UTC
Ah; sorry; I've just verified it's still happening with:

FreeBSD pustak 15.0-CURRENT FreeBSD 15.0-CURRENT #69 main-n266018-d2abbfede534-dirty: Wed Oct 18 11:33:02 BST 2023     root@pustak:/usr/obj/usr/home/trasz/git/freebsd-src/amd64.amd64/sys/GENERIC amd64
Comment 3 John Baldwin freebsd_committer freebsd_triage 2023-10-20 20:13:18 UTC
I've cc'd melifaro@ as this looks to be related to netlink reading the routing tables.
Comment 4 Edward Tomasz Napierala freebsd_committer freebsd_triage 2024-03-14 13:42:58 UTC
This is still happening with yesterday's CURRENT.

It looks like the bug is caused by rtnl_handle_getroute() being called with family=255, which then trips the assertion in rt_tables_get_rnh_ptr().  I'm not sure where this value, which I assume came from userspace, should be handled?

I've pinged glebius@, he's done some Netlink work recently.
Comment 5 Gleb Smirnoff freebsd_committer freebsd_triage 2024-03-14 18:30:49 UTC
I guess the 255 is coming from sys/compat/linux/linux.c:linux_to_bsd_domain()
returning (-1).  Can you please first modify your kernel so that it doesn't
panic: e.g. in netlink_io.c after line 284 just return as if msg_from_linux
had failed.  Then please add a print of the actual value of domain in
linux_to_bsd_domain().
Comment 6 Edward Tomasz Napierala freebsd_committer freebsd_triage 2024-03-15 14:48:26 UTC
Thank you!  It's 17, which appears to be PF_PACKET.
Comment 7 Gleb Smirnoff freebsd_committer freebsd_triage 2024-03-15 18:10:12 UTC
Adding Dmitry and Alexander here.  TLDR version for them:

  An application (Firefox) sends NETLINK_ROUTE message with AF_PACKET
  in it.  linux_to_bsd_domain() fails to find an analog in FreeBSD,
  and returns 0xffffffff.  Later that truncates down to 0xff and 
  rt_tables_get_rnh_ptr() panics.

How should we fix that? At what level should we report the EOPNOTSUPP (or
maybe some other) error? My guess is that the check should live in NetLink,
because it is NetLink that doesn't check the return value of
linux_to_bsd_domain().  The latter honestly reports "I don't know".
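
For readers following along, the truncation described above can be reproduced in isolation. This is only a minimal userland sketch, not the kernel code: every sketch_* name and SKETCH_* constant is hypothetical, the 8-bit family field is assumed to behave like uint8_t, 45 is simply the bound printed in the panic message, and EOPNOTSUPP is one possible error choice rather than a decision made in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_AF_MAX		45	/* bound printed in the panic message */
#define SKETCH_LINUX_AF_INET	2
#define SKETCH_LINUX_AF_PACKET	17	/* the value reported in comment 6 */
#define SKETCH_BSD_AF_INET	2

/* Hypothetical stand-in for linux_to_bsd_domain(): -1 means "no analog". */
static int
sketch_linux_to_bsd_domain(int linux_domain)
{
	switch (linux_domain) {
	case SKETCH_LINUX_AF_INET:
		return (SKETCH_BSD_AF_INET);
	default:
		return (-1);
	}
}

/* Unchecked path: the int result is narrowed into an 8-bit family field. */
static uint8_t
sketch_unchecked_family(int linux_domain)
{
	/* (uint8_t)-1 is 0xff, i.e. the family=255 seen in the backtrace. */
	return ((uint8_t)sketch_linux_to_bsd_domain(linux_domain));
}

/* Checked path: inspect the return value before narrowing it. */
static int
sketch_checked_family(int linux_domain, uint8_t *family)
{
	int domain;

	assert(family != NULL);
	domain = sketch_linux_to_bsd_domain(linux_domain);
	if (domain < 0 || domain >= SKETCH_AF_MAX)
		return (EOPNOTSUPP);	/* error choice is an assumption */
	*family = (uint8_t)domain;
	return (0);
}
```

Calling sketch_unchecked_family(SKETCH_LINUX_AF_PACKET) yields 255, while sketch_checked_family() returns an error for the same input and leaves the output untouched, which is the behaviour the question above is asking for at some layer.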
Comment 8 Gleb Smirnoff freebsd_committer freebsd_triage 2024-03-15 18:30:06 UTC
Created attachment 249197
check if it is just a typo

Can you please try out this patch, after reverting any previous changes?
Comment 9 Gleb Smirnoff freebsd_committer freebsd_triage 2024-03-15 18:42:37 UTC
See also https://reviews.freebsd.org/D44375
Comment 10 Edward Tomasz Napierala freebsd_committer freebsd_triage 2024-03-16 14:18:05 UTC
Negative; the s/254/255/ patch doesn't seem to fix the panic I'm having.
Comment 11 Gleb Smirnoff freebsd_committer freebsd_triage 2024-03-16 16:22:21 UTC
Can you please apply both

https://reviews.freebsd.org/D44375
https://reviews.freebsd.org/D44392

and check if that fixes the panic?
Comment 12 Edward Tomasz Napierala freebsd_committer freebsd_triage 2024-03-18 16:56:14 UTC
Sorry, no joy - with those two applied it still panics like before, with similar backtrace.
Comment 13 Gleb Smirnoff freebsd_committer freebsd_triage 2024-03-18 17:21:20 UTC
On Mon Mar 18 16:56:14  2024 UTC, trasz@FreeBSD.org wrote:
> Sorry, no joy - with those two applied it still panics like before, with similar
> backtrace.

I have just updated both revisions and corrected a mistake. Can you please try
again?
Comment 14 Edward Tomasz Napierala freebsd_committer freebsd_triage 2024-03-19 12:02:52 UTC
Bingo!  Those two fix my panic.  Thank you :)
Comment 15 commit-hook freebsd_committer freebsd_triage 2024-03-29 20:37:06 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=b977dd1ea5fbc2df3f1279330be4d089322eb2cf

commit b977dd1ea5fbc2df3f1279330be4d089322eb2cf
Author:     Gleb Smirnoff <glebius@FreeBSD.org>
AuthorDate: 2024-03-29 20:35:51 +0000
Commit:     Gleb Smirnoff <glebius@FreeBSD.org>
CommitDate: 2024-03-29 20:35:51 +0000

    linux: make linux_netlink_p->msg_from_linux be able to fail

    The KPI for this function was misleading.  From the NetLink perspective it
    looked like a function that: a) allocates new hdr, b) can fail.  Neither
    was true.  Let the function return a error code instead of returning the
    same hdr it was passed to.  In case if future Linux NetLink compatibility
    support calls for reallocating header, pass hdr as pointer to pointer.

    With KPI that returns a error, propagate domain conversion errors all the
    way up to NetLink module.  This fixes panic when unknown domain is
    converted to 0xff and this invalid value is passed into NetLink
    processing.

    PR:                     274536
    Reviewed by:            melifaro
    Differential Revision:  https://reviews.freebsd.org/D44392

 sys/compat/linux/linux_netlink.c | 58 ++++++++++++++++++++++++++--------------
 sys/netlink/netlink_io.c         | 22 +++++++--------
 sys/netlink/netlink_linux.h      |  2 +-
 3 files changed, 48 insertions(+), 34 deletions(-)
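
The KPI change described in this commit message can be illustrated roughly as follows. This is a userland sketch under stated assumptions, not the committed code: struct sketch_hdr, the sketch_* functions, SKETCH_AF_UNKNOWN and the choice of EOPNOTSUPP are all hypothetical. It shows only the shape of the idea: the translator returns an error code, takes the header as a pointer to pointer, and the caller stops processing when translation fails.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_AF_UNKNOWN	0xff	/* hypothetical "conversion failed" marker */

/* Hypothetical, heavily reduced message header. */
struct sketch_hdr {
	uint8_t	family;
};

/* Hypothetical converter: only Linux AF_INET (2) has an analog here. */
static uint8_t
sketch_convert_family(uint8_t linux_family)
{
	return (linux_family == 2 ? 2 : SKETCH_AF_UNKNOWN);
}

/*
 * Translator that can fail.  The header is passed as pointer to pointer so
 * that a later version could substitute a reallocated header; this sketch
 * only rewrites the family in place.
 */
static int
sketch_msg_from_linux(struct sketch_hdr **hdr)
{
	uint8_t family;

	assert(hdr != NULL && *hdr != NULL);
	family = sketch_convert_family((*hdr)->family);
	if (family == SKETCH_AF_UNKNOWN)
		return (EOPNOTSUPP);	/* error choice is an assumption */
	(*hdr)->family = family;
	return (0);
}

/* Caller: propagate the error instead of handling an invalid family. */
static int
sketch_receive(struct sketch_hdr *hdr, int *handled)
{
	int error;

	error = sketch_msg_from_linux(&hdr);
	if (error != 0)
		return (error);
	(*handled)++;		/* stands in for real message processing */
	return (0);
}
```

With a header carrying family 17, sketch_receive() returns the error and the handled counter stays at zero, so the invalid value never reaches the code that would index a table by family.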
Comment 16 commit-hook freebsd_committer freebsd_triage 2024-03-29 20:37:11 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=9d4a08d162d87ba120f418a1a71facd2c631b549

commit 9d4a08d162d87ba120f418a1a71facd2c631b549
Author:     Gleb Smirnoff <glebius@FreeBSD.org>
AuthorDate: 2024-03-29 20:35:37 +0000
Commit:     Gleb Smirnoff <glebius@FreeBSD.org>
CommitDate: 2024-03-29 20:35:37 +0000

    linux: use sa_family_t for address family conversions

    Express "conversion failed" with maximum possible value.  This allows to
    reduce number of size/signedness conversion in the code that utilizes the
    functions.

    PR:                     274536
    Reviewed by:            melifaro
    Differential Revision:  https://reviews.freebsd.org/D44375

 sys/compat/linux/linux.c        | 18 +++++++++---------
 sys/compat/linux/linux_common.h |  5 +++--
 sys/compat/linux/linux_socket.c |  9 +++++----
 3 files changed, 17 insertions(+), 15 deletions(-)
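
A rough illustration of the "maximum possible value" convention from this commit message, again as a sketch only: sketch_family_t stands in for an 8-bit sa_family_t, and SKETCH_AF_UNKNOWN, the SKETCH_* constants and both functions are hypothetical names rather than the identifiers used in the tree. The numeric family values are the usual Linux and FreeBSD ones for AF_INET and AF_INET6.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t sketch_family_t;	/* stand-in for an 8-bit sa_family_t */
#define SKETCH_AF_UNKNOWN	UINT8_MAX	/* "conversion failed" */

#define SKETCH_LINUX_AF_INET	2
#define SKETCH_LINUX_AF_INET6	10
#define SKETCH_BSD_AF_INET	2
#define SKETCH_BSD_AF_INET6	28

/* The sentinel must not collide with a family that is actually mapped. */
static_assert(SKETCH_AF_UNKNOWN > SKETCH_BSD_AF_INET6,
    "sentinel collides with a real family");

/* Input, output and failure marker all share one unsigned 8-bit type. */
static sketch_family_t
sketch_linux_to_bsd_family(sketch_family_t linux_family)
{
	switch (linux_family) {
	case SKETCH_LINUX_AF_INET:
		return (SKETCH_BSD_AF_INET);
	case SKETCH_LINUX_AF_INET6:
		return (SKETCH_BSD_AF_INET6);
	default:
		return (SKETCH_AF_UNKNOWN);
	}
}

static sketch_family_t
sketch_bsd_to_linux_family(sketch_family_t bsd_family)
{
	switch (bsd_family) {
	case SKETCH_BSD_AF_INET:
		return (SKETCH_LINUX_AF_INET);
	case SKETCH_BSD_AF_INET6:
		return (SKETCH_LINUX_AF_INET6);
	default:
		return (SKETCH_AF_UNKNOWN);
	}
}
```

Because the failure marker has the same type as a valid family, a caller can compare the result against SKETCH_AF_UNKNOWN directly, with no int-to-8-bit narrowing in between where a -1 could silently turn into 255.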