Bug 272169 - net/realtek-re-kmod: panic when kldloaded outside of loader
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: amd64 Any
Importance: --- Affects Many People
Assignee: Alex Dupre
Reported: 2023-06-23 20:01 UTC by Oleg
Modified: 2023-07-03 06:59 UTC (History)
CC List: 2 users

Flags: bugzilla: maintainer-feedback? (ale)


Description Oleg 2023-06-23 20:01:30 UTC
If I have these lines in my /boot/loader.conf:
if_re_load="YES"
if_re_name="/boot/modules/if_re.ko"
then everything works and I never see a crash. However, if I comment out those lines, reboot, and run "kldload /boot/modules/if_re.ko", the computer crashes immediately. It neither writes a crash dump to my swap partition nor reboots; it simply freezes with some crash information on the screen. Part of that output is visible in this photo: https://ibb.co/xMRJnyZ .
Comment 1 Alex Dupre freebsd_committer freebsd_triage 2023-06-26 13:15:05 UTC
Did you build a custom kernel without the bundled realtek re driver? Otherwise I think you should expect weird behavior if you load a kernel driver for a device that is already managed by the kernel.
Comment 2 Oleg 2023-06-26 13:28:42 UTC
The Realtek driver that is already part of the kernel does not support my 2.5G Ethernet interface. Only realtek-re-kmod works for me. As I said, I have no issues if loader.conf loads it. Trying to kldload it once the system is running causes a crash.
Comment 3 Alex Dupre freebsd_committer freebsd_triage 2023-06-26 13:36:50 UTC
That's clear, but they are both the same re kernel module, even if the bundled one doesn't actually support your card. It doesn't make sense to load a kernel module that cannot be used. To do that, you'll have to build a custom kernel without the re driver. At that point it shouldn't crash. If it does after having built a custom kernel then we have an issue.
Comment 4 Oleg 2023-06-26 13:43:45 UTC
So, I should build a custom kernel without the built-in re driver and then try to kldload realtek-re-kmod again? Okay, once I am home, I'll do it.
Comment 5 Alex Dupre freebsd_committer freebsd_triage 2023-06-26 13:46:45 UTC
Right, let me know the result.
Comment 6 Oleg 2023-06-26 18:59:31 UTC
(In reply to Alex Dupre from comment #5)

I still experience a crash when kldloading realtek-re-kmod, even with the built-in re driver removed from the kernel. As far as I remember, this problem didn't exist in previous versions of realtek-re-kmod.
Comment 7 Alex Dupre freebsd_committer freebsd_triage 2023-06-28 07:46:35 UTC
@kib: do you have any idea what could be the issue?
Comment 8 Oleg 2023-06-28 10:55:48 UTC
(In reply to Alex Dupre from comment #7)

I don't have a serial console, so I can't get more information about the crash. The only additional piece of information that I am able to see when kldloading it on the efi framebuffer instead of the i915kms framebuffer is "panic: page fault". Otherwise, almost all other information looks the same as in the photo I provided earlier.
Comment 9 Oleg 2023-06-30 20:41:30 UTC
On 13.2-STABLE, kldloading the latest version of realtek-re-kmod doesn't produce any issues. But there will always be a crash on 14-CURRENT. (It doesn't matter if the kernel is GENERIC or GENERIC-NODEBUG). So, I was a little bit incorrect earlier when I implied that the earlier versions of realtek-re-kmod didn't have problems, but the latest one does. In reality, I could kldload realtek-re-kmod under earlier versions of 14-CURRENT without encountering a problem, but not under the latest versions of 14-CURRENT. It's a good thing the loader.conf method of loading it still works on 14-CURRENT.
Comment 10 Konstantin Belousov freebsd_committer freebsd_triage 2023-06-30 22:45:41 UTC
Without proper information from the crash, I do not see how I could proceed.
Comment 11 Oleg 2023-06-30 23:42:49 UTC
(In reply to Konstantin Belousov from comment #10)

Usually, when a crash occurs, the information about it gets dumped onto my swap partition, my computer reboots, and then I am able to access this information in /var/crash. But in this case, since the computer simply freezes while displaying some vague information about the crash, I guess I would need access to a serial console, which I don't have. But maybe you guys know someone who has access to a serial console, runs the latest 14-CURRENT, and has the need to use realtek-re-kmod. That someone will be easily able to reproduce this bug and give you proper information about the crash.
Comment 12 Oleg 2023-07-01 12:05:17 UTC
I changed the value of the sysctl variable debug.debugger_on_panic from 0 to 1. After the if_re-related crash occurred, the db prompt appeared, I typed "dump", and the information about the crash was dumped. Since typing "reboot" didn't do anything, I had to use the power button to shut down and start the computer again. I didn't do this whole procedure earlier because I don't know much about FreeBSD. Here is the crash information; hopefully it will help you find the source of the problem:

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:59
59		__asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:59
#1  doadump (textdump=textdump@entry=0)
    at /usr/src/sys/kern/kern_shutdown.c:407
#2  0xffffffff805a3e66 in db_dump (dummy=<optimized out>, 
    dummy2=<optimized out>, dummy3=<optimized out>, dummy4=<optimized out>)
    at /usr/src/sys/ddb/db_command.c:593
#3  0xffffffff805a3b85 in db_command (last_cmdp=<optimized out>, 
    cmd_table=<optimized out>, dopager=true)
    at /usr/src/sys/ddb/db_command.c:506
#4  0xffffffff805a367d in db_command_loop ()
    at /usr/src/sys/ddb/db_command.c:553
#5  0xffffffff805a8da9 in db_trap (type=<optimized out>, code=<optimized out>)
    at /usr/src/sys/ddb/db_main.c:270
#6  0xffffffff8122ab3e in kdb_trap (type=3, code=code@entry=0, 
    tf=tf@entry=0xfffffe0307d745f0) at /usr/src/sys/kern/subr_kdb.c:784
#7  0xffffffff81a767ab in trap (frame=0xfffffe0307d745f0)
    at /usr/src/sys/amd64/amd64/trap.c:610
#8  <signal handler called>
#9  kdb_enter (why=<optimized out>, msg=<optimized out>)
    at /usr/src/sys/kern/subr_kdb.c:550
#10 0xffffffff81182728 in vpanic (
    fmt=fmt@entry=0xffffffff82432220 <str> "%s", 
    ap=ap@entry=0xfffffe0307d747e0) at /usr/src/sys/kern/kern_shutdown.c:960
#11 0xffffffff81182425 in panic (fmt=0xffffffff82432220 <str> "%s")
    at /usr/src/sys/kern/kern_shutdown.c:896
#12 0xffffffff81a76eb8 in trap_fatal (frame=0xfffffe0307d74c50, eva=0)
    at /usr/src/sys/amd64/amd64/trap.c:954
#13 0xffffffff81a76feb in trap_pfault (frame=frame@entry=0xfffffe0307d74c50, 
    usermode=false, signo=signo@entry=0x0, ucode=ucode@entry=0x0)
    at /usr/src/sys/amd64/amd64/trap.c:765
#14 0xffffffff81a75f33 in trap (frame=0xfffffe0307d74c50)
    at /usr/src/sys/amd64/amd64/trap.c:444
#15 <signal handler called>
#16 0x0000000000000000 in ?? ()
#17 0xffffffff813cb240 in ifmedia_ioctl (ifp=0xfffffe01f5b36000, 
    ifr=0xfffffe0307d74f10, ifm=0xfffffe01fa06a040, cmd=<optimized out>)
    at /usr/src/sys/net/if_media.c:295
#18 0xffffffff854b8afa in re_ioctl () from /boot/modules/if_re.ko
#19 0xffffffff815e99e2 in get_operstate_ether (ifp=0xfffffe01f5b36000, 
    pstate=<optimized out>) at /usr/src/sys/netlink/route/iface.c:127
#20 get_operstate (ifp=0xfffffe01f5b36000, pstate=<optimized out>)
    at /usr/src/sys/netlink/route/iface.c:184
#21 dump_iface (nw=nw@entry=0xfffffe0307d75020, 
    ifp=ifp@entry=0xfffffe01f5b36000, hdr=hdr@entry=0xfffffe0307d75000, 
    if_flags_mask=if_flags_mask@entry=0)
    at /usr/src/sys/netlink/route/iface.c:312
#22 0xffffffff815e918d in rtnl_handle_ifevent (ifp=0xfffffe01f5b36000, 
    nlmsg_type=16, if_flags_mask=0) at /usr/src/sys/netlink/route/iface.c:1409
#23 0xffffffff813ae006 in if_attach_internal (
    ifp=ifp@entry=0xfffffe01f5b36000, vmove=<optimized out>)
    at /usr/src/sys/net/if.c:958
#24 0xffffffff813ad8c9 in if_attach (ifp=ifp@entry=0xfffffe01f5b36000)
    at /usr/src/sys/net/if.c:773
#25 0xffffffff813c1c6e in ether_ifattach (ifp=0xfffffe01f5b36000, 
    lla=0xfffffe0307d752a2 "\004B\032", <incomplete sequence \345\036>)
    at /usr/src/sys/net/if_ethersubr.c:1002
#26 0xffffffff854b301e in re_attach () from /boot/modules/if_re.ko
#27 0xffffffff8120b854 in DEVICE_ATTACH (dev=0xfffffe01e6077d00)
    at ./device_if.h:195
#28 device_attach (dev=dev@entry=0xfffffe01e6077d00)
    at /usr/src/sys/kern/subr_bus.c:2537
#29 0xffffffff8120b133 in device_probe_and_attach (
    dev=dev@entry=0xfffffe01e6077d00) at /usr/src/sys/kern/subr_bus.c:2494
#30 0xffffffff80be4ce3 in pci_driver_added (dev=0xfffffe01e6077e00, 
    driver=<optimized out>) at /usr/src/sys/dev/pci/pci.c:4732
#31 0xffffffff81206d94 in BUS_DRIVER_ADDED (_dev=0xfffffe01e6077e00, 
    _driver=0xffffffff85531510 <re_driver>) at ./bus_if.h:210
#32 devclass_driver_added (dc=0xfffffe01a07a0500, 
    driver=driver@entry=0xffffffff85531510 <re_driver>)
    at /usr/src/sys/kern/subr_bus.c:605
#33 0xffffffff81206c51 in devclass_add_driver (dc=0xfffffe01a07a0500, 
    driver=0xffffffff85531510 <re_driver>, pass=<optimized out>, dcp=0x0)
    at /usr/src/sys/kern/subr_bus.c:692
#34 0xffffffff8114289d in module_register_init (
    arg=0xffffffff855314c8 <if_re_pci_mod>)
    at /usr/src/sys/kern/kern_module.c:123
#35 0xffffffff81125b1d in linker_file_sysinit (lf=<optimized out>)
    at /usr/src/sys/kern/kern_linker.c:243
#36 linker_load_file (filename=<optimized out>, result=<optimized out>)
    at /usr/src/sys/kern/kern_linker.c:476
#37 linker_load_module (
    kldname=kldname@entry=0xfffffe01f9dc5400 "/boot/modules/if_re.ko", 
    modname=0x0, parent=parent@entry=0x0, verinfo=verinfo@entry=0x0, 
    lfpp=lfpp@entry=0xfffffe0307d75c40)
    at /usr/src/sys/kern/kern_linker.c:2205
#38 0xffffffff81128b23 in kern_kldload (td=<optimized out>, 
    file=file@entry=0xfffffe01f9dc5400 "/boot/modules/if_re.ko", 
    fileid=fileid@entry=0xfffffe0307d75ce0)
    at /usr/src/sys/kern/kern_linker.c:1164
#39 0xffffffff81128d52 in sys_kldload (td=0xfffffe01f5b36000, 
    uap=<optimized out>) at /usr/src/sys/kern/kern_linker.c:1187
#40 0xffffffff81a78031 in syscallenter (td=<optimized out>)
    at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:190
#41 amd64_syscall (td=0xfffffe03070ce000, traced=0)
    at /usr/src/sys/amd64/amd64/trap.c:1199
#42 <signal handler called>
#43 0x00002b0bbea8d18a in ?? ()
Backtrace stopped: Cannot access memory at address 0x2b0bbd499528
(kgdb)
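
Frame #16 in the backtrace above sits at address 0x0000000000000000 and was reached from ifmedia_ioctl() at if_media.c:295, which is consistent with a call through a function pointer that was still NULL. The userland C sketch below illustrates that failure mode and a defensive check. It is not kernel code: the types and functions (fake_ifnet, fake_mediareq, fake_ifmedia, demo_status, query_status) are hypothetical stand-ins invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's interface and media types. */
struct fake_ifnet { int link_up; };
struct fake_mediareq { int status; };

typedef void (*status_cb)(struct fake_ifnet *, struct fake_mediareq *);

struct fake_ifmedia {
    status_cb ifm_status;   /* stays NULL until a driver registers it */
};

/* Example callback a driver might register (assumption for this sketch). */
void demo_status(struct fake_ifnet *ifp, struct fake_mediareq *req)
{
    req->status = ifp->link_up ? 1 : 0;
}

/*
 * Returns 0 on success, -1 if no callback has been registered yet.
 * Without the NULL check, (*ifm->ifm_status)(...) would jump to
 * address 0, which is what frame #16 looks like.
 */
int query_status(struct fake_ifmedia *ifm, struct fake_ifnet *ifp,
    struct fake_mediareq *req)
{
    assert(ifm != NULL && ifp != NULL && req != NULL);
    if (ifm->ifm_status == NULL)
        return -1;
    (*ifm->ifm_status)(ifp, req);
    return 0;
}
```

A check like this only turns the fault into an error return; it does not explain why the callback was unset at the time of the call.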
Comment 13 Oleg 2023-07-01 13:23:35 UTC
I commented out line 295 in /usr/src/sys/net/if_media.c: "(*ifm->ifm_status)(ifp, ifmr);" because it was mentioned in the crash report and now I can kldload /boot/modules/if_re.ko without experiencing any crashes. I have no idea what that line does and if I introduced any bugs by commenting it out, but, as I said, after I did it, if_re.ko stopped crashing the system.
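
One possible reading of why removing that call hides the panic is an ordering problem: in the backtrace, ether_ifattach() (called from re_attach()) leads through the netlink frames straight back into re_ioctl() and ifmedia_ioctl(), so the media status callback may be queried before the driver has registered it. The thread does not confirm this, so treat the C sketch below as an assumption. It models the two orderings in userland; demo_if, demo_media_init, demo_attach and the other names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct demo_if;
typedef int (*demo_status_fn)(const struct demo_if *);

/* Hypothetical, heavily simplified interface object. */
struct demo_if {
    demo_status_fn status;  /* media status callback, NULL until registered */
    int attached;
    int last_query;         /* result of the query issued during attach */
};

int demo_link_status(const struct demo_if *ifp) { (void)ifp; return 1; }

/* Stand-in for registering the media callbacks. */
void demo_media_init(struct demo_if *ifp) { ifp->status = demo_link_status; }

/*
 * Stand-in for attaching the interface. Attaching makes the interface
 * visible, and a listener immediately asks for its state. A kernel
 * would fault on a NULL callback; this sketch records -1 instead.
 */
void demo_attach(struct demo_if *ifp)
{
    assert(ifp != NULL);
    ifp->attached = 1;
    ifp->last_query = (ifp->status != NULL) ? ifp->status(ifp) : -1;
}

/* Hypothetical problematic order: attach first, callbacks afterwards. */
int demo_attach_then_init(void)
{
    struct demo_if ifp = { NULL, 0, 0 };
    demo_attach(&ifp);
    demo_media_init(&ifp);
    return ifp.last_query;
}

/* Safer order: callbacks are in place before the interface is visible. */
int demo_init_then_attach(void)
{
    struct demo_if ifp = { NULL, 0, 0 };
    demo_media_init(&ifp);
    demo_attach(&ifp);
    return ifp.last_query;
}
```

In this model demo_attach_then_init() sees the missing callback during attach, while demo_init_then_attach() gets a valid status. Commenting out the call in if_media.c would avoid the fault in either order, but it would also disable media status reporting for every driver that goes through ifmedia_ioctl().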
Comment 14 Konstantin Belousov freebsd_committer freebsd_triage 2023-07-01 20:48:20 UTC
Do you use the binary package for the module?  As the first thing to try, ensure
that your kernel is built from the exact sources you installed, and then rebuild
the driver module against the sources and kernel config:
 $ make SYSDIR=/usr/src/sys KERNBUILDDIR=/usr/obj/usr/src/amd64.amd64/sys/GENERIC
<adjust for your config>
Comment 15 Oleg 2023-07-01 21:48:15 UTC
Yes, earlier, I built realtek-re-kmod against kernel sources. But I just did what you told me to do, to make sure I took all the right steps: "pkg delete realtek-re-kmod", "cd /usr/src", "git restore *", "git pull", "make kernel", "cd /usr/ports/net/realtek-re-kmod", and then the command you told me to type. Then I typed "make install clean". I rebooted, typed "kldload /boot/modules/if_re.ko", and the computer crashed. As previously mentioned, on 13.2-STABLE, when I built realtek-re-kmod against 13.2-STABLE sources, kldloading realtek-re-kmod didn't crash the system. But on 14-CURRENT, the system crashes after I attempt to kldload it. The loader.conf method of loading it doesn't cause any issues for me on 14-CURRENT. I showed you parts of my /var/crash/core.txt.0. Is there nothing there that tells you what caused the crash?
Comment 16 Konstantin Belousov freebsd_committer freebsd_triage 2023-07-01 22:42:57 UTC
(In reply to Oleg from comment #15)
Ok, this seems right. So it is probably not an ABI problem.

Please crash your machine again, take the vmcore.  Then, in kgdb,
from the frame for ifmedia_ioctl() (the last valid frame before the
trap frame) do the following:
(kgdb) info locals
(kgdb) p *ifm
Comment 17 Konstantin Belousov freebsd_committer freebsd_triage 2023-07-01 22:58:56 UTC
Ok, I suspect I understand what the issue is.  Please try the patch from
https://github.com/alexdupre/rtl_bsd_drv/pull/3
Comment 18 Oleg 2023-07-01 23:25:31 UTC
Yes, that patch fixed the issue. Thank you very much!
Comment 19 commit-hook freebsd_committer freebsd_triage 2023-07-03 06:58:39 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=17214a37d72b7f35998426d9990591971a5082c4

commit 17214a37d72b7f35998426d9990591971a5082c4
Author:     Alex Dupre <ale@FreeBSD.org>
AuthorDate: 2023-07-03 06:55:25 +0000
Commit:     Alex Dupre <ale@FreeBSD.org>
CommitDate: 2023-07-03 06:57:57 +0000

    net/realtek-re-kmod: fix panic when kldloaded outside of loader

    PR:             272169
    Reported by:    Oleg <oleglelchuk@gmail.com>
    Submitted by:   kib

 net/realtek-re-kmod/Makefile | 4 ++--
 net/realtek-re-kmod/distinfo | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)
Comment 20 Alex Dupre freebsd_committer freebsd_triage 2023-07-03 06:59:05 UTC
Patch committed, thanks!