https://lists.freebsd.org/archives/freebsd-current/2025-May/007756.html
For recent CURRENT, probably drm-66-kmod or at least drm-61-kmod should be used nowadays.
Created attachment 260809
dmesg.boot showing the radeonkms loading info
(In reply to Marek Zarychta from comment #1) Both drm-61 and drm-66 end up in an endless reboot loop upon loading radeonkms.ko. That is, boot -> load radeonkms.ko -> reboot. drm-515 and CURRENT circa February 2025 work fine. Something has changed in the past few weeks. When scanning https://lists.freebsd.org/archives/dev-commits-src-main/ nothing jumps out as a problematic commit.
% pciconf -vl
...
vgapci0@pci0:1:0:0: class=0x030000 rev=0x00 hdr=0x00 vendor=0x1002 device=0x6779 subvendor=0x1092 subdevice=0x6450
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Caicos [Radeon HD 6450/7450/8450 / R5 230 OEM]'
    class      = display
    subclass   = VGA
Looking more closely at dmesg.boot, one finds the initial reporting for loading radeonkms.ko:

[drm] radeon kernel modesetting enabled.
drmn0: <drmn> on vgapci0
vgapci0: child drmn0 requested pci_enable_io
vgapci0: child drmn0 requested pci_enable_io
sysctl_add_oid: can't re-use a leaf (hw.dri.debug)!
[drm] initializing kernel modesetting (CAICOS 0x1002:0x6779 0x1092:0x6450 0x00).

Hmmm, that looks suspicious, and now looking at the initial part of core.txt.1:

panic: pfs_add_node(): homonymous siblings

Reading symbols from /usr/lib/debug//boot/kernel/cpuctl.ko.debug...
__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
57 __asm("movq %%gs:%c1,%0" : "=r" (td)
(kgdb) #0 __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57

Learned a new word today.

homonymous /hō-mŏn′ə-məs, hə-/ adjective
Having the same name.
That's a bit odd... I have almost the same graphics card, and it runs perfectly on CURRENT, including suspend/resume support. To prevent problems, I always build kmods after installkernel. Probably the problem is somewhere between the CPU and the graphics, since I am using this card with an Intel processor.

vgapci0@pci0:1:0:0: class=0x030000 rev=0x00 hdr=0x00 vendor=0x1002 device=0x6778 subvendor=0x1028 subdevice=0x2120
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Caicos XT [Radeon HD 7470/8470 / R5 235/310 OEM]'
    class      = display
    subclass   = VGA
(In reply to Marek Zarychta from comment #6) The full updating dance for me:

% cd /usr/ports
% git pull -ff
% cd ../src
% git pull -ff
% make -j7 buildworld
% make -j7 buildkernel
% make installkernel
<reboot to single user>
# mount -a
# etcupdate -p
# make installworld
# etcupdate -B
# make delete-old
# vi /etc/rc.conf (comment out kld_list)
# sync
# reboot
% pkg delete -f drm-515-kmod
% pkg delete -f gpu-firmware\*
% cd /usr/ports/graphics/drm-515-kmod
% make -j7 && make install && make clean
% cd ../gpu-firmware-radeon-kmod
% make -j7 && make install && make clean
% vi /etc/rc.conf (restore kld_list)
% shutdown -r now

Log in, run startx, watch the system panic. The panic occurs with both a custom kernel and GENERIC. In an odd twist of fate, the kernel crash dump that I have is from the first time the system panicked. With all other panics, the system simply hangs with a black screen, and hitting the reset button is required.
(In reply to Steve Kargl from comment #7) I am sorry, I am not one to judge whether the upgrade procedure is 100% or 101% correct. I am trying to help here since the graphics are similar. Driven by curiosity, I have just upgraded to the most recent CURRENT, replaced drm-61-kmod with drm-66-kmod, rebooted using the UEFI and then the BIOS method, and everything seems to work fine. TBH, my upgrade procedure is simplified: installkernel and installworld (over NFS), etcupdate, pkg upgrade, portupgrade drm-66-kmod and then reboot. Please let me paste excerpts from the dmesg below - there are a few errors, they are similar but not fatal.

What I have noticed: when booting from UEFI, the screen is blank for a short period of time (still in text mode); when booting with the BIOS method, only the resolution changes, but the screen doesn't go blank during boot.

From dmesg(8), this time booted from legacy BIOS:

[drm] radeon kernel modesetting enabled.
drmn0: <drmn> on vgapci0
vgapci0: child drmn0 requested pci_enable_io
vgapci0: child drmn0 requested pci_enable_io
sysctl_add_oid: can't re-use a leaf (hw.dri.debug)!
[drm] initializing kernel modesetting (CAICOS 0x1002:0x6778 0x1028:0x2120 0x00).
[drm ERROR :radeon_atombios_init] Unable to find PCI I/O BAR; using MMIO for ATOM IIO
ATOM BIOS: C26411
drmn0: VRAM: 1024M 0x0000000000000000 - 0x000000003FFFFFFF (1024M used)
drmn0: GTT: 1024M 0x0000000040000000 - 0x000000007FFFFFFF
[drm] Detected VRAM RAM=1024M, BAR=256M
[drm] RAM width 64bits DDR
[drm] radeon: 1024M of VRAM memory ready
[drm] radeon: 1024M of GTT memory ready.
[drm] Loading CAICOS Microcode
radeon/CAICOS_pfp.bin: could not load binary firmware /boot/firmware/radeon/CAICOS_pfp.bin either
CAICOS_pfp.bin: could not load binary firmware /boot/firmware/CAICOS_pfp.bin either
radeon_CAICOS_pfp.bin: could not load binary firmware /boot/firmware/radeon_CAICOS_pfp.bin either
drmn0: successfully loaded firmware image 'radeon/CAICOS_pfp.bin'
radeon/CAICOS_me.bin: could not load binary firmware /boot/firmware/radeon/CAICOS_me.bin either
CAICOS_me.bin: could not load binary firmware /boot/firmware/CAICOS_me.bin either
radeon_CAICOS_me.bin: could not load binary firmware /boot/firmware/radeon_CAICOS_me.bin either
drmn0: successfully loaded firmware image 'radeon/CAICOS_me.bin'
radeon/BTC_rlc.bin: could not load binary firmware /boot/firmware/radeon/BTC_rlc.bin either
BTC_rlc.bin: could not load binary firmware /boot/firmware/BTC_rlc.bin either
radeon_BTC_rlc.bin: could not load binary firmware /boot/firmware/radeon_BTC_rlc.bin either
drmn0: successfully loaded firmware image 'radeon/BTC_rlc.bin'
radeon/CAICOS_mc.bin: could not load binary firmware /boot/firmware/radeon/CAICOS_mc.bin either
CAICOS_mc.bin: could not load binary firmware /boot/firmware/CAICOS_mc.bin either
radeon_CAICOS_mc.bin: could not load binary firmware /boot/firmware/radeon_CAICOS_mc.bin either
drmn0: successfully loaded firmware image 'radeon/CAICOS_mc.bin'
radeon/CAICOS_smc.bin: could not load binary firmware /boot/firmware/radeon/CAICOS_smc.bin either
CAICOS_smc.bin: could not load binary firmware /boot/firmware/CAICOS_smc.bin either
radeon_CAICOS_smc.bin: could not load binary firmware /boot/firmware/radeon_CAICOS_smc.bin either
drmn0: successfully loaded firmware image 'radeon/CAICOS_smc.bin'
[drm] Internal thermal controller without fan control
[drm] radeon: dpm initialized
radeon/SUMO_uvd.bin: could not load binary firmware /boot/firmware/radeon/SUMO_uvd.bin either
SUMO_uvd.bin: could not load binary firmware /boot/firmware/SUMO_uvd.bin either
radeon_SUMO_uvd.bin: could not load binary firmware /boot/firmware/radeon_SUMO_uvd.bin either
drmn0: successfully loaded firmware image 'radeon/SUMO_uvd.bin'
[drm] GART: num cpu pages 262144, num gpu pages 262144
[drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
[drm] PCIE GART of 1024M enabled (table at 0x0000000000162000).
drmn0: WB enabled
drmn0: fence driver on ring 0 use gpu addr 0x0000000040000c00
drmn0: fence driver on ring 3 use gpu addr 0x0000000040000c0c
drmn0: fence driver on ring 5 use gpu addr 0x0000000000072118
drmn0: radeon: MSI limited to 32-bit
drmn0: radeon: using MSI.
[drm] radeon: irq initialized.
[drm] ring test on 0 succeeded in 3 usecs
[drm] ring test on 3 succeeded in 8 usecs
[drm] ring test on 5 succeeded in 2 usecs
[drm] UVD initialized successfully.
[drm] ib test on ring 0 succeeded in 0 usecs
[drm] ib test on ring 3 succeeded in 0 usecs
[drm] ib test on ring 5 succeeded
lkpi_iicbb0: <LinuxKPI I2CBB> on drmn0
(...)
[drm] Radeon Display Connectors
[drm] Connector 0:
[drm] DP-1
[drm] HPD2
[drm] DDC: 0x6460 0x6460 0x6464 0x6464 0x6468 0x6468 0x646c 0x646c
[drm] Encoders:
[drm] DFP1: INTERNAL_UNIPHY1
[drm] Connector 1:
[drm] DVI-I-1
[drm] HPD4
[drm] DDC: 0x6450 0x6450 0x6454 0x6454 0x6458 0x6458 0x645c 0x645c
[drm] Encoders:
[drm] DFP2: INTERNAL_UNIPHY
[drm] CRT1: INTERNAL_KLDSCP_DAC1
[drm] Initialized radeon 2.50.0 20080528 for drmn0 on minor 0
[drm] fb mappable at 0xE0363000
[drm] vram apper at 0xE0000000
[drm] size 7299072
[drm] fb depth is 24
[drm] pitch is 6912
VT: Replacing driver "vga" with new "drmfb".
start FB_INFO:
height=1024 width=1280 depth=32 pbase=0xe0363000 vbase=0xfffff800e0363000
name=drmn0 id=radeondrmfb flags=0x0 stride=6912
end FB_INFO
Taking a relatively giant step backwards, I have downgraded from a once-functional radeonkms.ko from drm-515-kmod to the vesa driver. A git bisection of both /usr/src and /usr/ports is likely to take a while. 1) I need to learn how to do the bisection and 2) I need to back up to mid-February with a guess at the git hash.
(In reply to Steve Kargl from comment #5) Do you have a backtrace in your core.txt.1 or can you make the file available?
(In reply to Bjoern A. Zeeb from comment #10) I have core.txt.1, vmcore.1, and info.1 as well as *.2 files. Unfortunately, I've built and installed dozens of kernels and have lost the kernel.debug files. I can upload the files to my home directory kargl@freefall.freebsd.org later today. I'll try adding the dump_stack() call you mentioned in another email to see if I can get additional information.
(In reply to Steve Kargl from comment #11) Bjoern, I have uploaded the *.0 files to the directory drm/ in my home directory on freefall (aka kargl/drm). Note, I added the dump_stack() call and it appears in core.txt.0, and updated to include your recent change to output the name. Finally, I saved a copy of /usr/lib/debug/boot, so I have the *.debug files but have not uploaded those, yet. Let me know if you need those.
(In reply to Steve Kargl from comment #12) So my assumption was correct? We are going twice through evergreen_startup().

...
drmn0: radeon: using MSI.
[drm] radeon: irq initialized.
#0 0xffffffff808bcdeb at linux_dump_stack+0x1b
#1 0xffffffff82a67adc at evergreen_startup+0x15ec
#2 0xffffffff82a67fb6 at evergreen_init+0x276
#3 0xffffffff82abdc35 at radeon_device_init+0x835
#4 0xffffffff82acebbe at radeon_driver_load_kms+0x19e
#5 0xffffffff82ba4147 at drm_dev_register+0x1c7
#6 0xffffffff82ac4cdc at radeon_pci_probe+0x15c
#7 0xffffffff808c5020 at linux_pci_attach_device+0x440
#8 0xffffffff806ac61a at device_attach+0x3fa
#9 0xffffffff806ae370 at bus_generic_driver_added+0x90
#10 0xffffffff806a9ba9 at devclass_driver_added+0x29
#11 0xffffffff806a9ac8 at devclass_add_driver+0x138
#12 0xffffffff808c5f51 at _linux_pci_register_driver+0xc1
#13 0xffffffff82ac4b4e at radeonkms_evh+0x3e
#14 0xffffffff80652be0 at module_register_init+0xb0
#15 0xffffffff80642e0b at linker_load_module+0xbeb
#16 0xffffffff80644b25 at kern_kldload+0x125
#17 0xffffffff80644bb9 at sys_kldload+0x59
[drm] ring test on 0 succeeded in 4 usecs
[drm] ring test on 3 succeeded in 6 usecs
[drm] ring test on 5 succeeded in 3 usecs
[drm] UVD initialized successfully.
[drm] ib test on ring 0 succeeded in 0 usecs
[drm] ib test on ring 3 succeeded in 0 usecs
[drm] ib test on ring 5 succeeded
...
...
[drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
[drm] PCIE GART of 1024M enabled (table at 0x0000000000162000).
drmn0: WB enabled
drmn0: fence driver on ring 0 use gpu addr 0x0000000040000c00
drmn0: fence driver on ring 3 use gpu addr 0x0000000040000c0c
drmn0: fence driver on ring 5 use gpu addr 0x0000000000072118
#0 0xffffffff808bcdeb at linux_dump_stack+0x1b
#1 0xffffffff82a67adc at evergreen_startup+0x15ec
#2 0xffffffff82a66333 at evergreen_resume+0x63
#3 0xffffffff82abeea0 at radeon_gpu_reset+0x290
#4 0xffffffff82ac9ea8 at radeon_gem_wait_idle_ioctl+0xb8
#5 0xffffffff82bb4f16 at drm_ioctl_kernel+0xc6
#6 0xffffffff82bb528d at drm_ioctl+0x29d
#7 0xffffffff808ba3b1 at linux_file_ioctl+0x301
#8 0xffffffff806e495e at kern_ioctl+0x1de
#9 0xffffffff806e471f at sys_ioctl+0x12f
#10 0xffffffff80a01e4e at amd64_syscall+0x13e
#11 0xffffffff809d4ccb at fast_syscall_common+0xf8
panic: pfs_add_node(): homonymous siblings: 'radeon_ring_gfx' type 5

I am adding dumbbell to Cc:

Someone needs to figure out where in all this the cleanup does not happen. I do not see any debugfs_remove*() calls in amdgpu; there are likely other code paths to get there (or possibly other KPI functions which would clean up the pfs).

The other question I cannot answer yet is where CONFIG_DEBUG_FS is turned on for the build so that you hit this code path in the first place. I do not know how the

./kconfig.mk: DEBUG_FS \

magic works. It came in with:

---
commit aec60ec819e1d8aa7961be2effef3f6b22741486
Author:     Jake Freeland <jfree@freebsd.org>
AuthorDate: Mon Oct 10 18:13:11 2022 -0500
Commit:     Emmanuel Vadot <manu@bidouilliste.com>
CommitDate: Tue Oct 11 09:43:23 2022 +0200

    Add support for CONFIG_DEBUG_FS build flag
---

Hmmm Ok. Entirely untested: could you try this patch?
diff --git radeon/Makefile radeon/Makefile
index f731eb961e..788e1fbb77 100644
--- radeon/Makefile
+++ radeon/Makefile
@@ -129,7 +129,7 @@ CFLAGS+= -I${SRCDIR:H}/amd/include
 CFLAGS+= '-DKBUILD_MODNAME="${KMOD}"'
 CFLAGS+= '-DLINUXKPI_PARAM_PREFIX=radeon_' -DDRM_SYSCTL_PARAM_PREFIX=_${KMOD}
 
-CFLAGS+= ${KCONFIG:C/(.*)/-DCONFIG_\1/}
+CFLAGS+= ${KCONFIG:NDEBUG_FS:C/(.*)/-DCONFIG_\1/}
 
 CFLAGS.gcc+= -Wno-redundant-decls -Wno-cast-qual -Wno-unused-but-set-variable \
 	-Wno-maybe-uninitialized
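To make the intent of that one-line change concrete: in bmake, `:NDEBUG_FS` keeps only the words that do not match the pattern `DEBUG_FS`, and `:C/(.*)/-DCONFIG_\1/` then rewrites each remaining word with a regular expression. The following is a plain-shell emulation of those two modifiers, not the real build logic; the sample `KCONFIG` value and the function name `kconfig_to_cflags` are made up for illustration, since the real list lives in kconfig.mk.

```shell
#!/bin/sh
# Hypothetical sample list; only DEBUG_FS is taken from the thread.
KCONFIG="EXAMPLE_A DEBUG_FS EXAMPLE_B"

# Emulates ${KCONFIG:NDEBUG_FS:C/(.*)/-DCONFIG_\1/} word by word.
kconfig_to_cflags() {
    out=""
    for word in $1; do
        # :NDEBUG_FS drops every word that matches DEBUG_FS.
        if [ "$word" = "DEBUG_FS" ]; then
            continue
        fi
        # :C/(.*)/-DCONFIG_\1/ prefixes each surviving word.
        out="${out:+$out }-DCONFIG_$word"
    done
    printf '%s\n' "$out"
}

CFLAGS_EXTRA=$(kconfig_to_cflags "$KCONFIG")
echo "$CFLAGS_EXTRA"
```

The effect is that radeon would be compiled without `-DCONFIG_DEBUG_FS`, so the debugfs registration path that ends in pfs_add_node() would not be built in at all. That sidesteps the panic but says nothing about why evergreen_startup() runs a second time.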
bz, thanks for looking into the issue. I've managed to back up to "git checkout 'main@{2025-03-15 12:00:00}'", which has hash d3c4b002d. After rebuilding and reinstalling world/kernel, and rebuilding gpu-firmware and drm-515-kmod, I can successfully load radeonkms.ko and run startx. The desktop I expect comes up. So, whatever is causing the issue appears in src/ after the above date. I'll move forward to 2025-04-15. It takes 3-4 hours to rebuild everything.
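A side note on picking a revision by date: the `main@{<date>}` syntax is answered from the local reflog, that is, where this clone's main pointed at that time, so it depends on when the clone was last pulled. Selecting by commit date with `git rev-list --before` does not depend on the reflog. A minimal sketch on a throwaway repository follows; the three dated commits and the file name `stamp` are invented for the demonstration.

```shell
#!/bin/sh
# Toy demonstration: pick the newest commit at or before a given date.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

# Three commits with fixed author and committer dates.
for d in 2025-03-01 2025-04-01 2025-05-01; do
    echo "$d" > stamp
    git add stamp
    GIT_AUTHOR_DATE="$d 12:00:00 +0000" \
    GIT_COMMITTER_DATE="$d 12:00:00 +0000" \
        git commit -q -m "snapshot $d"
done

# Newest commit reachable from HEAD whose commit date is not after the cutoff.
picked=$(git rev-list -1 --before="2025-03-15 12:00:00 +0000" HEAD)
picked_subject=$(git log -1 --format=%s "$picked")
echo "$picked_subject"
```

On a real tree the resulting hash can be passed to `git checkout` directly, and recording it alongside each test result makes the manual bisect reproducible by others.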
(In reply to Steve Kargl from comment #14) If that does not work, I would go backwards, trying:

62d51a43825bb632f542f4e89d57f3dbdb08095f (Apr 11)
86db734ae292fee58532f09b17b50438f6889cc8 (Apr 3)

Both are followed by a set of LinuxKPI changes, so you'd back these out, which may help to narrow it down with the manual bisect. HTH
I have been able to build world/kernel from src/ of 2025-04-15 (aka adc33d3288). I rebuilt drm-515-kmod and gpu-firmware-kmod. After rebooting the system, I can now kldload radeonkms.ko. startx brings up the expected desktop. I am now rebuilding src/ circa 2025-05-01 (aka 8d136fb027).
Hi! I have a hard time following between the mailing list thread and this problem report, so let me try to rephrase:

1. There is a panic in pseudofs because the radeon driver wants to declare two entries with the same name.
2. It looks like this double attempt comes from the fact that the same code path is executed twice during init.

The problem appeared between commits 9b2a503a1179 and 6c3a4b5f9b7b in freebsd-src HEAD. Am I correct?

About the panic, it looks like it has been addressed by a commit from kib@: https://cgit.freebsd.org/src/commit/?id=e9897199576a40360440aa4d2aa48d61c4010f11

That doesn’t change the fact that the initialization is apparently called twice. However, I can’t find the dmesgs mentioned during the discussions where the dump_stack() traces appear, demonstrating the two inits. Could you please attach these dmesgs here? Thank you!
(In reply to Jean-Sébastien Pédron from comment #17) See comment #11 (it's on freefall, I believe)
(In reply to Jean-Sébastien Pédron from comment #17) In the "good" case where I can load radeonkms.ko and use startx to bring up my desktop, the driver is only initialized **once**. In the "bad" case, after loading radeonkms.ko, the use of startx causes a panic. The panic appears to be due to an attempt to initialize the driver a second time. kib's patch addresses the panic, but does not address why the initialization of the driver is occurring **twice**. I reduced the range of commits to the range you quoted. Unfortunately, with slow hardware and the need to keep world/kernel in sync, it is taking a long time to find the commit that is causing the actual problem. Look in the directory kargl/drm on freefall for the crash dump.
I found it. It's markj's commit about jiffies. The four consecutive commits are

4fa275a5f357 - main - queue(3): Add simple tests for some macros... Olivier Certner
325aa4dbd10d - main - linuxkpi: Introduce a properly typed jiffies Mark Johnston
901256f6ea3c - main - mlx5: jiffies is unsigned long Mark Johnston
87e57632bf88 - main - ofed: jiffies is unsigned long Mark Johnston

4fa275a boots, I can kldload radeonkms.ko, and startx brings up my desktop. In fact, I'm typing this in firefox at the moment. There is only one dump_stack() message from evergreen.c.

87e5763 boots, I can kldload radeonkms.ko, and startx causes a panic. My custom kernel uses neither mlx5 nor ofed. That leaves 325aa4d as the commit causing the issue. There are two dump_stack() messages from evergreen.c in /var/crash/core.txt.3
Created attachment 261141
proposed patch

Steve, can you please test this patch to drm-kmod and let us know if it fixes the problem when my commits are reapplied?
(In reply to Mark Johnston from comment #21) Mark, can you elaborate? That's an E1000 patch. How does that have an impact on drm-kmod? Wrong patch file?
Created attachment 261142
proposed patch

Sigh, thanks Bjoern, I meant this one. It's already applied to 6.x branches, but not 5.15 for some reason.
(In reply to Mark Johnston from comment #23) Mark, your patch seems to fix the issue. I built and installed 87e57632, and rebooted the system. Then I updated gpu-firmware and drm-515-kmod. After kldload of radeonkms.ko, startx brought up the expected desktop. The dump_stack() call in evergreen.c was executed only once, so initialization only occurs once.