Bug 287367 - [amdgpu][drm-6.6] panic: Assertion td->td_lkpi_task == NULL
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 15.0-CURRENT
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-x11 (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2025-06-07 20:43 UTC by Axel.Rau
Modified: 2025-11-20 17:16 UTC (History)
CC List: 4 users

See Also:


Attachments
dmesg (13.16 KB, text/plain)
2025-06-07 20:54 UTC, Axel.Rau

Description Axel.Rau 2025-06-07 20:43:07 UTC
After installing FreeBSD 15.0-CURRENT #2 main-n277727-fa02d9fceab7
and drm-66-kmod 6.6.25.1500043_3 [FreeBSD-kmods], I see this panic when loading amdgpu with kldload:

```
...
drmn0: successfully loaded firmware image 'amdgpu/yellow_carp_vcn.bin'
[drm] JPEG decode is enabled in VM mode
drmn0: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
drmn0: PCIE atomic ops is not supported
[drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
drmn0: VRAM: 1024M 0x000000F400000000 - 0x000000F43FFFFFFF (1024M used)
drmn0: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
drmn0: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF
[drm ERROR :amdgpu_bo_init] Unable to set WC memtype for the aperture base
[drm] Detected VRAM RAM=1024M, BAR=1024M
[drm] RAM width 64bits DDR5
[drm] amdgpu: 1024M of VRAM memory ready
[drm] amdgpu: 7536M of GTT memory ready.
[drm] GART: num cpu pages 262144, num gpu pages 262144
[drm] PCIE GART of 1024M enabled (table at 0x000000F43FC00000).
[drm] Loading DMUB firmware via PSP: version=0x0400003C
amdgpu/yellow_carp_sdma.bin: could not load binary firmware /boot/firmware/amdgpu/yellow_carp_sdma.bin either
yellow_carp_sdma.bin: could not load binary firmware /boot/firmware/yellow_carp_sdma.bin either
amdgpu_yellow_carp_sdma.bin: could not load binary firmware /boot/firmware/amdgpu_yellow_carp_sdma.bin either
drmn0: successfully loaded firmware image 'amdgpu/yellow_carp_sdma.bin'
[drm] use_doorbell being set to: [true]
[drm] Found VCN firmware Version ENC: 1.27 DEC: 2 VEP: 0 Revision: 0
drmn0: Will use PSP to load VCN firmware
[drm] reserve 0xa00000 from 0xf43e000000 for PSP TMR
drmn0: RAS: optional ras ta ucode is not available
drmn0: RAP: optional rap ta ucode is not available
drmn0: SECUREDISPLAY: securedisplay ta ucode is not available
drmn0: SMU is initialized successfully!
panic: Assertion td->td_lkpi_task == NULL failed at /usr/src/sys/compat/linuxkpi/common/src/linux_current.c:85
cpuid = 6
time = 1749326364
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00d1eb0298
vpanic() at vpanic+0x136/frame 0xfffffe00d1eb03c8
panic() at panic+0x43/frame 0xfffffe00d1eb0428
linux_alloc_current() at linux_alloc_current+0x314/frame 0xfffffe00d1eb0488
dc_assert_fp_enabled() at dc_assert_fp_enabled+0x77/frame 0xfffffe00d1eb04a8
dcn31_update_bw_bounding_box() at dcn31_update_bw_bounding_box+0x1b/frame 0xfffffe00d1eb04d0
dc_create() at dc_create+0x3aa/frame 0xfffffe00d1eb0510
dm_hw_init() at dm_hw_init+0x367/frame 0xfffffe00d1eb0700
amdgpu_device_ip_hw_init_phase2() at amdgpu_device_ip_hw_init_phase2+0x5a/frame 0xfffffe00d1eb0730
amdgpu_device_ip_init() at amdgpu_device_ip_init+0x3cd/frame 0xfffffe00d1eb07a0
amdgpu_device_init() at amdgpu_device_init+0x1bc8/frame 0xfffffe00d1eb0860
amdgpu_driver_load_kms() at amdgpu_driver_load_kms+0x16/frame 0xfffffe00d1eb0890
amdgpu_pci_probe() at amdgpu_pci_probe+0x29f/frame 0xfffffe00d1eb08e0
linux_pci_attach_device() at linux_pci_attach_device+0x440/frame 0xfffffe00d1eb0930
device_attach() at device_attach+0x45b/frame 0xfffffe00d1eb0980
bus_generic_driver_added() at bus_generic_driver_added+0x90/frame 0xfffffe00d1eb09a0
devclass_driver_added() at devclass_driver_added+0x29/frame 0xfffffe00d1eb09d0
devclass_add_driver() at devclass_add_driver+0x138/frame 0xfffffe00d1eb0a10
_linux_pci_register_driver() at _linux_pci_register_driver+0xc1/frame 0xfffffe00d1eb0a40
amdgpu_evh() at amdgpu_evh+0x73/frame 0xfffffe00d1eb0a50
module_register_init() at module_register_init+0xb0/frame 0xfffffe00d1eb0a80
linker_load_module() at linker_load_module+0xc51/frame 0xfffffe00d1eb0d80
kern_kldload() at kern_kldload+0x16e/frame 0xfffffe00d1eb0dd0
sys_kldload() at sys_kldload+0x59/frame 0xfffffe00d1eb0e00
amd64_syscall() at amd64_syscall+0x169/frame 0xfffffe00d1eb0f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00d1eb0f30
--- syscall (304, FreeBSD ELF64, kldload), rip = 0x1bcf2d5f2baa, rsp = 0x1bcf2b363eb8, rbp = 0x1bcf2b364430 ---
KDB: enter: panic

#10 0xffffffff80b76e3b in vpanic (
  fmt=0xffffffff8126b809 "Assertion %s failed at %s:%d",
  ap=ap@entry=0xfffffe00d1eb0408) at /usr/src/sys/kern/kern_shutdown.c:967
      buf = "Assertion td->td_lkpi_task == NULL failed at /usr/src/sys/compat/linuxkpi/common/src/linux_current.c:85", '\000' <repeats 152 times>
      __pc = 0x0
      __pc = 0x0
      __pc = 0x0
      other_cpus = {__bits = {4031, 0 <repeats 15 times>}}
      td = 0xfffff80006fac000
      bootopt = <optimized out>
      newpanic = <optimized out>
#11 0xffffffff80b76ca3 in panic (
  fmt=0xffffffff81d9d450 <cnputs_mtx> "\311\352\031\201\377\377\377\377")
  at /usr/src/sys/kern/kern_shutdown.c:892
      ap = {{gp_offset = 32, fp_offset = 48,
          overflow_arg_area = 0xfffffe00d1eb0438,
          reg_save_area = 0xfffffe00d1eb03d8}}
#12 0xffffffff80e287d4 in linux_alloc_current (td=<optimized out>,
  flags=<optimized out>)
  at /usr/src/sys/compat/linuxkpi/common/src/linux_current.c:85
      ts = <optimized out>
      mm = <optimized out>
      proc = <optimized out>
      mm_other = <optimized out>
#13 0xffffffff84428787 in dc_assert_fp_enabled () from /boot/modules/amdgpu.ko
No symbol table info available.
#14 0xffffffff845d1a7b in dcn31_update_bw_bounding_box ()
 from /boot/modules/amdgpu.ko
No symbol table info available.
#15 0xffffffff8444e0ba in dc_create () from /boot/modules/amdgpu.ko
No symbol table info available.
#16 0xffffffff8441c1c7 in dm_hw_init () from /boot/modules/amdgpu.ko
No symbol table info available.
#17 0xffffffff8421af6a in amdgpu_device_ip_hw_init_phase2 ()
 from /boot/modules/amdgpu.ko
No symbol table info available.
#18 0x000000000000000a in ?? ()
No symbol table info available.
#19 0xfffffe012912a000 in ?? ()
No symbol table info available.
#20 0xfffffe0129179029 in ?? ()
No symbol table info available.
#21 0x000000000000000a in ?? ()
No symbol table info available.
#22 0xfffffe00d1eb07a0 in ?? ()
No symbol table info available.
#23 0xffffffff84216a1d in amdgpu_device_ip_init () from /boot/modules/amdgpu.ko
No symbol table info available.
Backtrace stopped: frame did not save the PC
```
Comment 2 Axel.Rau 2025-06-07 20:54:26 UTC
Created attachment 261073
dmesg
Comment 3 rkoberman 2025-06-09 03:27:07 UTC
99976934274d works fine. 2542189532b3 panics. This assumes that the issue is in the base system (lkpi). There have been no commits to the port since mid-May.
Note that I am running on an Intel GPU (i915-66-kmod), so it appears not to be GPU specific. I think it is on the lkpi side.

If I have enough time, I'll bisect. I can only hope that it is in the kernel, as I can bisect that much more quickly.
Comment 4 bkidney@briankidney.ca 2025-06-10 01:23:16 UTC
I can confirm this also happens with drm-61-kmod using the pkgbase update from June 8th.

---

Jun  8 20:32:37 river kernel: panic: Assertion td->td_lkpi_task == NULL failed at /home/pkgbuild/worktrees/main/sys/compat/linuxkpi/common/src/linux_current.c:85
Jun  8 20:32:37 river kernel: cpuid = 2
Jun  8 20:32:37 river kernel: time = 1749423665
Jun  8 20:32:37 river kernel: KDB: stack backtrace:
Jun  8 20:32:37 river kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0107ab04b0
Jun  8 20:32:37 river kernel: vpanic() at vpanic+0x136/frame 0xfffffe0107ab05e0
Jun  8 20:32:37 river kernel: panic() at panic+0x43/frame 0xfffffe0107ab0640
Jun  8 20:32:37 river kernel: linux_alloc_current() at linux_alloc_current+0x314/frame 0xfffffe0107ab06a0
Jun  8 20:32:37 river kernel: drm_framebuffer_init() at drm_framebuffer_init+0x191/frame 0xfffffe0107ab06d0
Jun  8 20:32:37 river kernel: intel_framebuffer_init() at intel_framebuffer_init+0x78a/frame 0xfffffe0107ab0720
Jun  8 20:32:37 river kernel: intel_crtc_initial_plane_config() at intel_crtc_initial_plane_config+0x75e/frame 0xfffffe0107ab0840
Jun  8 20:32:37 river kernel: intel_display_driver_probe_nogem() at intel_display_driver_probe_nogem+0x2ae/frame 0xfffffe0107ab0880
Jun  8 20:32:37 river kernel: i915_driver_probe() at i915_driver_probe+0x634/frame 0xfffffe0107ab08c0
Jun  8 20:32:37 river kernel: linux_pci_attach_device() at linux_pci_attach_device+0x440/frame 0xfffffe0107ab0910
Jun  8 20:32:37 river kernel: device_attach() at device_attach+0x45b/frame 0xfffffe0107ab0960
Jun  8 20:32:37 river kernel: bus_generic_driver_added() at bus_generic_driver_added+0x90/frame 0xfffffe0107ab0980
Jun  8 20:32:37 river kernel: devclass_driver_added() at devclass_driver_added+0x29/frame 0xfffffe0107ab09b0
Jun  8 20:32:37 river kernel: devclass_add_driver() at devclass_add_driver+0x138/frame 0xfffffe0107ab09f0
Jun  8 20:32:37 river kernel: _linux_pci_register_driver() at _linux_pci_register_driver+0xc1/frame 0xfffffe0107ab0a20
Jun  8 20:32:37 river kernel: i915kms_evh() at i915kms_evh+0x279/frame 0xfffffe0107ab0a50
Jun  8 20:32:37 river kernel: module_register_init() at module_register_init+0xb0/frame 0xfffffe0107ab0a80
Jun  8 20:32:37 river kernel: linker_load_module() at linker_load_module+0xc51/frame 0xfffffe0107ab0d80
Jun  8 20:32:37 river kernel: kern_kldload() at kern_kldload+0x16e/frame 0xfffffe0107ab0dd0
Jun  8 20:32:37 river kernel: sys_kldload() at sys_kldload+0x59/frame 0xfffffe0107ab0e00
Jun  8 20:32:37 river kernel: amd64_syscall() at amd64_syscall+0x169/frame 0xfffffe0107ab0f30
Jun  8 20:32:37 river kernel: fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0107ab0f30
Jun  8 20:32:37 river kernel: --- syscall (304, FreeBSD ELF64, kldload), rip = 0x36fa0d298baa, rsp = 0x36fa0be28cd8, rbp = 0x36fa0be29250 ---
Jun  8 20:32:37 river kernel: KDB: enter: panic
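
The two backtraces in this report (amdgpu in the description, i915 in this comment) can be compared mechanically to see where they overlap. The following is a minimal Python sketch, not something posted by the reporters: it pulls the function names out of a few frames copied from each trace and lists the ones they share. The names `FRAME_RE`, `frames`, and `caller_of` are invented for this illustration, and only a subset of each trace is included.

```python
import re

# Matches one KDB frame line, with or without a leading syslog prefix.
FRAME_RE = re.compile(r"(\w+)\(\) at \w+\+0x[0-9a-f]+/frame 0x[0-9a-f]+\s*$")


def frames(trace: str) -> list[str]:
    """Return the function names in a backtrace, innermost frame first."""
    names = []
    for line in trace.splitlines():
        match = FRAME_RE.search(line)
        if match:
            names.append(match.group(1))
    return names


def caller_of(names: list[str], callee: str) -> str:
    """Return the frame directly below `callee`, i.e. its caller."""
    return names[names.index(callee) + 1]


# A few frames copied from the amdgpu trace in the description.
AMDGPU_TRACE = """\
panic() at panic+0x43/frame 0xfffffe00d1eb0428
linux_alloc_current() at linux_alloc_current+0x314/frame 0xfffffe00d1eb0488
dc_assert_fp_enabled() at dc_assert_fp_enabled+0x77/frame 0xfffffe00d1eb04a8
linux_pci_attach_device() at linux_pci_attach_device+0x440/frame 0xfffffe00d1eb0930
"""

# The corresponding frames copied from the i915 trace in comment 4.
I915_TRACE = """\
Jun  8 20:32:37 river kernel: panic() at panic+0x43/frame 0xfffffe0107ab0640
Jun  8 20:32:37 river kernel: linux_alloc_current() at linux_alloc_current+0x314/frame 0xfffffe0107ab06a0
Jun  8 20:32:37 river kernel: drm_framebuffer_init() at drm_framebuffer_init+0x191/frame 0xfffffe0107ab06d0
Jun  8 20:32:37 river kernel: linux_pci_attach_device() at linux_pci_attach_device+0x440/frame 0xfffffe0107ab0910
"""

amd = frames(AMDGPU_TRACE)
i915 = frames(I915_TRACE)

# Frames present in both excerpts, in the order they appear in the amdgpu trace.
shared = [name for name in amd if name in i915]

print("shared frames:", shared)
print("amdgpu caller:", caller_of(amd, "linux_alloc_current"))
print("i915 caller:  ", caller_of(i915, "linux_alloc_current"))
```

In these excerpts the driver-specific callers differ (`dc_assert_fp_enabled` versus `drm_framebuffer_init`), while the assertion fires in the same `linux_alloc_current` frame, which is consistent with the panic not being specific to one GPU driver.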
Comment 5 rkoberman 2025-06-15 20:47:52 UTC
(In reply to rkoberman from comment #3)
Please ignore my messages in this report. My issue, while looking similar, was not related to this report and was entirely PEBKAC. Sorry for any confusion I may have added.
Comment 6 bkidney@briankidney.ca 2025-06-16 12:08:58 UTC
(In reply to bkidney@briankidney.ca from comment #4)

The most recent version of PkgBase resolved this bug for me. The driver no longer panics; it just does not load until a new version is compiled against the new kernel.
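
This thread does not identify a root cause. One plausible way a module built against an older kernel could trip an assertion like `td->td_lkpi_task == NULL` is a structure layout mismatch: the module reads the field at the offset it was compiled with, while the kernel reads it at the current offset. The Python sketch below only illustrates that general idea with `ctypes`. Both structure layouts, the `td_other` and `td_added` fields, and the helper `read_pointer` are hypothetical and are not the real FreeBSD `struct thread`.

```python
import ctypes


class ThreadAsBuilt(ctypes.Structure):
    """Hypothetical layout a stale module was compiled against."""
    _fields_ = [
        ("td_other", ctypes.c_void_p),
        ("td_lkpi_task", ctypes.c_void_p),
    ]


class ThreadAsRunning(ctypes.Structure):
    """Hypothetical layout in the updated kernel: one field was inserted."""
    _fields_ = [
        ("td_other", ctypes.c_void_p),
        ("td_added", ctypes.c_void_p),  # invented field, for illustration only
        ("td_lkpi_task", ctypes.c_void_p),
    ]


def read_pointer(obj, offset):
    """Read a pointer-sized value at a byte offset; None means NULL."""
    return ctypes.c_void_p.from_buffer_copy(bytes(obj), offset).value


# The running kernel has already attached a task to this thread.
td = ThreadAsRunning(td_other=None, td_added=None, td_lkpi_task=0x1000)

# The kernel reads the field at the current offset and sees a non-NULL task.
kernel_view = read_pointer(td, ThreadAsRunning.td_lkpi_task.offset)

# A stale module reads at the old offset, lands on td_added, and sees NULL.
module_view = read_pointer(td, ThreadAsBuilt.td_lkpi_task.offset)

print("kernel sees a task:", kernel_view is not None)
print("module sees a task:", module_view is not None)
```

Under this assumption, the module would decide that no task exists and ask the kernel to allocate one, and the kernel's own check of the same field would then fail. Rebuilding the module against the running kernel removes this kind of disagreement, which matches the advice above to compile a new version against the new kernel.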