Bug 275172 - graphics/drm-kmod: crash with sysctl -a | less (radeonkms panic)
Status: New
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-ports-bugs (Nobody)
 
Reported: 2023-11-18 18:01 UTC by Martin Birgmeier
Modified: 2024-01-11 16:45 UTC

Description Martin Birgmeier 2023-11-18 18:01:18 UTC
Scenario:
- FreeBSD stable/14 at e4fb49e867ae (with a few harmless local patches -> 83a9fd62727b)
- Executing 'sysctl -a | less'

Result:
- The first page of the sysctl output is shown.

Scenario (continued):
- Press "G" to read all of the output and go to the end

Result (continued):
- Crash.

Note: A different scenario may have led to the crash, since the machine was also busy with iSCSI traffic (ctld) to ZFS zvols. Subjectively, the crash coincided with pressing the "G" key.

Crash info:

[0]# cd /usr/lib/debug/boot/kernel        
[0]# kgdb kernel.debug /var/crash/vmcore.0
GNU gdb (GDB) 13.2 [GDB v13.2 for FreeBSD]
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd14.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from kernel.debug...

Unread portion of the kernel message buffer:


Fatal trap 9: general protection fault while in kernel mode
cpuid = 2; apic id = 02
instruction pointer     = 0x20:0xffffffff804f74c6
stack pointer           = 0x28:0xfffffe00f75e4a60
frame pointer           = 0x28:0xfffffe00f75e4b00
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 44121 (sysctl)
rdi: fffff8000c8ed088 rsi: 0000000000000000 rdx: 0000000000000000
rcx: 0000000000000000  r8: ffffffff82012090  r9: 0000000000000000
rax: 0000000000000000 rbx: fffff8000c8ed088 rbp: fffffe00f75e4b00
r10: 7fffac007fffac03 r11: fffffe0100f39540 r12: fffffe0100f39020
r13: 0000000000000000 r14: 0000000000000000 r15: 95f8404ead006a00
trap number             = 9
Timeout initializing vt_vga
panic: general protection fault
cpuid = 2
time = 1700327198
KDB: stack backtrace:
#0 0xffffffff8053772d at kdb_backtrace+0x5d
#1 0xffffffff804ec792 at vpanic+0x132
#2 0xffffffff804ec653 at panic+0x43
#3 0xffffffff8080a6fc at trap_fatal+0x40c
#4 0xffffffff807e2c98 at calltrap+0x8
#5 0xffffffff825d0ab4 at drm_vblank_info+0x64
#6 0xffffffff804fda40 at sysctl_root_handler_locked+0x90
#7 0xffffffff804fce66 at sysctl_root+0x216
#8 0xffffffff804fd4f6 at userland_sysctl+0x176
#9 0xffffffff804fd33c at sys___sysctl+0x5c
#10 0xffffffff8080afa5 at amd64_syscall+0x105
#11 0xffffffff807e35ab at fast_syscall_common+0xf8
Uptime: 8h43m24s
Dumping 2556 out of 16096 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at /net/hal/z/SRC/FreeBSD/src/MBi/stable/14/sys/amd64/include/pcpu_aux.h:57
57              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb)
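
Since the backtrace shows the fault inside a sysctl handler (drm_vblank_info, called from sysctl_root_handler_locked), one way to confirm which node is responsible would be to read the nodes one at a time and record each name before reading it. The following is only a sketch: the function name probe_oids and the log path are made up for illustration, and it assumes that listing names with "sysctl -aN" does not itself invoke the value handlers.

```shell
#!/bin/sh
# Sketch: read sysctl nodes one by one, logging each name first.
# After a crash, the last line of the log names the node that was being read.
# probe_oids is a hypothetical helper, not part of any FreeBSD tool.
probe_oids() {
    log=$1                 # log file; should live on persistent storage, not an mfs /tmp
    reader=${2:-sysctl}    # command used to read one node (replaceable for dry runs)
    while IFS= read -r oid; do
        printf '%s\n' "$oid" >> "$log"
        sync               # flush the log line before the potentially fatal read
        "$reader" "$oid" > /dev/null 2>&1 || true
    done
}

# Intended use on the affected machine (not executed here):
#   sysctl -aN | probe_oids /var/tmp/sysctl-probe.log
```

Calling sync for every node makes this slow, but it keeps the log usable if the machine panics mid-run.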
Comment 1 Ed Maste freebsd_committer freebsd_triage 2023-11-18 18:27:35 UTC
drm_vblank_info comes from drm_sysctl_freebsd.c in drm-kmod. What version do you have installed?
Comment 2 Martin Birgmeier 2023-11-18 18:29:38 UTC
See below. Note that I am doing everything remotely on this machine; it only runs a bare X server plus an oclock (which changes once a minute).

[0]# pkg info drm-515-kmod-5.15.118_1 
drm-515-kmod-5.15.118_1
Name           : drm-515-kmod
Version        : 5.15.118_1
Installed on   : Sun Nov 12 12:59:17 2023 CET
Origin         : graphics/drm-515-kmod
Architecture   : FreeBSD:14:amd64
Prefix         : /usr/local
Categories     : kld graphics
Licenses       : MIT and GPLv2 and BSD2CLAUSE
Maintainer     : x11@FreeBSD.org
WWW            : https://github.com/freebsd/drm-kmod/
Comment        : DRM drivers modules
Annotations    :
        FreeBSD_version: 1400500
Flat size      : 14.4MiB
Description    :
amdgpu, i915, and radeon DRM drivers modules.
Currently corresponding to Linux 5.15 DRM.
This version is for FreeBSD 14.0 and above.
[0]#
Comment 3 Martin Birgmeier 2023-11-19 17:20:38 UTC
I just reproduced this with a simple "sysctl -a > /tmp/x1". The latter is "auto mfs".

After the machine had restarted, again with X and oclock running, I stopped the X server and re-issued the command, with the same results.

Finally, I completely disabled the starting of the X server + oclock (it was done by an rc.d start script). And surprise - again a crash!

However, the radeonkms module was still loaded, so I proceeded with removing the port: "pkg remove drm-515-kmod-5.15.118_1". After another reboot, this resulted in the following reduced set of modules loaded:

[1]# diff =(sed 's/.* //' <file with previous list of modules loaded> | sort) =(kldstat | sed 's/.* //' | sort)
17d16
< backlight.ko
21,22d19
< dmabuf.ko
< drm.ko
26d22
< firmware.ko
33,35d28
< iic.ko
< iicbb.ko
< iicbus.ko
39,41d31
< lindebugfs.ko
< linuxkpi.ko
< linuxkpi_hdmi.ko
43d32
< nullfs.ko
47d35
< radeonkms.ko
53d40
< ttm.ko
[1]# 
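
The comparison above relies on zsh's =(...) substitution. A rough POSIX sh equivalent, assuming the kldstat output was saved to a file before and after removing the port, could look like the sketch below. The function names module_names and diff_modules and the file names in the usage comment are illustrative only.

```shell
#!/bin/sh
# Sketch: compare two saved "kldstat" listings by module file name.
# module_names and diff_modules are hypothetical helper names.

# Keep only the last whitespace-separated field of each line (the module
# name), as in the sed expression used above, and sort the result.
module_names() {
    sed 's/.* //' "$@" | sort
}

# diff the sorted module names of two saved listings.
# Returns diff's exit status: 0 if identical, 1 if they differ.
diff_modules() {
    before=$(mktemp)
    after=$(mktemp)
    module_names "$1" > "$before"
    module_names "$2" > "$after"
    diff "$before" "$after"
    rc=$?
    rm -f "$before" "$after"
    return "$rc"
}

# Intended use (not executed here):
#   kldstat > /var/tmp/kld.before    # with drm-515-kmod installed
#   kldstat > /var/tmp/kld.after     # after pkg remove and reboot
#   diff_modules /var/tmp/kld.before /var/tmp/kld.after
```

Lines prefixed with "<" in the output are modules that were loaded only in the first listing.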

I again issued "sysctl -a > /tmp/x1", and this time there was no crash.

To summarize, merely having the radeonkms driver loaded is enough for "sysctl -a" to crash the system.

-- Martin