258691 – x11/nvidia-driver: Fatal trap 12: page fault while in kernel mode

Summary: x11/nvidia-driver: Fatal trap 12: page fault while in kernel mode

Status:	Open

Alias:	None

Product:	Ports & Packages
Classification:	Unclassified
Component:	Individual Port(s)
Version:	Latest
Hardware:	amd64 Any

Importance:	--- Affects Only Me
Assignee:	Alexey Dokuchaev

URL:
Keywords:	crash, needs-qa

Depends on:
Blocks:

Reported:	2021-09-23 07:21 UTC by iron.udjin
Modified:	2022-12-02 02:01 UTC
CC List:	3 users

See Also:	251015

Attachments
KERNEL-config (14.79 KB, text/plain) 2021-09-23 07:22 UTC, iron.udjin

Description iron.udjin 2021-09-23 07:21:42 UTC

OS: stable/13-n247369-0437d10e359e-dirty
nvidia-driver-470.63.01_1

This panic occurs intermittently, right after the OS starts.

Reading symbols from /usr/obj/usr/src/amd64.amd64/sys/IRON/kernel.full...

Unread portion of the kernel message buffer:
NVRM: Xid (PCI:0000:01:00): 8, pid=218, Channel 00000002


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address	= 0x0
fault code		= supervisor read data, page not present
instruction pointer	= 0x20:0xffffffff8314ac02
stack pointer	       = 0x28:0xfffffe00e1b445c0
frame pointer	       = 0x28:0xfffffe014b791c00
code segment		= base rx0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 11 (swi4: clock (0))
trap number		= 12
panic: page fault
cpuid = 0
time = 1632381100
KDB: stack backtrace:
#0 0xffffffff806a9055 at kdb_backtrace+0x65
#1 0xffffffff806642c7 at vpanic+0x187
#2 0xffffffff80664133 at panic+0x43
#3 0xffffffff80912dd7 at trap_fatal+0x387
#4 0xffffffff80912e2f at trap_pfault+0x4f
#5 0xffffffff8091250a at trap+0x26a
#6 0xffffffff808ec4ce at calltrap+0x8
Uptime: 40s
Dumping 1338 out of 32650 MB:..2%..11%..21%..32%..41%..51%..61%..71%..81%..91%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55		__asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xffffffff80663eef in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:486
#3  0xffffffff80664336 in vpanic (fmt=0xffffffff8094f606 "%s", ap=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:919
#4  0xffffffff80664133 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:843
#5  0xffffffff80912dd7 in trap_fatal (frame=0xfffffe00e1b44500, eva=0) at /usr/src/sys/amd64/amd64/trap.c:941
#6  0xffffffff80912e2f in trap_pfault (frame=frame@entry=0xfffffe00e1b44500, usermode=false, signo=<optimized out>, signo@entry=0x0, ucode=<optimized out>, ucode@entry=0x0) at /usr/src/sys/amd64/amd64/trap.c:760
#7  0xffffffff8091250a in trap (frame=0xfffffe00e1b44500) at /usr/src/sys/amd64/amd64/trap.c:438
#8  <signal handler called>
#9  0xffffffff8314ac02 in _nv011673rm () from /boot/modules/nvidia.ko
#10 0xfffffe014b791c10 in ?? ()
#11 0xffffffff8314ad79 in _nv011671rm () from /boot/modules/nvidia.ko
#12 0xfffffe014b791c10 in ?? ()
#13 0xfffffe014b791c28 in ?? ()
#14 0x0000000000000000 in ?? ()
(kgdb)
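
For anyone triaging similar reports, the key fields of the trap message above can be extracted programmatically. The following is a minimal sketch, assuming the message buffer is available as plain text; the function name `parse_trap` and the dictionary keys are hypothetical and not part of any FreeBSD tool. A fault virtual address of 0x0 together with "supervisor read data, page not present" points to a NULL pointer read in kernel mode.

```python
import re
from datetime import datetime, timezone

# Excerpt of the trap message from this report.
PANIC_TEXT = """\
Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x0
instruction pointer = 0x20:0xffffffff8314ac02
time = 1632381100
"""

def parse_trap(text):
    """Extract fault address, instruction pointer and panic time (hypothetical helper)."""
    fields = {}
    m = re.search(r"fault virtual address\s*=\s*(0x[0-9a-fA-F]+)", text)
    if m:
        fields["fault_addr"] = int(m.group(1), 16)
    # The instruction pointer is printed as selector:offset; keep the offset.
    m = re.search(r"instruction pointer\s*=\s*0x[0-9a-fA-F]+:(0x[0-9a-fA-F]+)", text)
    if m:
        fields["ip"] = int(m.group(1), 16)
    # "time" is seconds since the Unix epoch.
    m = re.search(r"^time\s*=\s*(\d+)", text, re.MULTILINE)
    if m:
        fields["time_utc"] = datetime.fromtimestamp(int(m.group(1)), tz=timezone.utc)
    return fields

info = parse_trap(PANIC_TEXT)
print(hex(info["ip"]), info["time_utc"].isoformat())
```

The decoded panic time falls on the morning of 2021-09-23 UTC, shortly before the report was filed.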

Comment 1 iron.udjin 2021-09-23 07:22:10 UTC

Created attachment 228133
KERNEL-config

Comment 2 Kubilay Kocak freebsd_committer 2021-09-24 00:20:51 UTC

Thank you for the report. If you are able to, please rebuild the nvidia driver with WITH_DEBUG, reproduce the crash and attach an updated backtrace.

Comment 3 Alexey Dokuchaev freebsd_committer 2021-09-27 09:01:32 UTC

(In reply to iron.udjin from comment #0)
> ...
> #9  0xffffffff8314ac02 in _nv011673rm () from /boot/modules/nvidia.ko
> #10 0xfffffe014b791c10 in ?? ()
> #11 0xffffffff8314ad79 in _nv011671rm () from /boot/modules/nvidia.ko
> #12 0xfffffe014b791c10 in ?? ()
> #13 0xfffffe014b791c28 in ?? ()
> #14 0x0000000000000000 in ?? ()
The backtrace looks rather similar to the one from bug #251015.

> nvidia-driver-470.63.01_1
I'm wondering if nvidia-driver-470.74 also exhibits this problem?

(In reply to Kubilay Kocak from comment #2)
> please rebuild the nvidia driver with WITH_DEBUG, reproduce the crash and attach
> an updated backtrace
I'm afraid that won't help much, as the crash happens in the closed-source, obfuscated portion of the driver (the Resource Manager).
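
To illustrate the point, the frames that belong to the binary module can be picked out of a kgdb backtrace mechanically. This is only a sketch under assumptions: the helper name `module_frames` and the regular expression are hypothetical, and they handle just the frame formats that appear in this report.

```python
import re

# Matches kgdb frame lines such as:
#   #9  0xffffffff8314ac02 in _nv011673rm () from /boot/modules/nvidia.ko
FRAME_RE = re.compile(
    r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(\S+) \(.*?\)(?: from (\S+))?"
)

# Excerpt of the backtrace from comment #0.
SAMPLE = """\
#7  0xffffffff8091250a in trap (frame=0xfffffe00e1b44500) at /usr/src/sys/amd64/amd64/trap.c:438
#8  <signal handler called>
#9  0xffffffff8314ac02 in _nv011673rm () from /boot/modules/nvidia.ko
#10 0xfffffe014b791c10 in ?? ()
#11 0xffffffff8314ad79 in _nv011671rm () from /boot/modules/nvidia.ko
"""

def module_frames(backtrace, module="nvidia.ko"):
    """Return (frame number, symbol) pairs for frames loaded from `module`."""
    frames = []
    for line in backtrace.splitlines():
        m = FRAME_RE.match(line.strip())
        # Only frames with a "from <path>" suffix naming the module are kept.
        if m and m.group(3) and m.group(3).endswith(module):
            frames.append((int(m.group(1)), m.group(2)))
    return frames

for number, symbol in module_frames(SAMPLE):
    print(f"#{number} {symbol}")
```

The only symbols recovered are the obfuscated `_nv...rm` names, which is why debug symbols for the open-source wrapper would not make these frames more readable.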

Comment 4 iron.udjin 2021-09-27 10:20:38 UTC

(In reply to Alexey Dokuchaev from comment #3)

> I'm wondering if nvidia-driver-470.74 also exhibits this problem?

I don't know yet. I've built nvidia-driver-470.74 with DEBUG enabled and haven't hit a panic so far.

Comment 5 Marcin Cieślak 2022-11-24 09:56:24 UTC

Do you have a full kernel core dump for this one?
I am having similar crashes. They happen when the nvidia driver is releasing resources (for example, when closing a browser window or shutting down X).

Comment 6 iron.udjin 2022-12-02 02:01:37 UTC

(In reply to Marcin Cieślak from comment #5)

No, unfortunately I no longer have a kernel core dump. The bug was reported a year ago, and I haven't hit it in recent months. It was possibly fixed in one of the recent driver updates. I propose closing this bug report; if I hit it again, I'll open a new one.