FreeBSD/amd64 panics when running glxgears(1) from amd64 Ubuntu Focal under native Wayland with i915. It's fairly reproducible (9 out of 10 tries). It doesn't happen when I replace Wayland with traditional Xorg, nor does it happen with glxgears from ports.

The backtrace looks like this:

drmn0: [drm] Resetting rcs0 for CS error
drmn0: [drm] Xwayland[101131] context reset due to GPU hang
drmn0: [drm] GPU HANG: ecode 9:1:bcff835b, in Xwayland [101131]
drmn0: [drm] Resetting rcs0 for CS error
drmn0: [drm] Xwayland[101131] context reset due to GPU hang
drmn0: [drm] GPU HANG: ecode 9:1:bd5699c3, in Xwayland [101131]
Kernel page fault with the following non-sleepable locks held:
exclusive rw kernel vm object (kernel vm object) r = 0 (0xffffffff81ad1c48) locked @ /usr/home/trasz/git/freebsd-src/sys/vm/vm_kern.c:605
stack backtrace:
#0 0xffffffff80bc29c5 at witness_debugger+0x65
#1 0xffffffff80bc3af9 at witness_warn+0x3e9
#2 0xffffffff8104ed48 at trap_pfault+0x88
#3 0xffffffff81021358 at calltrap+0x8
#4 0xffffffff80eeb9dd at kmem_free+0x2d
#5 0xffffffff8376b05c at __i915_gpu_coredump_free+0xfc
#6 0xffffffff83715d9b at execlists_capture_work+0xab
#7 0xffffffff80df1e03 at linux_work_fn+0xe3
#8 0xffffffff80bb497b at taskqueue_run_locked+0xab
#9 0xffffffff80bb5a33 at taskqueue_thread_loop+0xd3
#10 0xffffffff80b04f02 at fork_exit+0x82
#11 0xffffffff810223be at fork_trampoline+0xe

Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address = 0x61
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80eeb858
stack pointer = 0x28:0xfffffe00c612fcf0
frame pointer = 0x28:0xfffffe00c612fd20
code segment = base rx0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (linuxkpi_short_wq_4)
rdi: ffffffff81ad1ca0 rsi: 000fffffa00be3d2 rdx: 00000000fffffa00
rcx: ffffffffffffffd8 r8: 00000000ffffffff r9: 0000000000000000
rax: 0000000000000000 rbx: 0000000000001000 rbp: fffffe00c612fd20
r10: fffff8042e576c00 r11: 0000000000010000 r12: fffff80264e25a00
r13: fffffa00be3d2000 r14: 0000000000000000 r15: fffff80009501c00
trap number = 12
panic: page fault
cpuid = 3
time = 1697302088
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00c612f9c0
vpanic() at vpanic+0x132/frame 0xfffffe00c612faf0
panic() at panic+0x43/frame 0xfffffe00c612fb50
trap_fatal() at trap_fatal+0x40c/frame 0xfffffe00c612fbb0
trap_pfault() at trap_pfault+0xae/frame 0xfffffe00c612fc20
calltrap() at calltrap+0x8/frame 0xfffffe00c612fc20
--- trap 0xc, rip = 0xffffffff80eeb858, rsp = 0xfffffe00c612fcf0, rbp = 0xfffffe00c612fd20 ---
_kmem_unback() at _kmem_unback+0x78/frame 0xfffffe00c612fd20
kmem_free() at kmem_free+0x2d/frame 0xfffffe00c612fd40
__i915_gpu_coredump_free() at __i915_gpu_coredump_free+0xfc/frame 0xfffffe00c612fd80
execlists_capture_work() at execlists_capture_work+0xab/frame 0xfffffe00c612fdf0
linux_work_fn() at linux_work_fn+0xe3/frame 0xfffffe00c612fe40
taskqueue_run_locked() at taskqueue_run_locked+0xab/frame 0xfffffe00c612fec0
taskqueue_thread_loop() at taskqueue_thread_loop+0xd3/frame 0xfffffe00c612fef0
fork_exit() at fork_exit+0x82/frame 0xfffffe00c612ff30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00c612ff30
--- trap 0xbe8d4339, rip = 0x8f1e1dc67eb85dfb, rsp = 0x64f5f62d9553b610, rbp = 0x1ba64ffe8da156e ---
Could you please see if this patch helps?

https://reviews.freebsd.org/D40028

It won't fix the underlying problem, whatever is triggering the GPU reset, but at least the panic should be gone.
It does fix the panic, thank you :) Any chance to get this in before 14.0?
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=6223d0b67af923f53d962a9bf594dc37004dffe8

commit 6223d0b67af923f53d962a9bf594dc37004dffe8
Author: Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2023-10-17 14:26:18 +0000
Commit: Mark Johnston <markj@FreeBSD.org>
CommitDate: 2023-10-17 15:19:06 +0000

linuxkpi: Handle direct-mapped addresses in linux_free_kmem()

See the analysis in PR 271333. It is possible for driver code to allocate a page, store its address as returned by page_address(), then call free_page() on that address. On most systems that'll result in the LinuxKPI calling kmem_free() with a direct-mapped address, which is not legal.

Fix the problem by making linux_free_kmem() check the address to see whether it's direct-mapped or not, and handling it appropriately.

PR: 271333, 274515
Reviewed by: hselasky, bz
Tested by: trasz
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D40028

sys/compat/linuxkpi/common/src/linux_page.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)
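For readers following the commit message, the check it describes can be illustrated with a small userspace sketch. This is not the code from D40028: the `sketch_` names, the enum, and the direct-map bounds below are all assumptions made for illustration. The sketch only models the decision (direct-mapped or not) that selects the release path, not the freeing itself.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * ASSUMPTION: illustrative direct-map window for this sketch only. The real
 * bounds are architecture- and configuration-specific and come from the
 * kernel, not from constants like these.
 */
#define SKETCH_DMAP_MIN UINT64_C(0xfffff80000000000)
#define SKETCH_DMAP_MAX UINT64_C(0xfffffc0000000000)

/* Hypothetical labels for the two release paths the commit message implies. */
enum sketch_free_path {
	SKETCH_FREE_PAGE,	/* direct-mapped: release the backing page itself */
	SKETCH_FREE_KMEM	/* kernel-map address: may be passed to kmem_free() */
};

/* True if addr lies inside the assumed direct-map window. */
static bool
sketch_is_direct_mapped(uint64_t addr)
{
	return (addr >= SKETCH_DMAP_MIN && addr < SKETCH_DMAP_MAX);
}

/*
 * Model of the decision only: pick the release path from the address
 * instead of unconditionally treating it as a kernel-map address.
 */
static enum sketch_free_path
sketch_free_kmem(uint64_t addr)
{
	if (sketch_is_direct_mapped(addr))
		return (SKETCH_FREE_PAGE);
	return (SKETCH_FREE_KMEM);
}
```

Under these assumed bounds, 0xfffffa00be3d2000 (the r13 value from the register dump above, borrowed here purely as sample input) is classified as direct-mapped, while 0xfffffe00c612fcf0 (the reported stack pointer) is not.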
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=4862eb8604d503b52e7c3aa7ff32155b75a1ff93

commit 4862eb8604d503b52e7c3aa7ff32155b75a1ff93
Author: Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2023-10-17 14:26:18 +0000
Commit: Mark Johnston <markj@FreeBSD.org>
CommitDate: 2023-10-24 13:20:01 +0000

linuxkpi: Handle direct-mapped addresses in linux_free_kmem()

See the analysis in PR 271333. It is possible for driver code to allocate a page, store its address as returned by page_address(), then call free_page() on that address. On most systems that'll result in the LinuxKPI calling kmem_free() with a direct-mapped address, which is not legal.

Fix the problem by making linux_free_kmem() check the address to see whether it's direct-mapped or not, and handling it appropriately.

PR: 271333, 274515
Reviewed by: hselasky, bz
Tested by: trasz
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D40028

(cherry picked from commit 6223d0b67af923f53d962a9bf594dc37004dffe8)

sys/compat/linuxkpi/common/src/linux_page.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)
A commit in branch releng/14.0 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=87dbb943df73022dd98487c123aeb125da11c4af

commit 87dbb943df73022dd98487c123aeb125da11c4af
Author: Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2023-10-17 14:26:18 +0000
Commit: Mark Johnston <markj@FreeBSD.org>
CommitDate: 2023-10-25 16:53:01 +0000

linuxkpi: Handle direct-mapped addresses in linux_free_kmem()

See the analysis in PR 271333. It is possible for driver code to allocate a page, store its address as returned by page_address(), then call free_page() on that address. On most systems that'll result in the LinuxKPI calling kmem_free() with a direct-mapped address, which is not legal.

Fix the problem by making linux_free_kmem() check the address to see whether it's direct-mapped or not, and handling it appropriately.

Approved by: re (gjb)
PR: 271333, 274515
Reviewed by: hselasky, bz
Tested by: trasz
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D40028

(cherry picked from commit 6223d0b67af923f53d962a9bf594dc37004dffe8)
(cherry picked from commit 4862eb8604d503b52e7c3aa7ff32155b75a1ff93)

sys/compat/linuxkpi/common/src/linux_page.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=6cba7aec21bcd957478a987f9391fd33a4babdac

commit 6cba7aec21bcd957478a987f9391fd33a4babdac
Author: Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2023-10-17 14:26:18 +0000
Commit: Mark Johnston <markj@FreeBSD.org>
CommitDate: 2024-01-09 17:59:49 +0000

linuxkpi: Handle direct-mapped addresses in linux_free_kmem()

See the analysis in PR 271333. It is possible for driver code to allocate a page, store its address as returned by page_address(), then call free_page() on that address. On most systems that'll result in the LinuxKPI calling kmem_free() with a direct-mapped address, which is not legal.

Fix the problem by making linux_free_kmem() check the address to see whether it's direct-mapped or not, and handling it appropriately.

PR: 271333, 274515
Reviewed by: hselasky, bz
Tested by: trasz
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D40028

(cherry picked from commit 6223d0b67af923f53d962a9bf594dc37004dffe8)

sys/compat/linuxkpi/common/src/linux_page.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)