264305 – Crash on first boot


Status:	Closed DUPLICATE of bug 268393

Alias:	None

Product:	Base System
Classification:	Unclassified
Component:	kern
Version:	13.1-STABLE
Hardware:	amd64 Any

Importance:	--- Affects Some People
Assignee:	Mark Linimon

URL:
Keywords:	crash

Depends on:
Blocks:

Reported:	2022-05-28 04:42 UTC by Ivan Rozhuk
Modified:	2023-09-27 12:50 UTC
CC List:	8 users

See Also:	272878 268393

Attachments
core.txt (64.57 KB, text/plain) 2022-05-28 04:42 UTC, Ivan Rozhuk

Description Ivan Rozhuk 2022-05-28 04:42:12 UTC

Created attachment 234281
core.txt

Happens at least on a 5750G with a fresh 13.1 amd64 install and drm510-kmod.
The panic does not occur on the second boot after power-on.


I suppose that rirb_base ends up NULL after this assignment:
rirb_base = (struct hdac_rirb *)sc->rirb_dma.dma_vaddr;

and that it then faults on:
		rirb = &rirb_base[sc->rirb_rp];
		resp = le32toh(rirb->response);



[4] hdac0: <ATI (0x1637) HDA Controller> mem 0xfca88000-0xfca8bfff at device 0.1 on pci10
[4] hdac1: <AMD Raven HDA Controller> mem 0xfca80000-0xfca87fff at device 0.6 on pci10
[4]
[4]
[4] Fatal trap 12: page fault while in kernel mode
[4] cpuid = 0; apic id = 00
[4] fault virtual address       = 0x8
[4] fault code          = supervisor read data, page not present
[4] instruction pointer = 0x20:0xffffffff81e8503a
[4] stack pointer               = 0x28:0xfffffe01561e4e10
[4] frame pointer               = 0x28:0xfffffe01561e4e60
[4] code segment                = base rx0, limit 0xfffff, type 0x1b
[4]                     = DPL 0, pres 1, long 1, def32 0, gran 1
[4] processor eflags    = interrupt enabled, resume, IOPL = 0
[4] current process             = 11 (irq78: hdac1)
[4] trap number         = 12
[4] panic: page fault
[4] cpuid = 0
[4] time = 1653708921
[4] KDB: stack backtrace:
[4] #0 0xffffffff8068420b at kdb_backtrace+0x6b
[4] #1 0xffffffff80639fdf at vpanic+0x17f
[4] #2 0xffffffff80639e53 at panic+0x43
[4] #3 0xffffffff80946f05 at trap_fatal+0x385
[4] #4 0xffffffff80946f5f at trap_pfault+0x4f
[4] #5 0xffffffff8091d52e at calltrap+0x8
[4] #6 0xffffffff80604c44 at ithread_loop+0x244
[4] #7 0xffffffff806019b7 at fork_exit+0x77
[4] #8 0xffffffff8091e5ae at fork_trampoline+0xe
[4] Uptime: 4s


__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xffffffff80639bdb in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:487
#3  0xffffffff8063a04e in vpanic (fmt=0xffffffff8098b054 "%s",
    ap=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:920
#4  0xffffffff80639e53 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:844
#5  0xffffffff80946f05 in trap_fatal (frame=0xfffffe01561e4d50, eva=8)
    at /usr/src/sys/amd64/amd64/trap.c:944
#6  0xffffffff80946f5f in trap_pfault (frame=0xfffffe01561e4d50,
    usermode=false, signo=<optimized out>, ucode=<optimized out>)
    at /usr/src/sys/amd64/amd64/trap.c:763
#7  <signal handler called>
#8  hdac_rirb_flush (sc=0xfffff80002221000)
    at /usr/src/sys/dev/sound/pci/hda/hdac.c:959
#9  hdac_one_intr (sc=0xfffff80002221000, intsts=3221225472)
    at /usr/src/sys/dev/sound/pci/hda/hdac.c:343
#10 hdac_intr_handler (context=0xfffff80002221000)
    at /usr/src/sys/dev/sound/pci/hda/hdac.c:390
#11 0xffffffff80604c44 in intr_event_execute_handlers (ie=0xfffff80002236100,
    p=<optimized out>) at /usr/src/sys/kern/kern_intr.c:1167
#12 ithread_execute_handlers (ie=<optimized out>, p=<optimized out>)
    at /usr/src/sys/kern/kern_intr.c:1180
#13 ithread_loop (arg=0xfffff80002220e80)
    at /usr/src/sys/kern/kern_intr.c:1268
#14 0xffffffff806019b7 in fork_exit (
    callout=0xffffffff80604a00 <ithread_loop>, arg=0xfffff80002220e80,
    frame=0xfffffe01561e4f40) at /usr/src/sys/kern/kern_fork.c:1093
#15 <signal handler called>
#16 mi_startup () at /usr/src/sys/kern/init_main.c:322
Backtrace stopped: Cannot access memory at address 0x120f008

Comment 1 Hans Petter Selasky freebsd_committer 2022-05-28 07:34:59 UTC

Alexander, maybe the IRQ resource is torn down in the wrong order and not drained?

--HPS

Comment 2 Hans Petter Selasky freebsd_committer 2022-05-28 07:37:22 UTC

I mean:

s/torn down/setup

Comment 3 Oleh Hushchenkov 2022-10-31 08:49:25 UTC

I have the same issue on 13-STABLE, on a ThinkPad T14 Gen 1 laptop with a Ryzen 7 PRO 4750U CPU.

I discovered this after updating the kernel to 13-STABLE from Oct 27.
The previous kernel, 13-STABLE from Jun 10, works without this panic.

Only the first cold boot leads to the panic.

Also, if I first boot the kernel from Jun 10 and then reboot into the kernel from Oct 27, the touchpad does not work.

Comment 4 Oleh Hushchenkov 2022-10-31 08:50:48 UTC

Forgot to add, 14-CURRENT has the same issue.

Comment 5 Jonathan Vasquez 2023-07-03 02:34:55 UTC

This is similar to the bug I reported late last year; linking it here for higher visibility.

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=268393

Comment 6 Alex Matei 2023-07-23 23:57:05 UTC

I just built my new desktop and I have the same issue.

cpu: AMD Ryzen 9 7900
motherboard: Gigabyte B650 Aorus Elite AX
       network chip: Realtek RTL 8125BG
       audio chip: Realtek ALC897
OS: FreeBSD 13.2

Comment 7 Mark Linimon freebsd_committer 2023-08-30 16:54:23 UTC

^Triage: RIP hps@.

Comment 8 Mark Linimon freebsd_committer 2023-08-30 17:09:28 UTC

^Triage: to submitter:

I'm sorry this was not addressed in a faster manner.

Please try the patch in the following comment to see whether it works around the problem:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=268393#c48

Comment 9 Mark Johnston freebsd_committer 2023-09-27 12:50:19 UTC


*** This bug has been marked as a duplicate of bug 268393 ***