Bug 283103 - Yet another Parallels arm64 panic
Summary: Yet another Parallels arm64 panic
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Only Me
Assignee: freebsd-virtualization (Nobody)
URL:
Keywords: crash
Depends on:
Blocks:
 
Reported: 2024-12-03 17:11 UTC by Edward Tomasz Napierala
Modified: 2025-01-24 15:08 UTC (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Edward Tomasz Napierala freebsd_committer freebsd_triage 2024-12-03 17:11:16 UTC
This only happens every other week or so; I don't know what's the trigger, I believe the VM is idle when it happens.  Looks like this:

panic: sleepq_add: td 0xffff0000b661e640 to sleep on wchan 0xffffa000c0a0e380 with sleeping prohibited
cpuid = 0
time = 1733037443
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x38
vpanic() at vpanic+0x1ac
panic() at panic+0x48
sleepq_add() at sleepq_add+0x2c0
_sleep() at _sleep+0x1d4
AcpiOsAcquireMutex() at AcpiOsAcquireMutex+0x9c
AcpiUtAcquireMutex() at AcpiUtAcquireMutex+0xac
AcpiOsAcquireObject() at AcpiOsAcquireObject+0x1c
AcpiUtCreateGenericState() at AcpiUtCreateGenericState+0x14
AcpiPsPushScope() at AcpiPsPushScope+0x28
AcpiPsParseLoop() at AcpiPsParseLoop+0x300
AcpiPsParseAml() at AcpiPsParseAml+0xb0
AcpiPsExecuteMethod() at AcpiPsExecuteMethod+0x138
AcpiNsEvaluate() at AcpiNsEvaluate+0x1d8
AcpiEvaluateObject() at AcpiEvaluateObject+0x13c
ithread_loop() at ithread_loop+0x288
fork_exit() at fork_exit+0x78
fork_trampoline() at fork_trampoline+0x18
KDB: enter: panic

For some reason I can't do backtrace with kgdb(1), but I've seen the same problem previously, so probably unrelated:

(kgdb) bt
#0  0xffff0000004b68f4 in doadump (textdump=0) at /usr/home/trasz/git/freebsd/sys/kern/kern_shutdown.c:404
#1  0x2cb00000000e9bf0 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

I took a VM snapshot this time with kernel still in ddb(4).
Comment 1 Mark Johnston freebsd_committer freebsd_triage 2024-12-06 17:45:48 UTC
Do you know which driver is causing this?  That is, can you dump info about the current thread from ddb, so we know which ithread it is?
Comment 2 Edward Tomasz Napierala freebsd_committer freebsd_triage 2024-12-06 19:15:30 UTC
Yup, it's:

 * 79   Thread 100070 (PID=12: intr/gic0,s16: acpi_ged0)  0xffff0000004b5aa8 in doadump (textdump=0) at /usr/home/trasz/git/freebsd/sys/kern/kern_shutdown.c:404

I should have more luck with the backtrace next time this happens.
Comment 3 Andrew Turner freebsd_committer freebsd_triage 2024-12-09 11:45:09 UTC
The problem is the ged driver needs to call AcpiEvaluateObject from the ithread, however as taking the APCI lock can sleep if another thread already holds it the ithread can panic.

We could either make the ACPI lock spin, or allow some ithreads to sleep. I have a patch to allow some ithreads to sleep, however am unsure if this is the best option.
Comment 4 Edward Tomasz Napierala freebsd_committer freebsd_triage 2024-12-11 18:21:55 UTC
Can't we defer it to a separate thread?  There's the acpi_ged_defer sysctl; could it help?  If so, could we flip it to 1 by default and call it done?
Comment 5 Andrew Turner freebsd_committer freebsd_triage 2024-12-12 10:14:39 UTC
The sysctl was added in https://cgit.freebsd.org/src/commit/?id=be91b4797e2c to get the old behaviour, so that would be a regression.
Comment 6 Edward Tomasz Napierala freebsd_committer freebsd_triage 2024-12-15 16:48:40 UTC
Hm, you're right.

Okay, perhaps a weird idea, but can we make AcpiOsAcquireMutex() return error when it can't take the mutex without blocking?  ACPI seems to support it, eg AcpiOsAcquireObject() checks for that error return.
Comment 7 Andriy Gapon freebsd_committer freebsd_triage 2024-12-28 18:35:32 UTC
(In reply to Andrew Turner from comment #3)
I've been thinking about sleeping ithreads for another reason, sorry for hijacking this bug report.

There could be a chip connected, e.g., via I2C.
And that chip could also be capable of generating a signal (an interrupt) via a GPIO line.
It's not a problem to configure the GPIO interrupt but it may become a problem if the interrupt handler needs to access the chip via a bus that requires waiting / sleeping.

I think that at present the described setup can only work if we force the bus to spinning / polling but that may be sub-optimal.
Comment 8 Mark Johnston freebsd_committer freebsd_triage 2024-12-28 20:54:20 UTC
I don't see any particular reason an ithread must not sleep, so long as it is not shared with any other interrupt.  I think INTR_EXCL provides the desired semantics there?
Comment 9 Andrew Turner freebsd_committer freebsd_triage 2025-01-02 12:29:43 UTC
I've created https://reviews.freebsd.org/D48283 and https://reviews.freebsd.org/D48284 to allow sleepable ithreads, and use it in the GED driver.
Comment 10 commit-hook freebsd_committer freebsd_triage 2025-01-24 15:08:48 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=dbedcc169f70c924a680e02bc86d7419682a70ac

commit dbedcc169f70c924a680e02bc86d7419682a70ac
Author:     Andrew Turner <andrew@FreeBSD.org>
AuthorDate: 2025-01-13 05:37:52 +0000
Commit:     Andrew Turner <andrew@FreeBSD.org>
CommitDate: 2025-01-24 15:08:09 +0000

    acpi: Mark the GED ithread as sleepable

    We need to complete calling the ACPI method before marking the interrupt
    as complete. If two threads are inspecting the ACPI tables at the same
    time they may both try to lock the ACPI mutex causing one to sleep. If
    this is the ithread it will panic the kernel as this is not allowed.

    Update the ged ithread to allow sleeping as it is expected this lock
    will be uncommon enough any sleep will be short.

    PR:             283103
    Reviewed by:    markj (earlier version)
    Sponsored by:   Arm Ltd
    Differential Revision:  https://reviews.freebsd.org/D48284

 sys/dev/acpica/acpi_ged.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)