Summary: | kernel panic on boot due to missing domain ivar on acpi0 | | |
---|---|---|---
Product: | Base System | Reporter: | Wes Maag <jwmaag>
Component: | kern | Assignee: | Konstantin Belousov <kib>
Status: | Closed FIXED | | |
Severity: | Affects Only Me | CC: | avg, kib, markj
Priority: | --- | Keywords: | regression
Version: | CURRENT | | |
Hardware: | amd64 | | |
OS: | Any | | |
Attachments: |
|
(In reply to Wes Maag from comment #0)
Do you have a stack trace of the panic? I think that it should have been printed as well.

I knew I was forgetting something :) Transcribing from a photo I took:

db_trace_self_wrapper() at db_trace_self_wrapper+0x2b
vpanic() at vpanic+0x19d
panic() at panic+0x43
dmar_find() at dmar_find+0x4e6
iommu_alloc_msi_intr() at iommu_alloc_msi_intr+0x5c
msi_alloc() at msi_alloc+0x1d4
amdvi_setup_hw() at amdvi_setup_hw+0x360
ivhd_attach() at ivhd_attach+0x5a8

The photo can be found here: https://www.dropbox.com/s/ups01ephbauetzw/kernel-panic-backtrace.jpg?dl=0

Kostik, could you please take a look at this? The problem seems to be that dmar_find() assumes that the device must be on a PCI bus, but not all devices that can request an MSI actually are. ivhd in this case is an "artificial" device on the acpi bus. It is created (via device_identify) based on a special ACPI table and there is no corresponding device in the ACPI namespace.

(In reply to Andriy Gapon from comment #3)
This is an AMD machine, right? Why is dmar_find() called at all? It can only happen if the user manually enabled DMAR DMA or IR, which is nonsense on an AMD machine.

(In reply to Konstantin Belousov from comment #4)
I guess I'm not 100% sure how I would have enabled it... This is a plain GENERIC kernel, with nothing regarding busdma or dmar (assuming those are relevant) in my loader or sysctl.conf. Any ideas on where I should look? Or anything else I should provide?

Created attachment 205772 [details]
Filter out non-pci devices in dmar_find().
Try this patch. I used the opportunity to fix a hole in dmar_find(). An additional fix (which would prevent testing this change) would be to avoid executing the body of dmar_find() at all.
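The idea behind the patch, refusing to look up a DMAR unit for a device that is not a child of a PCI bus, can be sketched in userspace C. This is only a model under stated assumptions: `struct model_dev`, `model_is_pci_child()`, `model_dmar_find_domain()`, and `model_selfcheck()` are hypothetical names invented for illustration, and none of this is the code in sys/x86/iommu/intel_drv.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a newbus device; not the kernel's device_t. */
struct model_dev {
	const char *name;         /* e.g. "ivhd0" */
	const char *devclass;     /* e.g. "ivhd", "acpi", "pci" */
	struct model_dev *parent; /* NULL for the root of the tree */
	int pci_domain;           /* meaningful only for children of a pci bus */
};

/* True only when the device hangs directly off a bus of class "pci". */
static int
model_is_pci_child(const struct model_dev *dev)
{
	return (dev != NULL && dev->parent != NULL &&
	    strcmp(dev->parent->devclass, "pci") == 0);
}

/*
 * Model of the guarded lookup: a non-PCI device yields -1 ("no unit")
 * instead of being asked for a PCI-only property such as its domain.
 */
static int
model_dmar_find_domain(const struct model_dev *dev)
{
	if (!model_is_pci_child(dev))
		return (-1);
	return (dev->pci_domain);
}

/* Usage: ivhd0 sits on acpi0, so the lookup is refused rather than fatal. */
static void
model_selfcheck(void)
{
	struct model_dev nexus = { "nexus0", "nexus", NULL, 0 };
	struct model_dev acpi = { "acpi0", "acpi", &nexus, 0 };
	struct model_dev ivhd = { "ivhd0", "ivhd", &acpi, 0 };
	struct model_dev pci = { "pci0", "pci", &acpi, 0 };
	struct model_dev nic = { "em0", "em", &pci, 0 };

	assert(model_dmar_find_domain(&ivhd) == -1);
	assert(model_dmar_find_domain(&nic) == 0);
}
```

The check is deliberately placed before any PCI-specific query, so a caller such as the MSI allocation path gets a "not found" result for ivhd0 instead of reaching an accessor that cannot succeed on the acpi bus.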
(In reply to Konstantin Belousov from comment #6)
Thanks, that patch worked for me.

A commit references this bug:

Author: kib
Date: Sun Jul 14 21:08:54 UTC 2019
New revision: 349988
URL: https://svnweb.freebsd.org/changeset/base/349988

Log:
PR: 239143
Reported and tested by: Wes Maag <jwmaag@gmail.com>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week

Changes:
head/sys/x86/iommu/intel_drv.c

A commit references this bug:

Author: kib
Date: Sun Jul 21 08:28:29 UTC 2019
New revision: 350192
URL: https://svnweb.freebsd.org/changeset/base/350192

Log:
MFC r349988:
In dmar_find(), refuse to search for DMAR unit for non-PCI device.
PR: 239143

Changes:
_U stable/12/
stable/12/sys/x86/iommu/intel_drv.c |
Created attachment 205693 [details]
acpidump

It seems as though my computer is tripping the KASSERTs introduced in commit 349571.

A working system prints the following errors:

ivhd0: <AMD-Vi/IOMMU ivhd with EFR> on acpi0
ivhd0: Flag:b0<IotlbSup,Coherent>
ivhd0: Features(type:0x11) MsiNumPPR = 0 PNBanks= 2 PNCounters= 0
ivhd0: Extended features[31:0]:22294ada<PPRSup,NXSup,GTSup,IASup> HATS = 0x2 GATS = 0x0 GLXSup = 0x1 SmiFSup = 0x1 SmiFRC = 0x2 GAMSup = 0x1 DualPortLogSup = 0x2 DualEventLogSup = 0x2
ivhd0: Extended features[62:32]:f77ef<USSup> Max PASID: 0x2f DevTblSegSup = 0x3 MarcSup = 0x1
ivhd0: supported paging level:7, will use only: 4
ivhd0: device range: 0x0 - 0xffff
ivhd0: PCI cap 0x190b640f@0x40 feature:19<IOTLB,EFR,CapExt>
ivhd0: failed to read ivar PCI_IVAR_DOMAIN on bus acpi0, error = 2
acpi0: failed to read ivar PCIB_IVAR_BUS on bus nexus0, error = 6
ivhd0: failed to read ivar PCI_IVAR_DOMAIN on bus acpi0, error = 2
acpi0: failed to read ivar PCIB_IVAR_BUS on bus nexus0, error = 6

The panic on the failing kernel:

panic: pci_get_domain failed for ivhd0 on bus acpi0, error = 2
cpuid = 2
time = 1
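
The "error = 2" and "error = 6" values in the messages above are the errno codes ENOENT and ENXIO. As a rough illustration of why asking a non-PCI bus for a PCI instance variable fails, here is a small userspace C model. Everything in it is an assumption made for illustration: `struct model_bus`, `model_read_ivar()`, and `model_get_domain()` are hypothetical names, not the kernel's BUS_READ_IVAR or pci_get_domain().

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical ivar index; the real kernel constant is PCI_IVAR_DOMAIN. */
enum model_ivar { MODEL_PCI_IVAR_DOMAIN };

/* Hypothetical parent bus: only a bus of class "pci" knows the domain. */
struct model_bus {
	const char *devclass;
	int domain;
};

/* Returns 0 and stores the value on success, or an errno code on failure. */
static int
model_read_ivar(const struct model_bus *bus, enum model_ivar which,
    int *result)
{
	if (bus == NULL)
		return (ENXIO);
	if (which == MODEL_PCI_IVAR_DOMAIN &&
	    strcmp(bus->devclass, "pci") == 0) {
		*result = bus->domain;
		return (0);
	}
	return (ENOENT); /* this bus has no such instance variable */
}

/*
 * Checked accessor: the error is passed to the caller, which can decide
 * whether a missing ivar is fatal, rather than using an unset value.
 */
static int
model_get_domain(const struct model_bus *bus, int *domain)
{
	return (model_read_ivar(bus, MODEL_PCI_IVAR_DOMAIN, domain));
}
```

In this model a caller that first checks the parent's bus class never sees the ENOENT case, which is the behaviour the report describes once non-PCI devices are filtered out before the domain is queried.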