|Summary:||kernel panic on boot due to missing domain ivar on acpi0|
|Product:||Base System||Reporter:||Wes Maag <jwmaag>|
|Component:||kern||Assignee:||freebsd-acpi mailing list <acpi>|
|Severity:||Affects Only Me||CC:||avg, kib|
Description Wes Maag 2019-07-11 14:08:10 UTC
Created attachment 205693 [details] acpidump

It seems as though my computer is tripping the KASSERTs introduced in commit 349571.

A working system prints the following errors:

ivhd0: <AMD-Vi/IOMMU ivhd with EFR> on acpi0
ivhd0: Flag:b0<IotlbSup,Coherent>
ivhd0: Features(type:0x11) MsiNumPPR = 0 PNBanks= 2 PNCounters= 0
ivhd0: Extended features[31:0]:22294ada<PPRSup,NXSup,GTSup,IASup> HATS = 0x2 GATS = 0x0 GLXSup = 0x1 SmiFSup = 0x1 SmiFRC = 0x2 GAMSup = 0x1 DualPortLogSup = 0x2 DualEventLogSup = 0x2
ivhd0: Extended features[62:32]:f77ef<USSup> Max PASID: 0x2f DevTblSegSup = 0x3 MarcSup = 0x1
ivhd0: supported paging level:7, will use only: 4
ivhd0: device range: 0x0 - 0xffff
ivhd0: PCI cap 0x190b640f@0x40 feature:19<IOTLB,EFR,CapExt>
ivhd0: failed to read ivar PCI_IVAR_DOMAIN on bus acpi0, error = 2
acpi0: failed to read ivar PCIB_IVAR_BUS on bus nexus0, error = 6
ivhd0: failed to read ivar PCI_IVAR_DOMAIN on bus acpi0, error = 2
acpi0: failed to read ivar PCIB_IVAR_BUS on bus nexus0, error = 6

The panic on the failing kernel:

panic: pci_get_domain failed for ivhd0 on bus acpi0, error = 2
cpuid = 2
time = 1
Comment 1 Andriy Gapon 2019-07-11 22:23:34 UTC
(In reply to Wes Maag from comment #0) Do you have a stack trace of the panic? I think that it should have been printed as well.
Comment 2 Wes Maag 2019-07-12 01:21:43 UTC
I knew I was forgetting something :) Transcribing from a photo I took:

db_trace_self_wrapper() at db_trace_self_wrapper+0x2b
vpanic() at vpanic+0x19d
panic() at panic+0x43
dmar_find() at dmar_find+0x4e6
iommu_alloc_msi_intr() at iommu_alloc_msi_intr+0x5c
msi_alloc() at msi_alloc+0x1d4
amdvi_setup_hw() at amdvi_setup_hw+0x360
ivhd_attach() at ivhd_attach+0x5a8

The photo can be found here: https://www.dropbox.com/s/ups01ephbauetzw/kernel-panic-backtrace.jpg?dl=0
Comment 3 Andriy Gapon 2019-07-12 05:49:25 UTC
Kostik, could you please take a look at this? The problem seems to be that dmar_find() assumes that the device must be on a PCI bus, but not all devices that can request an MSI actually are. ivhd in this case is an "artificial" device on the acpi bus. It is created (via device_identify) based on a special ACPI table, and there is no corresponding device in the ACPI namespace.
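To illustrate the point above, here is a minimal userland sketch of the kind of check being discussed: refuse the lookup when the device's parent bus is not PCI, instead of reading a PCI-only ivar and panicking. This is not the kernel code and not the committed patch. The types and names (struct model_device, struct model_unit, model_on_pci_bus, model_dmar_find) are hypothetical stand-ins invented for the example, and the unit lookup itself is reduced to a placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a kernel device node; not a FreeBSD type. */
struct model_device {
	const char *name;			/* e.g. "ivhd0" */
	const char *devclass;			/* e.g. "ivhd", "acpi", "pci" */
	const struct model_device *parent;	/* bus the device hangs off */
};

/* Hypothetical stand-in for an IOMMU unit. */
struct model_unit {
	int id;
};

static struct model_unit model_units[] = { { 0 } };

/* Returns nonzero only if the device's parent bus has class "pci". */
static int
model_on_pci_bus(const struct model_device *dev)
{
	assert(dev != NULL);
	return (dev->parent != NULL &&
	    strcmp(dev->parent->devclass, "pci") == 0);
}

/*
 * Simplified model of the lookup: a device that is not on a PCI bus
 * has no PCI domain/bus/slot to match against, so return NULL rather
 * than querying PCI-only properties.
 */
static const struct model_unit *
model_dmar_find(const struct model_device *dev)
{
	if (!model_on_pci_bus(dev))
		return (NULL);
	/* Placeholder: a real lookup would match the PCI path to a unit. */
	return (&model_units[0]);
}
```

In this model, a device shaped like ivhd0 (parent class "acpi") yields NULL from model_dmar_find(), while a device whose parent has class "pci" reaches the placeholder lookup.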
Comment 4 Konstantin Belousov 2019-07-12 08:59:49 UTC
(In reply to Andriy Gapon from comment #3) This is an AMD machine, right? Why is dmar_find() called at all? It can only happen if the user manually enabled DMAR DMA or IR, which is nonsense on an AMD machine.
Comment 5 Wes Maag 2019-07-14 16:44:10 UTC
(In reply to Konstantin Belousov from comment #4) I guess I'm not 100% sure how I would have enabled it... This is a plain GENERIC kernel, and there is nothing regarding busdma or dmar (assuming those are relevant) in my loader or sysctl.conf. Any ideas on where I should look? Or anything else I should provide?
Comment 6 Konstantin Belousov 2019-07-14 18:39:35 UTC
Created attachment 205772 [details] Filter out non-pci devices in dmar_find().

Try this patch. I use the opportunity to fix a hole in dmar_find(). An additional fix (which would block testing this change) would eliminate execution of the body of dmar_find() altogether.
Comment 7 Wes Maag 2019-07-14 20:40:10 UTC
(In reply to Konstantin Belousov from comment #6) Thanks, that patch worked for me.
Comment 8 commit-hook 2019-07-14 21:09:39 UTC
A commit references this bug:

Author: kib
Date: Sun Jul 14 21:08:54 UTC 2019
New revision: 349988
URL: https://svnweb.freebsd.org/changeset/base/349988

Log:
PR: 239143
Reported and tested by: Wes Maag <email@example.com>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week

Changes:
head/sys/x86/iommu/intel_drv.c
Comment 9 commit-hook 2019-07-21 08:28:32 UTC
A commit references this bug:

Author: kib
Date: Sun Jul 21 08:28:29 UTC 2019
New revision: 350192
URL: https://svnweb.freebsd.org/changeset/base/350192

Log:
MFC r349988:
In dmar_find(), refuse to search for DMAR unit for non-PCI device.
PR: 239143

Changes:
_U stable/12/
stable/12/sys/x86/iommu/intel_drv.c