Bug 239143

Summary: kernel panic on boot due to missing domain ivar on acpi0
Product: Base System
Reporter: Wes Maag <jwmaag>
Component: kern
Assignee: Konstantin Belousov <kib>
Status: Closed FIXED
Severity: Affects Only Me
CC: avg, kib, markj
Priority: ---
Keywords: regression
Version: CURRENT
Hardware: amd64
OS: Any
Attachments:
  acpidump (flags: none)
  Filter out non-pci devices in dmar_find(). (flags: none)

Description Wes Maag 2019-07-11 14:08:10 UTC
Created attachment 205693
acpidump

It seems as though my computer is tripping the KASSERTs introduced in commit 349571.

A working kernel prints the following errors:

ivhd0: <AMD-Vi/IOMMU ivhd with EFR> on acpi0
ivhd0: Flag:b0<IotlbSup,Coherent>
ivhd0: Features(type:0x11) MsiNumPPR = 0 PNBanks= 2 PNCounters= 0
ivhd0: Extended features[31:0]:22294ada<PPRSup,NXSup,GTSup,IASup> HATS = 0x2 GATS = 0x0 GLXSup = 0x1 SmiFSup = 0x1 SmiFRC = 0x2 GAMSup = 0x1 DualPortLogSup = 0x2 DualEventLogSup = 0x2
ivhd0: Extended features[62:32]:f77ef<USSup> Max PASID: 0x2f DevTblSegSup = 0x3 MarcSup = 0x1
ivhd0: supported paging level:7, will use only: 4
ivhd0: device range: 0x0 - 0xffff
ivhd0: PCI cap 0x190b640f@0x40 feature:19<IOTLB,EFR,CapExt>
ivhd0: failed to read ivar PCI_IVAR_DOMAIN on bus acpi0, error = 2
acpi0: failed to read ivar PCIB_IVAR_BUS on bus nexus0, error = 6
ivhd0: failed to read ivar PCI_IVAR_DOMAIN on bus acpi0, error = 2
acpi0: failed to read ivar PCIB_IVAR_BUS on bus nexus0, error = 6

The panic on the failing kernel:

panic: pci_get_domain failed for ivhd0 on bus acpi0, error = 2
cpuid = 2
time = 1
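
The difference between the two kernels can be pictured with a small user-space model. This is only a sketch of the behaviour reported above, not the kernel's accessor code: `struct bus`, `model_read_ivar()` and `model_get_domain()` are hypothetical names, and the "strict" mode returns -1 where the failing kernel would panic. Error 2 in the messages above is ENOENT.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a bus; not the real newbus types. */
struct bus {
	const char *name;
	int knows_pci_ivars;	/* nonzero if the bus answers PCI ivar reads */
};

/* Model of a bus read_ivar method: ENOENT when the ivar is unknown. */
int
model_read_ivar(const struct bus *bus, uintptr_t *result)
{
	assert(bus != NULL && result != NULL);
	if (!bus->knows_pci_ivars)
		return (ENOENT);
	*result = 0;
	return (0);
}

/*
 * Model accessor.  With strict == 0 a failed read is only logged, as on
 * the working kernel.  With strict != 0 the failure is treated as fatal;
 * this model returns -1 where the failing kernel panics.
 */
int
model_get_domain(const struct bus *bus, const char *child, int strict,
    uintptr_t *domain)
{
	int error;

	error = model_read_ivar(bus, domain);
	if (error == 0)
		return (0);
	if (strict) {
		fprintf(stderr,
		    "panic: pci_get_domain failed for %s on bus %s, error = %d\n",
		    child, bus->name, error);
		return (-1);
	}
	fprintf(stderr,
	    "%s: failed to read ivar PCI_IVAR_DOMAIN on bus %s, error = %d\n",
	    child, bus->name, error);
	*domain = 0;
	return (error);
}
```

Calling `model_get_domain()` for a child of a bus with `knows_pci_ivars == 0` returns ENOENT in the lenient mode and -1 in the strict mode, which mirrors the two outcomes described in this report.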
Comment 1 Andriy Gapon freebsd_committer freebsd_triage 2019-07-11 22:23:34 UTC
(In reply to Wes Maag from comment #0)
Do you have a stack trace of the panic?
I think that it should have been printed as well.
Comment 2 Wes Maag 2019-07-12 01:21:43 UTC
I knew I was forgetting something :)

transcribing from a photo I took:

db_trace_self_wrapper() at db_trace_self_wrapper+0x2b
vpanic() at vpanic+0x19d
panic() at panic+0x43
dmar_find() at dmar_find+0x4e6
iommu_alloc_msi_intr() at iommu_alloc_msi_intr+0x5c
msi_alloc() at msi_alloc+0x1d4
amdvi_setup_hw() at amdvi_setup_hw+0x360
ivhd_attach() at ivhd_attach+0x5a8

photo can be found here:

https://www.dropbox.com/s/ups01ephbauetzw/kernel-panic-backtrace.jpg?dl=0
Comment 3 Andriy Gapon freebsd_committer freebsd_triage 2019-07-12 05:49:25 UTC
Kostik,
could you please take a look at this?
The problem seems to be that dmar_find() assumes that the device must be on a PCI bus, but not all devices that can request an MSI actually are.
ivhd in this case is an "artificial" device on the ACPI bus.  It is created (via device_identify) based on a special ACPI table, and there is no corresponding device in the ACPI namespace.
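
To make the assumption concrete, here is a minimal user-space sketch of a lookup that first checks whether the device sits directly under a PCI bus. It is not the kernel code or the eventual patch: `struct device`, `device_is_pci_child()` and `find_remap_unit()` are hypothetical names, and the device tree is simplified (a real tree has a PCI bridge between acpi0 and the PCI bus).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified device node; not FreeBSD's device_t. */
struct device {
	const char *name;		/* e.g. "ivhd0" */
	const char *bus_class;		/* class when acting as a bus, or NULL */
	const struct device *parent;	/* NULL for the root of the tree */
};

/* Returns 1 if dev is a direct child of a bus of class "pci", else 0. */
int
device_is_pci_child(const struct device *dev)
{
	assert(dev != NULL);
	if (dev->parent == NULL || dev->parent->bus_class == NULL)
		return (0);
	return (strcmp(dev->parent->bus_class, "pci") == 0);
}

/*
 * Hypothetical lookup.  Returns -1 for devices that are not on a PCI
 * bus instead of asking their parent for PCI-only properties.
 */
int
find_remap_unit(const struct device *dev)
{
	if (!device_is_pci_child(dev))
		return (-1);
	/* A real lookup would read the PCI domain/bus/slot/function here. */
	return (0);
}
```

With a tree in which ivhd0 hangs off acpi0, `find_remap_unit()` returns -1 for ivhd0 without querying any PCI property, while a device whose parent has class "pci" proceeds to the lookup.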
Comment 4 Konstantin Belousov freebsd_committer freebsd_triage 2019-07-12 08:59:49 UTC
(In reply to Andriy Gapon from comment #3)
This is an AMD machine, right?  Why is dmar_find() called at all?

It can only happen if the user manually enabled DMAR DMA or IR, which is nonsense on an AMD machine.
Comment 5 Wes Maag 2019-07-14 16:44:10 UTC
(In reply to Konstantin Belousov from comment #4)

I guess I'm not 100% sure how I would have enabled it...

This is a plain GENERIC kernel, and there is nothing regarding busdma or dmar (assuming those are relevant) in my loader or sysctl.conf.

Any ideas on where I should look, or anything else I should provide?
Comment 6 Konstantin Belousov freebsd_committer freebsd_triage 2019-07-14 18:39:35 UTC
Created attachment 205772
Filter out non-pci devices in dmar_find().

Try this patch.  I used the opportunity to fix a hole in dmar_find().  An additional fix (which would prevent testing this change) would eliminate execution of the body of dmar_find() altogether.
Comment 7 Wes Maag 2019-07-14 20:40:10 UTC
(In reply to Konstantin Belousov from comment #6)

Thanks, that patch worked for me.
Comment 8 commit-hook freebsd_committer freebsd_triage 2019-07-14 21:09:39 UTC
A commit references this bug:

Author: kib
Date: Sun Jul 14 21:08:54 UTC 2019
New revision: 349988
URL: https://svnweb.freebsd.org/changeset/base/349988

Log:
  PR:	239143
  Reported and tested by:	Wes Maag <jwmaag@gmail.com>
  Sponsored by:	The FreeBSD Foundation
  MFC after:	1 week

Changes:
  head/sys/x86/iommu/intel_drv.c
Comment 9 commit-hook freebsd_committer freebsd_triage 2019-07-21 08:28:32 UTC
A commit references this bug:

Author: kib
Date: Sun Jul 21 08:28:29 UTC 2019
New revision: 350192
URL: https://svnweb.freebsd.org/changeset/base/350192

Log:
  MFC r349988:
  In dmar_find(), refuse to search for DMAR unit for non-PCI device.

  PR:	239143

Changes:
_U  stable/12/
  stable/12/sys/x86/iommu/intel_drv.c