I'm on FreeBSD 13.1-RELEASE-p5 on amd64.

My kernel configuration is:

include GENERIC
options DDB # required to enable dumps
options KDB_UNATTENDED # required to enable dumps
options INVARIANTS
options INVARIANT_SUPPORT
nodevice em
nodevice ixl
nodevice iavf
nodevice ice
nodevice ix

I have two Dell R750, one with Intel X722 adapter and another with Intel X710 (both use ixl driver on FreeBSD). After unloading the driver, I'm trying to passthrough them to the VM.

/boot/loader.conf:

vmm_load="YES"
pptdevs="179/0/0"

pciconf -lv:

ppt0@pci0:179:0:0: class=0x020000 rev=0x04 hdr=0x00 vendor=0x8086 device=0x37d0 subvendor=0x8086 subdevice=0x0002
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Connection X722 for 10GbE SFP+'
    class      = network
    subclass   = ethernet

I can start VM just fine (with FreeBSD 12.4-RELEASE), however, when I try starting with the passthrough, I get the following panic on the host:

panic: vtd_add_device: device 0 is not in scope for any DMA remapping unit

Backtrace is:

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=textdump@entry=0) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xffffffff804b131a in db_dump (dummy=<optimized out>, dummy2=<unavailable>, dummy3=<unavailable>, dummy4=<unavailable>) at /usr/src/sys/ddb/db_command.c:575
#3  0xffffffff804b11d2 in db_command (last_cmdp=<optimized out>, cmd_table=<optimized out>, dopager=dopager@entry=1) at /usr/src/sys/ddb/db_command.c:482
#4  0xffffffff804b0e2d in db_command_loop () at /usr/src/sys/ddb/db_command.c:535
#5  0xffffffff804b42a6 in db_trap (type=<optimized out>, code=<optimized out>) at /usr/src/sys/ddb/db_main.c:270
#6  0xffffffff80c20c56 in kdb_trap (type=type@entry=3, code=code@entry=0, tf=tf@entry=0xfffffe019a333820) at /usr/src/sys/kern/subr_kdb.c:733
#7  0xffffffff810997b9 in trap (frame=0xfffffe019a333820) at /usr/src/sys/amd64/amd64/trap.c:607
#8  <signal handler called>
#9  kdb_enter (why=0xffffffff8122185d "panic", msg=<optimized out>) at /usr/src/sys/kern/subr_kdb.c:506
#10 0xffffffff80bd37a0 in vpanic (fmt=0xffffffff8215136a "vtd_add_device: device %x is not in scope for any DMA remapping unit", ap=ap@entry=0xfffffe019a333980) at /usr/src/sys/kern/kern_shutdown.c:908
#11 0xffffffff80bd3533 in panic (fmt=0xffffffff8190aa70 <lock_class_mtx_spin> "\310\320\022\201\377\377\377\377\n") at /usr/src/sys/kern/kern_shutdown.c:844
#12 0xffffffff82147bc9 in vtd_add_device (arg=<optimized out>, rid=0) at /usr/src/sys/amd64/vmm/intel/vtd.c:461
#13 0xffffffff821350fe in IOMMU_ADD_DEVICE (domain=<optimized out>, rid=128) at /usr/src/sys/amd64/vmm/io/iommu.c:125
#14 iommu_add_device (dom=<optimized out>, rid=128) at /usr/src/sys/amd64/vmm/io/iommu.c:327
#15 iommu_init () at /usr/src/sys/amd64/vmm/io/iommu.c:238
#16 iommu_create_domain (maxaddr=268435456) at /usr/src/sys/amd64/vmm/io/iommu.c:271
#17 0xffffffff8212a927 in vm_assign_pptdev (vm=0xfffffe006e7eb000, bus=177, slot=0, func=0) at /usr/src/sys/amd64/vmm/vmm.c:988
#18 0xffffffff8212f1aa in vmmdev_ioctl (cdev=<optimized out>, cmd=2148300328, data=0xfffffe019a333d50 "\261", fflag=<optimized out>, td=<optimized out>) at /usr/src/sys/amd64/vmm/vmm_dev.c:545
#19 0xffffffff80a6a8cc in devfs_ioctl (ap=0xfffffe019a333ba8) at /usr/src/sys/fs/devfs/devfs_vnops.c:944
#20 0xffffffff80cd0041 in vn_ioctl (fp=0xfffff8000b47e2d0, com=<optimized out>, data=0xfffffe019a333d50, active_cred=0xfffff8000bbf6d00, td=0x0) at /usr/src/sys/kern/vfs_vnops.c:1696
#21 0xffffffff80a6afae in devfs_ioctl_f (fp=0xffffffff8190aa70 <lock_class_mtx_spin>, com=128, data=0xffffffff811ed308, cred=0x1, td=0xfffffe01652bf720) at /usr/src/sys/fs/devfs/devfs_vnops.c:875
#22 0xffffffff80c44242 in fo_ioctl (fp=<optimized out>, com=2148300328, data=0x20, active_cred=0x1, td=0xfffffe01652bf720) at /usr/src/sys/sys/file.h:361
#23 kern_ioctl (td=<optimized out>, td@entry=0xfffffe01652bf720, fd=<optimized out>, com=com@entry=2148300328, data=0x20 <error: Cannot access memory at address 0x20>, data@entry=0xfffffe019a333d50 "\261") at /usr/src/sys/kern/sys_generic.c:803
#24 0xffffffff80c43f96 in sys_ioctl (td=0xfffffe01652bf720, uap=0xfffffe01652bfb08) at /usr/src/sys/kern/sys_generic.c:711
#25 0xffffffff8109a53e in syscallenter (td=<optimized out>) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:189
#26 amd64_syscall (td=0xfffffe01652bf720, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1185
#27 <signal handler called>
#28 0x0000000801619c2a in ?? ()
Dell R750 would mean that the processor(s) is/are Ice-Lake SP, right?
(In reply to Eric Joyner from comment #1) The server has dual Intel(R) Xeon(R) Silver 4310 CPU @ 2.10GHz, which, per https://www.intel.com/content/www/us/en/products/sku/215277/intel-xeon-silver-4310-processor-18m-cache-2-10-ghz/specifications.html is Ice Lake. In Wikipedia (https://en.wikipedia.org/wiki/List_of_Intel_Xeon_processors_(Ice_Lake-based)#Xeon_Silver_4310) it's listed as "Ice Lake-SP" (10 nm) Scalable Performance
Ok; I'm seeing the same problem on the dual-processor Ice Lake-SP system that I have, and I think I've found a temporary fix: in /usr/src/sys/amd64/vmm/intel/vtd.c, change DRHD_MAX_UNITS from 8 to 10.

To check whether your system has more DRHDs than FreeBSD will look at, run "acpidump -dt" and count the DRHD sections under the DMAR section. My system has exactly 10. Since FreeBSD stopped after 8, it skipped two of them, and the device scopes they describe were never stored. When FreeBSD then tried to add every device in the system during iommu initialization, some devices had no stored scope, so the code called panic().
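The failure mode described above can be sketched with a small toy model. This is not the code in vtd.c: the struct layout, the toy_* names, TOY_CAPACITY, and the idea of describing a scope as a plain bus range are all assumptions made for illustration. It only shows how a fixed cap on parsed units can leave some devices without a matching scope.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CAPACITY 32	/* storage size of this toy only, not a kernel value */

/* Hypothetical stand-in for one DRHD entry: the PCI bus range it covers. */
struct toy_drhd {
	int first_bus;
	int last_bus;
};

/* Hypothetical stand-in for the units the driver decided to keep. */
struct toy_vtd {
	size_t nunits;
	struct toy_drhd units[TOY_CAPACITY];
};

/*
 * Keep at most max_units entries from the firmware table and silently
 * ignore the rest. Returns how many entries were dropped.
 */
static size_t
toy_parse(struct toy_vtd *v, const struct toy_drhd *table, size_t n,
    size_t max_units)
{
	size_t i;

	assert(v != NULL && table != NULL);
	if (max_units > TOY_CAPACITY)
		max_units = TOY_CAPACITY;
	v->nunits = 0;
	for (i = 0; i < n && i < max_units; i++)
		v->units[v->nunits++] = table[i];
	return (n - v->nunits);
}

/* Index of the stored unit whose scope covers bus, or -1 if none does. */
static int
toy_find_unit(const struct toy_vtd *v, int bus)
{
	size_t i;

	for (i = 0; i < v->nunits; i++) {
		if (bus >= v->units[i].first_bus &&
		    bus <= v->units[i].last_bus)
			return ((int)i);
	}
	return (-1);
}
```

With a ten-entry table and a cap of 8, toy_parse() reports two dropped entries and toy_find_unit() returns -1 for any bus covered only by the last two entries. In this toy that is just a return value; the panic in the backtrace is what the real driver does in the equivalent situation. Raising the cap to 10 or more makes the same lookup succeed.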
OK, will test tomorrow.
That works fine. Is it a valid workaround to be committed to main and MFC to stable/13? It would be nice if 13.2-RELEASE could have it without patching.
(In reply to Piotr Kubaj from comment #5) I think I'd want feedback from someone who might know more about how safe changing that number would be (jhb? kib?), but maybe a safe thing to do would be to just double the number to 16. Ideally we'd try this out on a Sapphire Rapids platform, too, to see if even 16 is enough for those.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=53545967642d850eee4f2dd9fa27cae52ae981b9

commit 53545967642d850eee4f2dd9fa27cae52ae981b9
Author:     Eric Joyner <erj@FreeBSD.org>
AuthorDate: 2023-01-30 21:34:03 +0000
Commit:     Eric Joyner <erj@FreeBSD.org>
CommitDate: 2023-01-31 21:57:42 +0000

    vtd: Increase DRHD_MAX_UNITS

    Observed on a couple Ice Lake-SP platforms (Intel Coyote Pass, Dell
    R750), there are more than 8 DRHD sections enumerated in the DMAR ACPI
    section. Since the previous limit was 8, this resulted in some of these
    not being parsed by vtd when the iommu is initialized; in this case when
    PCI devices are being passthru'd to a bhyve VM. This omission later
    causes a kernel panic later in initialization when devices could not be
    found in a valid DRHD scope because the DHRD containing the device's
    scope was not added to vtd.

    Signed-off-by: Eric Joyner <erj@FreeBSD.org>

    PR:             268486
    Sponsored by:   Intel Corporation
    Reviewed by:    rew@, corvink@
    MFC after:      1 day
    Differential Revision:  https://reviews.freebsd.org/D38285

 sys/amd64/vmm/intel/vtd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=7aaa7dad32ca350e972d7c4688e39c50d818b45b

commit 7aaa7dad32ca350e972d7c4688e39c50d818b45b
Author:     Eric Joyner <erj@FreeBSD.org>
AuthorDate: 2023-01-30 21:34:03 +0000
Commit:     Eric Joyner <erj@FreeBSD.org>
CommitDate: 2023-02-06 22:48:19 +0000

    vtd: Increase DRHD_MAX_UNITS

    Observed on a couple Ice Lake-SP platforms (Intel Coyote Pass, Dell
    R750), there are more than 8 DRHD sections enumerated in the DMAR ACPI
    section. Since the previous limit was 8, this resulted in some of these
    not being parsed by vtd when the iommu is initialized; in this case when
    PCI devices are being passthru'd to a bhyve VM. This omission later
    causes a kernel panic later in initialization when devices could not be
    found in a valid DRHD scope because the DHRD containing the device's
    scope was not added to vtd.

    Signed-off-by: Eric Joyner <erj@FreeBSD.org>

    PR:             268486
    Sponsored by:   Intel Corporation
    Reviewed by:    rew@, corvink@
    Differential Revision:  https://reviews.freebsd.org/D38285

    (cherry picked from commit 53545967642d850eee4f2dd9fa27cae52ae981b9)

 sys/amd64/vmm/intel/vtd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=b4e0be6df772a3bd5f235e2a840ff4fdbe57e2d1

commit b4e0be6df772a3bd5f235e2a840ff4fdbe57e2d1
Author:     Eric Joyner <erj@FreeBSD.org>
AuthorDate: 2023-01-30 21:34:03 +0000
Commit:     Eric Joyner <erj@FreeBSD.org>
CommitDate: 2023-02-06 22:52:10 +0000

    vtd: Increase DRHD_MAX_UNITS

    Observed on a couple Ice Lake-SP platforms (Intel Coyote Pass, Dell
    R750), there are more than 8 DRHD sections enumerated in the DMAR ACPI
    section. Since the previous limit was 8, this resulted in some of these
    not being parsed by vtd when the iommu is initialized; in this case when
    PCI devices are being passthru'd to a bhyve VM. This omission later
    causes a kernel panic later in initialization when devices could not be
    found in a valid DRHD scope because the DHRD containing the device's
    scope was not added to vtd.

    Signed-off-by: Eric Joyner <erj@FreeBSD.org>

    PR:             268486
    Sponsored by:   Intel Corporation
    Reviewed by:    rew@, corvink@
    Differential Revision:  https://reviews.freebsd.org/D38285

    (cherry picked from commit 53545967642d850eee4f2dd9fa27cae52ae981b9)

 sys/amd64/vmm/intel/vtd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)