Bug 229852

Summary: bhyve: IOMMU (Intel VTd) PCI passthrough attempt locks up some systems
Product: Base System
Reporter: Callum <callum>
Component: kern
Assignee: Scott Long <scottl>
Status: Closed FIXED
Severity: Affects Some People
CC: allanjude, araujo, b, callum, dexter, emaste, felix, js, khng, kmachine, me, mgrooms, niels=freebsd, rgrimes, scottl, virtualization, yuripv
Priority: ---
Keywords: IntelNetworking
Version: 11.2-RELEASE
Flags: koobs: mfc-stable12+
Hardware: amd64
OS: Any
URL: https://reviews.freebsd.org/D19001
See Also: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246647
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211062
Bug Depends on:    
Bug Blocks: 246647    
Attachments:
Description: Patch for VT-d capability detection on chipsets that have multiple translation units with differing capabilities
Flags: none

Description Callum 2018-07-18 02:02:21 UTC
Created attachment 195225 [details]
Patch for VT-d capability detection on chipsets that have multiple translation units with differing capabilities

When an attempt is made to pass a PCI device through to a bhyve VM (which initialises the IOMMU) on certain Intel chipsets using VT-d, the PCI bus stops working entirely. This issue occurs on the E3-1275 v5 processor on the C236 chipset and has also been encountered by others on the forums with different hardware in the Skylake series.

The chipset has two VT-d translation units. The issue is caused by an attempt to use the VT-d device-IOTLB capability, which only the first unit supports, for devices attached to the second unit, which lacks it. Only the capabilities of the first unit are checked, and they are assumed to be the same for all units.

Attached is a patch to rectify this issue by determining which unit is responsible for the device being added to a domain and then checking that unit's device-IOTLB capability. In addition to this a few fixes have been made to other instances where the first unit's capabilities are assumed for all units for domains they share. In these cases a mutual set of capabilities is determined. The patch should hopefully fix any bugs for current/future hardware with multiple translation units supporting different capabilities.
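
For readers who want a picture of the approach without opening the attachment, here is a minimal, self-contained C sketch of the two ideas described above: look up the translation unit responsible for a device before testing its device-IOTLB bit, and intersect capabilities for anything shared by all units. This is not the code from the patch. The types and names (`struct xlate_unit`, `unit_for_device`, `device_iotlb_ok`, `common_ext_caps`) are hypothetical, and the position of the device-IOTLB bit (bit 2 of the extended capability register) is an assumption taken from a reading of the VT-d specification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed position of the device-IOTLB support bit in the extended
 * capability register; check the VT-d specification before relying on it. */
#define ECAP_DEV_IOTLB (1u << 2)

/* Hypothetical PCI address of a device. */
struct dev_id {
	uint8_t bus, slot, func;
};

/* Hypothetical, simplified view of one translation unit. */
struct xlate_unit {
	uint32_t ext_cap;            /* extended capability bits of this unit */
	const struct dev_id *scope;  /* devices explicitly handled by this unit */
	size_t nscope;
	bool include_all;            /* catch-all unit for unlisted devices */
};

/* Return the unit responsible for device d, or NULL if there is none. */
static const struct xlate_unit *
unit_for_device(const struct xlate_unit *units, size_t n, struct dev_id d)
{
	const struct xlate_unit *fallback = NULL;

	for (size_t i = 0; i < n; i++) {
		if (units[i].include_all) {
			fallback = &units[i];
			continue;
		}
		for (size_t j = 0; j < units[i].nscope; j++) {
			const struct dev_id *s = &units[i].scope[j];

			if (s->bus == d.bus && s->slot == d.slot &&
			    s->func == d.func)
				return (&units[i]);
		}
	}
	return (fallback);
}

/* Check the device-IOTLB bit of the unit that owns d, not of unit 0. */
static bool
device_iotlb_ok(const struct xlate_unit *units, size_t n, struct dev_id d)
{
	const struct xlate_unit *u = unit_for_device(units, n, d);

	return (u != NULL && (u->ext_cap & ECAP_DEV_IOTLB) != 0);
}

/* Capabilities that every unit supports (bitwise intersection). */
static uint32_t
common_ext_caps(const struct xlate_unit *units, size_t n)
{
	uint32_t caps = (n > 0) ? ~0u : 0;

	for (size_t i = 0; i < n; i++)
		caps &= units[i].ext_cap;
	return (caps);
}
```

With two units where only the first advertises the device-IOTLB bit, `device_iotlb_ok()` answers per device, while `common_ext_caps()` drops the bit for anything that has to work across both units.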

A description is on the forums at https://forums.freebsd.org/threads/pci-passthrough-bhyve-usb-xhci.65235
The thread includes observations by other users of the bug occurring, as well as a description and confirmation of the fix. I'd also like to thank Ordoban for their help.

The attached patch applies to 11.2-RELEASE and the current 11-STABLE. It will also apply to 12.0-CURRENT since the only difference in source at present is an extra 2 lines of licensing comment. Although I have personally only tested the patch on 11.2-RELEASE there's no reason results should differ on 12.0-CURRENT.
Comment 1 Niels Bakker 2018-09-21 16:42:41 UTC
I ran into this issue (on an Intel Celeron 3865U) with the exact symptoms described in the linked thread, and the patch resolved it for me as well.

I had tried the other workaround before - only pick devices on another PCI bus and IRQ line for passthru - but that did not help. Without this patch, any attempt to use a passthru device immediately crashes the whole computer by rendering all PCI devices like AHCI and USB controllers absent.
Comment 2 t_uemura 2018-10-09 12:25:47 UTC
I had the same issue on my Shuttle DS77U mini-PC (Intel Celeron 3865U;
Sunrise Point-LP chipset) and the patch fixes the issue perfectly. Both
my host and guest run 11.2-STABLE as of 28th Sep.

Someone please make sure there's no side effect and commit/MFC.
Comment 3 Felix Hanley 2019-01-20 06:22:52 UTC
The attached patch fixes the hanging system for me running 12.0-RELEASE-p2 on a Kaby Lake series i7-8550U.
Comment 4 js 2019-01-24 00:27:29 UTC
Patch works great on 12.0 with Skylake i7-6820HQ.  Please commit and MFC.
Comment 5 Marcelo Araujo freebsd_committer freebsd_triage 2019-01-24 04:25:32 UTC
Thanks for the patch!!!

Could you guys share with me how you tested it? As an example:
1) bhyve command line
2) CPU Type
3) Guest OS USED
4) Device used via passthrough

Best,
Comment 6 Niels Bakker 2019-01-24 13:45:10 UTC
(In reply to Marcelo Araujo from comment #5)

> 1) bhyve command line
I'm not sure tbh - created and started it via vm-bhyve and it rewrites its cmdline.
Its config file contains these lines, plus others that deal with storage and vnet:
---
loader="bhyveload"
cpu=2
memory=4G
passthru0="0/31/6"
bhyve_options="-S"
---

> 2) CPU Type
CPU: Intel(R) Celeron(R) CPU 3865U @ 1.80GHz (1800.08-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x806e9  Family=0x6  Model=0x8e  Stepping=9

This is a Kaby Lake CPU (same class as 7th gen Core) from 2017.

> 3) Guest OS USED
guest# uname -srv
FreeBSD 11.2-RELEASE-p7 FreeBSD 11.2-RELEASE-p7 #0: Tue Dec 18 08:29:33 UTC 2018     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC 

> 4) Device used via passthrough
---
host# grep ^ppt /boot/loader.conf
pptdevs="0/31/6 2/0/0"

host# pciconf -lv ppt1@pci0:0:31:6
ppt1@pci0:0:31:6:	class=0x020000 card=0x00008086 chip=0x156f8086 rev=0x21 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Connection I219-LM'
    class      = network
    subclass   = ethernet
---
guest# pciconf -lv em0
em0@pci0:0:6:0:	class=0x020000 card=0x00008086 chip=0x156f8086 rev=0x21 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Connection I219-LM'
    class      = network
    subclass   = ethernet
---
(The second device, some WiFi chipset, isn't passed through to any VM, and there is no FreeBSD driver for it anyway)

As said, without the patch the system dies an immediate death as soon as the bhyve with passthrough is started.
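
The `pptdevs` value shown above is a space-separated list of bus/slot/function selectors. As an illustration of that format only, here is a small C sketch that splits such a string and range-checks each field. The helper names (`parse_bsf`, `parse_pptdevs`) are hypothetical and this is not how the kernel or bhyve parses the tunable; the limits used (bus up to 255, slot up to 31, function up to 7) are the usual PCI ranges.

```c
#include <stdio.h>

/* One bus/slot/function selector, e.g. "0/31/6". */
struct bsf {
	unsigned bus, slot, func;
};

/*
 * Parse a single "bus/slot/func" selector at the start of s.
 * Returns the number of characters consumed, or -1 on error.
 */
static int
parse_bsf(const char *s, struct bsf *out)
{
	unsigned b, sl, f;
	int used = 0;

	if (sscanf(s, "%u/%u/%u%n", &b, &sl, &f, &used) != 3)
		return (-1);
	/* The selector must end at a space or at the end of the string. */
	if (s[used] != '\0' && s[used] != ' ')
		return (-1);
	if (b > 255 || sl > 31 || f > 7)
		return (-1);
	out->bus = b;
	out->slot = sl;
	out->func = f;
	return (used);
}

/*
 * Parse a space-separated list such as "0/31/6 2/0/0".
 * Returns the number of selectors stored in out (at most max);
 * parsing stops at the first malformed entry.
 */
static size_t
parse_pptdevs(const char *list, struct bsf *out, size_t max)
{
	size_t n = 0;

	while (n < max) {
		int used;

		while (*list == ' ')
			list++;
		if (*list == '\0')
			break;
		used = parse_bsf(list, &out[n]);
		if (used < 0)
			break;
		list += used;
		n++;
	}
	return (n);
}
```

Applied to the string from this comment, `parse_pptdevs("0/31/6 2/0/0", devs, 4)` yields two entries, the first being the I219-LM at bus 0, slot 31, function 6.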
Comment 7 Rodney W. Grimes freebsd_committer freebsd_triage 2019-01-24 21:14:24 UTC
The patch has some formatting-only changes that should be reduced, but looks ok overall. I would also like to hear some test results on systems that are NOT having this issue, to ensure it does not break anything there. I brought this review up in the bhyve every-other-week meeting to get some more eyes on it.
Comment 8 Marcelo Araujo freebsd_committer freebsd_triage 2019-01-24 23:48:11 UTC
Sorry, I'm gonna put this bug report back to the pool, I'm sure Rodney will check it soon.
Comment 9 Rodney W. Grimes freebsd_committer freebsd_triage 2019-01-25 00:37:02 UTC
(In reply to Callum from comment #0)
Do you have a phabricator account on reviews.freebsd.org?  If so, would you put your patch up in a review over there?  If not, would you be willing either to set one up, or to have me copy your patch to a review so we can move forward with fixing this issue?
Comment 10 Callum 2019-01-28 12:02:44 UTC
(In reply to Marcelo Araujo from comment #5)

> 1) bhyve command line
bhyve -AHP -S -u -c 4 -p 0:6 -p 1:7 -p 2:4 -p 3:5 -m 2G \
-s 0:0,hostbridge \
-s 1:0,lpc \
-s 2:0,virtio-blk,/dev/zvol/zroot/bhyve/tv \
-s 4:0,virtio-net,tap8 \
-s 5:0,virtio-net,tap9 \
-s 8:0,passthru,4/0/0 \
-s 9:0,passthru,5/0/0 \
-s 10:0,passthru,6/0/0 \
-s 11:0,passthru,7/0/0 \
-l com1,/dev/nmdm0A \
-l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd \
tv

> 2) CPU Type
E3-1275 v5

> 3) Guest OS USED
OpenSUSE Leap 15.0

> 4) Device used via passthrough
4x
class=0x0c0330 card=0x00151912 chip=0x00151912 rev=0x02 hdr=0x00
    vendor     = 'Renesas Technology Corp.'
    device     = 'uPD720202 USB 3.0 Host Controller'
    class      = serial bus
    subclass   = USB
Comment 11 Callum 2019-01-28 12:07:44 UTC
(In reply to Rodney W. Grimes from comment #9)

Submitted for review - D19001 (https://reviews.freebsd.org/D19001)
Comment 12 Niels Bakker 2019-01-28 16:31:21 UTC
Tested the patch on an i5-4690K with no immediate adverse effects.
Comment 13 Rodney W. Grimes freebsd_committer freebsd_triage 2019-01-28 17:32:51 UTC
(In reply to Niels Bakker from comment #12)
Are you passing through any devices?
Comment 14 Niels Bakker 2019-01-28 23:09:42 UTC
Yes, otherwise it wouldn't be a real test, would it? :-)

Specifically, I passed through an audio device which was recognised in the guest, both in the stock 12.0 kernel and in one with the patch attached to this PR applied on the host.
Comment 15 Brandon Martin 2019-06-04 03:09:47 UTC
Thank you very much for this patch, it saved me ages of head scratching!

Just wanted to add another confirmed working report, with a rather nonstandard pfSense host and Ubuntu guest. Hopefully this can get merged soon.

host:
FreeBSD 11.2-RELEASE-p10 FreeBSD 11.2-RELEASE-p10 #9 4a2bfdce133(RELENG_2_4_4): Wed May 15 18:54:42 EDT 2019     root@buildbot1-nyi.netgate.com:/build/ce-crossbuild-244/obj/amd64/ZfGpH5cd/build/ce-crossbuild-244/pfSense/tmp/FreeBSD-src/sys/pfSense

kmod build host:
FreeBSD 11.2-RELEASE FreeBSD 11.2-RELEASE #0 r335510: Fri Jun 22 04:32:14 UTC 2018     root@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC

root@malibu:/usr/src # svn info
Path: .
Working Copy Root Path: /usr/src
URL: https://svn.freebsd.org/base/releng/11.2
Relative URL: ^/releng/11.2
Repository Root: https://svn.freebsd.org/base
Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
Revision: 348521
Node Kind: directory
Schedule: normal
Last Changed Author: gordon
Last Changed Rev: 347597
Last Changed Date: 2019-05-14 16:22:30 -0700 (Tue, 14 May 2019)

root@malibu:/usr/src # svn status
?       sys/amd64/vmm/intel/vmm.patch
M       sys/amd64/vmm/intel/vtd.c
?       sys/amd64/vmm/intel/vtd.c.orig

guest:
Linux 4.15.0-45-generic #48-Ubuntu SMP Tue Jan 29 16:28:13 UTC 2019

1) bhyve command line
bhyve -A -H -P -S -s 0:0,hostbridge -s 1:0,lpc -s 2:0,virtio-net,tap1 -s 3:0,virtio-blk,/opt/vm/img/homer.img -l com1,stdio -c 1 -s 7,passthru,0/20/0 -m 1024M homer

2) CPU Type
CPU: Intel(R) Core(TM) i3-7100U CPU @ 2.40GHz (2400.11-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x806e9  Family=0x6  Model=0x8e  Stepping=9
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x7ffafbbf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,SDBG,FMA,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x121<LAHF,ABM,Prefetch>
  Structured Extended Features=0x29c67af<FSGSBASE,TSCADJ,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,NFPUSG,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PROCTRACE>
  Structured Extended Features3=0x9c002400<IBPB,STIBP,SSBD>
  XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics

3) Guest OS USED :(
b@homer:~$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.2 LTS"

4) Device used via passthrough
host# grep ^ppt /boot/loader.conf
pptdevs="0/20/0"
host# pciconf -lv ppt0@
ppt0@pci0:0:20:0:       class=0x0c0330 card=0x72708086 chip=0x9d2f8086 rev=0x21 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Sunrise Point-LP USB 3.0 xHCI Controller'
    class      = serial bus
    subclass   = USB

b@homer:~$ lspci -vv -s 00:07.0
00:07.0 USB controller: Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller (rev 21) (prog-if 30 [XHCI])
        Subsystem: Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 32
        Region 0: Memory at c0010000 (64-bit, prefetchable) [size=64K]
        Capabilities: <access denied>
        Kernel driver in use: xhci_hcd
Comment 16 commit-hook freebsd_committer freebsd_triage 2019-06-19 06:41:44 UTC
A commit references this bug:

Author: scottl
Date: Wed Jun 19 06:41:07 UTC 2019
New revision: 349184
URL: https://svnweb.freebsd.org/changeset/base/349184

Log:
  Implement VT-d capability detection on chipsets that have multiple
  translation units with differing capabilities

  From the author via Bugzilla:
  ---
  When an attempt is made to passthrough a PCI device to a bhyve VM
  (causing initialisation of IOMMU) on certain Intel chipsets using
  VT-d the PCI bus stops working entirely. This issue occurs on the
  E3-1275 v5 processor on C236 chipset and has also been encountered
  by others on the forums with different hardware in the Skylake
  series.

  The chipset has two VT-d translation units. The issue is caused by
  an attempt to use the VT-d device-IOTLB capability that is
  supported by only the first unit for devices attached to the
  second unit which lacks that capability. Only the capabilities of
  the first unit are checked and are assumed to be the same for all
  units.

  Attached is a patch to rectify this issue by determining which
  unit is responsible for the device being added to a domain and
  then checking that unit's device-IOTLB capability. In addition to
  this a few fixes have been made to other instances where the first
  unit's capabilities are assumed for all units for domains they
  share. In these cases a mutual set of capabilities is determined.
  The patch should hopefully fix any bugs for current/future
  hardware with multiple translation units supporting different
  capabilities.

  A description is on the forums at
  https://forums.freebsd.org/threads/pci-passthrough-bhyve-usb-xhci.65235
  The thread includes observations by other users of the bug
  occurring, and description as well as confirmation of the fix.
  I'd also like to thank Ordoban for their help.

  ---
  Personally tested on a Skylake laptop, Skylake Xeon server, and
  a Xeon-D-1541, passing through XHCI and NVMe functions.  Passthru
  is hit-or-miss to the point of being unusable without this
  patch.

  PR: 229852
  Submitted by: callum@aitchison.org
  MFC after: 1 week

Changes:
  head/sys/amd64/vmm/intel/vtd.c
Comment 17 Scott Long freebsd_committer freebsd_triage 2019-06-19 06:45:31 UTC
Thanks a lot for submitting this fix, and thanks to everyone who tested it and reported back.
Comment 18 Yuri Pankov freebsd_committer freebsd_triage 2019-09-26 20:31:02 UTC
Looks like the MFC was forgotten for this. Is it still possible to get this into 12.1?
Comment 19 Emrion 2019-11-17 18:38:15 UTC
What version of FreeBSD will be free of this bug? 12.2, in the best of worlds?
Comment 20 Ka Ho Ng freebsd_committer freebsd_triage 2019-12-06 08:39:19 UTC
r349184 fixes the problem on my Skylake machine, but has not yet been MFCed to 12-STABLE or 11-STABLE. Would anyone care to do that? Thanks!
Comment 21 commit-hook freebsd_committer freebsd_triage 2019-12-06 09:50:40 UTC
A commit references this bug:

Author: scottl
Date: Fri Dec  6 09:50:01 UTC 2019
New revision: 355440
URL: https://svnweb.freebsd.org/changeset/base/355440

Log:
  MFC r349184.  This fixes PCI passthrough via VT-d on modern chipsets with
  multiple translation units.

  PR:		229852
  Submitted by:	callum@aitchison.org

Changes:
_U  stable/12/
  stable/12/sys/amd64/vmm/intel/vtd.c
Comment 22 Kubilay Kocak freebsd_committer freebsd_triage 2020-05-22 01:55:00 UTC
*** Bug 246647 has been marked as a duplicate of this bug. ***
Comment 23 Kubilay Kocak freebsd_committer freebsd_triage 2020-05-22 02:05:03 UTC
^Triage: Assign to committer that resolved
Comment 24 Ed Maste freebsd_committer freebsd_triage 2020-07-08 20:09:18 UTC
Author: gordon
Date: Wed Jul  8 19:56:34 2020
New Revision: 363022
URL: https://svnweb.freebsd.org/changeset/base/363022

Log:
  Fix host crash in bhyve with PCI device passthrough.

  Approved by:  so
  Security:     FreeBSD-EN-20:13.bhyve

Modified:
  releng/12.1/sys/amd64/vmm/intel/vtd.c
  releng/12.1/usr.sbin/bhyve/pci_emul.c
  releng/12.1/usr.sbin/bhyve/pci_emul.h
  releng/12.1/usr.sbin/bhyve/pci_passthru.c
Comment 25 Kubilay Kocak freebsd_committer freebsd_triage 2022-05-29 23:45:54 UTC
^Triage: Belatedly track all related issue metadata