Bug 205549 - bhyve pci passthru stops working after guest is restarted
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: bhyve
Version: CURRENT
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: freebsd-virtualization mailing list
URL: https://reviews.freebsd.org/D20623
Keywords:
Depends on:
Blocks:
 
Reported: 2015-12-23 19:32 UTC by Sergey Manucharian
Modified: 2019-06-27 15:22 UTC
Description Sergey Manucharian 2015-12-23 19:32:15 UTC
This is 100% reproducible on a ThinkPad T430 with FreeBSD 11-CURRENT base r292595 (Dec 22, 2015).

I pass through a PCI device (USB controller) to a Linux guest. When I boot the Linux VM, it fully controls the USB controller and its USB ports, and, for example, a USB flash drive shows up in Linux when I plug it in. As soon as I shut down Linux, the host takes control back, and the USB flash drive appears as /dev/da0 in FreeBSD.

When the bhyve guest is then (re)started, it has no control over that PCI device, although it still lists it:

$ lspci -v
00:07.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04) (prog-if 30 [XHCI])
        Subsystem: Lenovo Device 21f3
        Flags: bus master, medium devsel, latency 0, IRQ 24
        Memory at c0010000 (64-bit, prefetchable) [size=64K]
        Capabilities: [70] Power Management version 2
        Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
        Kernel driver in use: xhci_hcd

$ lsusb
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

This behavior does not depend on when the USB device is plugged in or removed: before or after the guest boots, before or after shutdown.

The FreeBSD host shows the same device description before and after the issue:

$ pciconf -vl
....
ppt0@pci0:0:20:0:       class=0x0c0330 card=0x21f317aa chip=0x1e318086 rev=0x04 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '7 Series/C210 Series Chipset Family USB xHCI Host Controller'
    class      = serial bus
    subclass   = USB
....

When the USB device appears in the host device list, it looks like this:

$ devinfo | grep -B15 mass
pcib1
  pci1
    sdhci_pci0
pcib2
  pci2
    iwn0
pcib3
  pci3
ehci1
  usbus1
    uhub1
      uhub3
        umass0    <==

Since it's an xHCI controller, I tried running a FreeBSD kernel without the xhci driver compiled in. It does not help. Most likely the USB controller can also be handled by the ehci driver.
Comment 1 Sergey Manucharian 2016-01-23 07:45:18 UTC
Update:

The issue does not appear with an MS Windows guest. I can start and halt Windows 2012 R2 many times, and bhyve always passes that USB controller through to the guest.

The obvious differences in options compared with the Linux guest are:
Linux:   virtio-blk
Windows: ahci-hd
Linux:   grub-bhyve
Windows: bhyve_uefi_20151002.fd
Comment 2 arkadyi 2019-01-15 11:36:15 UTC
I have the same problem on ThinkPad T530 with FreeBSD 12.0-RELEASE-p2 r342912.
Custom kernel without xhci.
ppt0@pci0:0:20:0:       class=0x0c0330 card=0x21f617aa chip=0x1e318086 rev=0x04 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '7 Series/C210 Series Chipset Family USB xHCI Host Controller'
    class      = serial bus
    subclass   = USB
When I first start a Linux virtual machine, everything works fine.
If the Linux VM is shut down cleanly, then when I start it a second time I see the same problem.

If I destroy the Linux VM (that is, kill the bhyve process), the second start of the Linux VM works without problems.

I do not see this with a Windows VM (Windows 7).
Comment 3 John Baldwin freebsd_committer freebsd_triage 2019-06-05 23:12:03 UTC
So it sounds like Linux is doing something during a clean shutdown to "shut" the xhci device down in a way that on boot doesn't get re-enabled.  What I would do perhaps is start by comparing the first 64 bytes of PCI config space of the device on the host before booting the Linux VM and after the clean shutdown of the Linux VM.  You can get a copy of the config registers by doing 'pciconf -r ppt0 0:0x3f'.
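The comparison suggested above can also be done programmatically. The following is only a minimal sketch in C, assuming the dumps are available as text in the whitespace-separated hex format that 'pciconf -r' prints; the names parse_cfg_dump, diff_cfg_dumps, and CFG_WORDS are hypothetical helpers, not part of pciconf or bhyve.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define CFG_WORDS 16	/* first 64 bytes of config space as 32-bit words */

/*
 * Parse whitespace-separated hex words (the format printed by
 * 'pciconf -r') into out[]. Returns the number of words parsed,
 * which is at most max. Hypothetical helper for illustration.
 */
static size_t
parse_cfg_dump(const char *text, uint32_t *out, size_t max)
{
	size_t n = 0;
	char *end;

	while (n < max) {
		unsigned long v = strtoul(text, &end, 16);

		if (end == text)
			break;		/* no further hex words */
		out[n++] = (uint32_t)v;
		text = end;
	}
	return (n);
}

/*
 * Print every 32-bit register that differs between two dumps and
 * return how many registers changed.
 */
static int
diff_cfg_dumps(const uint32_t *before, const uint32_t *after, size_t n)
{
	int changed = 0;

	for (size_t i = 0; i < n; i++) {
		if (before[i] != after[i]) {
			printf("offset 0x%02zx: %08x -> %08x\n", i * 4,
			    (unsigned)before[i], (unsigned)after[i]);
			changed++;
		}
	}
	return (changed);
}
```

Each reported offset is a byte offset into config space, so a change in the second word shows up as offset 0x04, which holds the command and status registers.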
Comment 4 arkadyi 2019-06-07 10:02:04 UTC
Before starting Linux VM:
pciconf -r ppt0 0:0x3f
1e318086 02900006 0c033004 00000000
f2520004 00000000 00000000 00000000 
00000000 00000000 00000000 21f617aa
00000000 00000070 00000000 00000110 

After 'shutdown -P now' in the Linux VM:
pciconf -r ppt0 0:0x3f
1e318086 02900002 0c033004 00000000
f2520004 00000000 00000000 00000000 
00000000 00000000 00000000 21f617aa
00000000 00000070 00000000 00000110

Yes, there is a change: 02900006 became 02900002.
What other information is needed?
Comment 5 John Baldwin freebsd_committer freebsd_triage 2019-06-12 17:22:46 UTC
Linux has disabled the busmaster enable bit.  This should be pretty easy to fix.  Please try the patch from the phabricator URL and let me know if it helps.
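To see why the two dumps point at bus mastering, the dword at config offset 0x04 can be decoded: its low 16 bits are the command register and its high 16 bits are the status register. The sketch below is illustrative only. The PCIM_CMD_* values match the definitions in FreeBSD's sys/dev/pci/pcireg.h but are redefined locally so the example builds on its own; cmd_bits, cmd_changed_bits, and print_cmd_changes are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Command register bits; values match FreeBSD's sys/dev/pci/pcireg.h. */
#define PCIM_CMD_PORTEN		0x0001	/* I/O space decoding enabled */
#define PCIM_CMD_MEMEN		0x0002	/* memory space decoding enabled */
#define PCIM_CMD_BUSMASTEREN	0x0004	/* device may initiate DMA */
#define PCIM_CMD_INTxDIS	0x0400	/* legacy INTx interrupts disabled */

/* Hypothetical lookup table, used only for printing. */
static const struct {
	uint16_t	bit;
	const char	*name;
} cmd_bits[] = {
	{ PCIM_CMD_PORTEN,	"I/O space enable" },
	{ PCIM_CMD_MEMEN,	"memory space enable" },
	{ PCIM_CMD_BUSMASTEREN,	"bus master enable" },
	{ PCIM_CMD_INTxDIS,	"INTx disable" },
};

/*
 * Return the command-register bits that differ between two dwords read
 * from config offset 0x04 (status in the high half, command in the low).
 */
static uint16_t
cmd_changed_bits(uint32_t before, uint32_t after)
{
	return ((uint16_t)((before ^ after) & 0xffff));
}

/* Print each named command bit that changed, with its new state. */
static void
print_cmd_changes(uint32_t before, uint32_t after)
{
	uint16_t diff = cmd_changed_bits(before, after);

	for (size_t i = 0; i < sizeof(cmd_bits) / sizeof(cmd_bits[0]); i++) {
		if (diff & cmd_bits[i].bit)
			printf("%s: now %s\n", cmd_bits[i].name,
			    (after & cmd_bits[i].bit) ? "set" : "clear");
	}
}
```

With the values from comment 4, cmd_changed_bits(0x02900006, 0x02900002) yields 0x0004, the bus master enable bit, while memory space decoding (0x0002) stays enabled.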
Comment 6 arkadyi 2019-06-13 18:36:18 UTC
Hmm... I don't see the lines below in the original usr.sbin/bhyve/pci_passthru.c:
	pci_set_cfgdata16(pi, PCIR_COMMAND, read_config(&sc->psc_sel,
	    PCIR_COMMAND, 2));

# patch < D20623.diff 
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: usr.sbin/bhyve/pci_passthru.c
|===================================================================
|--- usr.sbin/bhyve/pci_passthru.c
|+++ usr.sbin/bhyve/pci_passthru.c
--------------------------
Patching file pci_passthru.c using Plan A...
Hunk #1 succeeded at 615 (offset -3 lines).
Hunk #2 succeeded at 625 (offset -3 lines).
Hunk #3 failed at 648.
1 out of 3 hunks failed--saving rejects to pci_passthru.c.rej
done

cat /usr/src/usr.sbin/bhyve/pci_passthru.c.rej
@@ -636,8 +648,13 @@
                goto done;
        }
 
-       pci_set_cfgdata16(pi, PCIR_COMMAND, read_config(&sc->psc_sel,
-           PCIR_COMMAND, 2));
+       /*
+        * Fetch the updated virtual command register and write it to
+        * the device if needed.
+        */
+       cmd = pci_get_cfgdata16(pi, PCIR_COMMAND);
+       if (cmd != orig_cmd)
+               write_config(&sc->psc_sel, PCIR_COMMAND, 2, cmd);
 
        error = 0;                              /* success */
 done:

uname -srv
FreeBSD 12.0-RELEASE-p5 FreeBSD 12.0-RELEASE-p5 r349013
Comment 7 John Baldwin freebsd_committer freebsd_triage 2019-06-14 18:40:17 UTC
The patch is relative to head.  For 12.0 you will need to apply the patch from https://svnweb.freebsd.org/base?view=revision&revision=348779 first and then apply this patch on top of that.
Comment 8 John Baldwin freebsd_committer freebsd_triage 2019-06-14 18:41:30 UTC
Actually, you need the diff from https://svnweb.freebsd.org/base?view=revision&revision=348778 as well.
Comment 9 julien M 2019-06-16 20:17:56 UTC
Having the same issue here with: FreeBSD 12.0-STABLE r349117 GENERIC  amd64
(Generic kernel WITH xhci support)

I patched stable with revisions 348778 and 348779 + D20623.

FIRST TRY (vm create + vm install):
BEFORE installing Ubuntu 18.04 on Bhyve:
$ pciconf -r ppt4 0x3f

145c1022 00100007 0c033000 00800010
d9b00004 00000000 00000000 00000000
00000000 00000000 00000000 50071458
00000000 00000048 00000000 00000325

AFTER REBOOT:
$ pciconf -r ppt4 0x3f

145c1022 00100407 0c033000 00800010
d9b00004 00000000 00000000 00000000
00000000 00000000 00000000 50071458
00000000 00000048 00000000 00000325

CHANGES: 00100007 => 00100407

SECOND TRY (vm destroy, vm create, vm install):
AFTER issuing: vm install linux linux.iso (while ubuntu is installing):
$ pciconf -r ppt4 0x3f

145c1022 10100407 0c033000 00800010
d9b00004 00000000 00000000 00000000
00000000 00000000 00000000 50071458
00000000 00000048 00000000 00000325

CHANGES: 00100407 => 10100407

THIRD TRY (vm destroy, vm create, vm install):
AFTER issuing: vm install ...:
$ pciconf -r ppt4 0x3f

145c1022 10100407 0c033000 00800010
d9b00004 00000000 00000000 00000000
00000000 00000000 00000000 50071458
00000000 00000048 00000000 00000325

CHANGES: NONE

It seems that the FIRST install breaks everything by changing 00100007 to 00100407. After a reboot of the FreeBSD host, the value is back to 00100007, and passthrough works when Linux is installed in bhyve for the FIRST time.
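The values reported above differ in both halves of the dword at offset 0x04, so it can help to split it into its command (low 16 bits) and status (high 16 bits) registers before comparing. This is a rough sketch under that assumption. The PCIM_STATUS_* values match FreeBSD's sys/dev/pci/pcireg.h but are redefined locally; struct cmd_status, split_cmd_status, and status_error_bits are hypothetical names. Whether the status change is related to the passthrough failure is not established in this thread.

```c
#include <stdint.h>

/* Status register error bits; values match FreeBSD's sys/dev/pci/pcireg.h. */
#define PCIM_STATUS_MDPERR	0x0100	/* master data parity error */
#define PCIM_STATUS_STABORT	0x0800	/* signaled target abort */
#define PCIM_STATUS_RTABORT	0x1000	/* received target abort */
#define PCIM_STATUS_RMABORT	0x2000	/* received master abort */
#define PCIM_STATUS_SERR	0x4000	/* signaled system error */
#define PCIM_STATUS_PERR	0x8000	/* detected parity error */

#define STATUS_ERROR_MASK						\
	(PCIM_STATUS_MDPERR | PCIM_STATUS_STABORT | PCIM_STATUS_RTABORT | \
	 PCIM_STATUS_RMABORT | PCIM_STATUS_SERR | PCIM_STATUS_PERR)

/* Hypothetical container for the two 16-bit registers at offset 0x04. */
struct cmd_status {
	uint16_t	command;	/* config offset 0x04 */
	uint16_t	status;		/* config offset 0x06 */
};

/* Split the 32-bit word printed by 'pciconf -r' into its two registers. */
static struct cmd_status
split_cmd_status(uint32_t dword)
{
	struct cmd_status cs;

	cs.command = (uint16_t)(dword & 0xffff);
	cs.status = (uint16_t)(dword >> 16);
	return (cs);
}

/* Return only the error-reporting bits of a status register value. */
static uint16_t
status_error_bits(uint16_t status)
{
	return ((uint16_t)(status & STATUS_ERROR_MASK));
}
```

For example, 10100407 splits into command 0x0407 and status 0x1010. The only error bit set in that status value is 0x1000 (received target abort), whereas 00100407 has no error bits set.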


This is the device to be passed:
ppt2@pci0:8:0:0:        class=0x130000 card=0x145a1022 chip=0x145a1022 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Zeppelin/Raven/Raven2 PCIe Dummy Function'
    class      = non-essential instrumentation
ppt3@pci0:8:0:2:        class=0x108000 card=0x14561022 chip=0x14561022 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 17h (Models 00h-0fh) Platform Security Processor'
    class      = encrypt/decrypt
ppt4@pci0:8:0:3:        class=0x0c0330 card=0x50071458 chip=0x145c1022 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 17h (Models 00h-0fh) USB 3.0 Host Controller'
    class      = serial bus
    subclass   = USB

Configuration for bhyve-vm:
...
passthru0="8/0/0"
passthru1="8/0/2"
passthru2="8/0/3"
...

I might be able to test this configuration in FreeBSD-CURRENT if that's of any help? (compile times are low on a threadripper 1950x).
Comment 10 John Baldwin freebsd_committer freebsd_triage 2019-06-17 13:38:43 UTC
(In reply to julien M from comment #9)
So this is a different variation and not quite the same as you are passing through different devices.  Can you clarify if any of these devices work currently and are broken by the proposed patch, or if the situation is the same with the proposed patch?  In your case the bit being changed is different (the bit to disable INTx interrupts is being set by the guest).

Can you provide 'pciconf -lc' output on the host for these three devices?
Comment 11 John Baldwin freebsd_committer freebsd_triage 2019-06-17 23:01:25 UTC
Actually, I've updated D20623 to always clear INTxDIS on guest start which I think will fix the issue in comment 9 as well if you are able to retest.
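The idea of clearing INTxDIS on guest start can be sketched as follows. This is not the code from D20623; it is a minimal illustration that assumes the starting command value is derived from the host's current value, and guest_start_cmd and cmd_needs_write are hypothetical names. The second helper mirrors the "write only if changed" check visible in the rejected hunk quoted in comment 6.

```c
#include <stdint.h>

/* Value matches FreeBSD's sys/dev/pci/pcireg.h; redefined for the sketch. */
#define PCIM_CMD_INTxDIS	0x0400	/* legacy INTx interrupts disabled */

/*
 * Compute the command register value to start a guest with: keep the
 * host's bits, but always re-enable legacy INTx by clearing INTxDIS.
 */
static uint16_t
guest_start_cmd(uint16_t host_cmd)
{
	return ((uint16_t)(host_cmd & ~PCIM_CMD_INTxDIS));
}

/* A config-space write is only needed when the value actually changes. */
static int
cmd_needs_write(uint16_t orig_cmd, uint16_t new_cmd)
{
	return (orig_cmd != new_cmd);
}
```

With the command value 0x0407 left behind by the previous guest in comment 9, guest_start_cmd returns 0x0007, so a write would be needed. With 0x0007 nothing changes and no write is issued.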
Comment 12 arkadyi 2019-06-20 13:53:17 UTC
Maybe I misunderstood you.
1. First I patched pci_passthru.c with the diff from 348779 and then with D20623.diff
2. Second I patched pci_emul.c with the diff from 348778

But bhyve fails to build. Please see the log below:

cc  -O2 -pipe -march=core2  -I/usr/src/sys -DINET -DINET6 -I/usr/src/sys/dev/e1000 -I/usr/src/sys/dev/mii -I/usr/src/sys/dev/usb/controller -g -O0 -MD  -MF.depend.pci_passthru.o -MTpci_passthru.o -std=gnu99 -fstack-protector-strong -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -Wno-empty-body -Wno-string-plus-int -Wno-unused-const-variable -Wno-tautological-compare -Wno-unused-value -Wno-parentheses-equality -Wno-unused-function -Wno-enum-conversion -Wno-unused-local-typedef -Wno-address-of-packed-member -Wno-switch -Wno-switch-enum -Wno-knr-promoted-parameter  -Qunused-arguments  -c /usr/src/usr.sbin/bhyve/pci_passthru.c -o pci_passthru.o
/usr/src/usr.sbin/bhyve/pci_passthru.c:700:6: error: implicit declaration of function 'caph_rights_limit' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
        if (caph_rights_limit(pcifd, &rights) == -1)
            ^
/usr/src/usr.sbin/bhyve/pci_passthru.c:700:6: note: did you mean 'cap_rights_limit'?
/usr/src/sys/sys/capsicum.h:509:5: note: 'cap_rights_limit' declared here
int cap_rights_limit(int fd, const cap_rights_t *rights);
    ^
/usr/src/usr.sbin/bhyve/pci_passthru.c:702:6: error: implicit declaration of function 'caph_ioctls_limit' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
        if (caph_ioctls_limit(pcifd, pci_ioctls, nitems(pci_ioctls)) == -1)
            ^
/usr/src/usr.sbin/bhyve/pci_passthru.c:702:6: note: did you mean 'cap_ioctls_limit'?
/usr/src/sys/sys/capsicum.h:519:5: note: 'cap_ioctls_limit' declared here
int cap_ioctls_limit(int fd, const cap_ioctl_t *cmds, size_t ncmds);
    ^
/usr/src/usr.sbin/bhyve/pci_passthru.c:901:3: error: implicit declaration of function 'pci_emul_cmd_changed' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
                pci_emul_cmd_changed(pi, cmd_old);
                ^
3 errors generated.
*** Error code 1

Stop.
make: stopped in /usr/src/usr.sbin/bhyve
Comment 13 julien M 2019-06-23 13:44:56 UTC
(In reply to arkadyi from comment #12)
If you are running 12.0, you have to apply the patches in this order:
 1) patch with revision 348778 
 2) patch with revision 348779
 3) patch with D20623


(In reply to  John Baldwin from comment #11)
Retested with your last revision. The problem still exists.

BEFORE installing Ubuntu 18.04 on Bhyve (first launch):
$ pciconf -r ppt4 0x3f

145c1022 00100007 0c033000 00800010
d9b00004 00000000 00000000 00000000
00000000 00000000 00000000 50071458
00000000 00000048 00000000 00000325

WHILE the install is running (and passthru is working):
$ pciconf -r ppt4 0x3f

145c1022 00100407 0c033000 00800010
d9b00004 00000000 00000000 00000000
00000000 00000000 00000000 50071458
00000000 00000048 00000000 00000325

AFTER VM POWER OFF:
$ pciconf -r ppt4 0x3f

145c1022 00100003 0c033000 00800010
d9b00004 00000000 00000000 00000000
00000000 00000000 00000000 50071458
00000000 00000048 00000000 00000325


AFTER VM START (for the 2nd time):
$ pciconf -r ppt4 0x3f

145c1022 00100407 0c033000 00800010
d9b00004 00000000 00000000 00000000
00000000 00000000 00000000 50071458
00000000 00000048 00000000 00000325

CHANGES: 00100007 => 00100407

I don't know if I'm missing something? Patches all succeeded except one which had to be modified manually. Kernel + world were rebuilt and reinstalled (even though that's overkill).


PS: 
In my case, the 3 passthroughs are just meant to get the USB passthru for mouse and keyboard.

This is the device to be passed:
ppt2@pci0:8:0:0:        class=0x130000 card=0x145a1022 chip=0x145a1022 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Zeppelin/Raven/Raven2 PCIe Dummy Function'
    class      = non-essential instrumentation
ppt3@pci0:8:0:2:        class=0x108000 card=0x14561022 chip=0x14561022 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 17h (Models 00h-0fh) Platform Security Processor'
    class      = encrypt/decrypt
ppt4@pci0:8:0:3:        class=0x0c0330 card=0x50071458 chip=0x145c1022 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 17h (Models 00h-0fh) USB 3.0 Host Controller'
    class      = serial bus
    subclass   = USB

The only useful one is then pci0:8:0:3, but I was under the impression that you have to pass everything or nothing, not just one "device" of the group "8:0:X".
Comment 14 julien M 2019-06-23 14:22:23 UTC
(In reply to John Baldwin from comment #10)
Sorry, forgot the output:

$ pciconf -lc

ppt2@pci0:8:0:0:        class=0x130000 card=0x145a1022 chip=0x145a1022 rev=0x00 hdr=0x00
    cap 09[48] = vendor (length 8)
    cap 01[50] = powerspec 3  supports D0 D3  current D0
    cap 10[64] = PCI-Express 2 endpoint max data 256(256) RO NS
                 link x16(x16) speed 8.0(8.0) ASPM disabled(L0s/L1)
    ecap 000b[100] = Vendor 1 ID 1
    ecap 0001[150] = AER 2 0 fatal 0 non-fatal 1 corrected
    ecap 0019[270] = PCIe Sec 1 lane errors 0
    ecap 000d[2a0] = ACS 1

ppt3@pci0:8:0:2:        class=0x108000 card=0x14561022 chip=0x14561022 rev=0x00 hdr=0x00
    cap 09[48] = vendor (length 8)
    cap 01[50] = powerspec 3  supports D0 D3  current D0
    cap 10[64] = PCI-Express 2 endpoint max data 256(256) RO NS
                 link x16(x16) speed 8.0(8.0) ASPM disabled(L0s/L1)
    cap 05[a0] = MSI supports 2 messages, 64 bit
    cap 11[c0] = MSI-X supports 2 messages, enabled
                 Table in map 0x24[0x0], PBA in map 0x24[0x1000]
    ecap 000b[100] = Vendor 1 ID 1
    ecap 0001[150] = AER 2 0 fatal 0 non-fatal 1 corrected
    ecap 000d[2a0] = ACS 1

ppt4@pci0:8:0:3:        class=0x0c0330 card=0x50071458 chip=0x145c1022 rev=0x00 hdr=0x00
    cap 09[48] = vendor (length 8)
    cap 01[50] = powerspec 3  supports D0 D3  current D0
    cap 10[64] = PCI-Express 2 endpoint max data 256(256) RO NS
                 link x16(x16) speed 8.0(8.0) ASPM disabled(L0s/L1)
    cap 05[a0] = MSI supports 8 messages, 64 bit enabled with 1 message
    ecap 000b[100] = Vendor 1 ID 1
    ecap 0001[150] = AER 2 0 fatal 0 non-fatal 1 corrected
    ecap 000d[2a0] = ACS 1
Comment 15 John Baldwin freebsd_committer freebsd_triage 2019-06-25 16:57:57 UTC
You should be able to just pass through the single USB controller AFAIK.

The effect of the patch is somewhat hard to measure since it changes the initial value while the VM is starting before the guest starts executing.  While the guest is executing and MSI interrupts are enabled, that 0x400 bit should be set (it disables legacy INTx interrupts).

However, to back up a bit, can you clarify how the device does not work on the Linux guest, i.e. does Linux attach a driver to the device but not see devices that are plugged in?

Another thing to check is if you can use devctl to change the device driver to xhci so that it uses the host driver and see if that works?  You can use a GENERIC kernel and use 'devctl set driver -f xhci0 ppt' to change it to passthrough after boot and then use 'devctl clear driver -f ppt0' to switch it back to xhci0 after shutting down the guest.

Another test that might be easier is to boot Windows after booting Linux and see if the controller works fine in Windows.
Comment 16 arkadyi 2019-06-27 15:22:19 UTC
Thanks for the help, julien!
I patched and built without problems.
But the problem persists.
Before starting the Linux VM:
pciconf -r ppt0 0:0x3f
1e318086 02900006 0c033004 00000000
f2520004 00000000 00000000 00000000 
00000000 00000000 00000000 21f617aa
00000000 00000070 00000000 00000110 
After starting and shutting down the Linux VM:
pciconf -r ppt0 0:0x3f
1e318086 02900002 0c033004 00000000
f2520004 00000000 00000000 00000000 
00000000 00000000 00000000 21f617aa
00000000 00000070 00000000 00000110
The problem is not resolved.