Bug 263768

Summary: [bhyve] PCI passthru device not working after guest stop/start
Product: Base System
Reporter: Anatoli <me>
Component: bhyve
Assignee: freebsd-virtualization (Nobody) <virtualization>
Status: New
Severity: Affects Only Me
CC: bj.rn, heravinluca92, me, raul.munoz
Priority: ---
Version: 13.0-RELEASE
Hardware: amd64
OS: Any

Description Anatoli 2022-05-04 03:38:40 UTC
When bhyve starts a guest for the first time after a host boot with a physical device passed through via ppt, the device works as expected.

After a soft reboot of the guest (`sudo reboot` from within the guest, without exiting the bhyve process), the device still works as expected.

But after a stop/start of the guest (a new bhyve process), the device is detected inside the guest but no longer works.

My guess is that the device must be in an uninitialized state to initialize correctly, and that after the first bhyve run neither the guest OS nor bhyve de-initializes it, so it stays in some soft-broken state.

This seems similar to bug #205549, but it is not the same. It is more like the inverse, i.e. the guest/bhyve doesn't do something that is needed to clean up the device.


# On the host

$ uname -r
13.0-RELEASE-p11

$ sudo pciconf -r ppt0 0:0x3f
15ff8086 00100006 02000002 00800010
b600000c 00000000 00000000 b781800c
00000000 00000000 00000000 1b7b15d9
b9380000 00000040 00000000 00000144


# Guest started for the 1st time (after host reboot)

$ sudo vm start test

$ sudo pciconf -r ppt0 0:0x3f
15ff8086 00100406 02000002 00800010
b600000c 00000000 00000000 b781800c
00000000 00000000 00000000 1b7b15d9
00000000 00000040 00000000 0000010b
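
To make the difference between the two host-side dumps above easier to see, here is a small Python sketch that diffs them word by word and labels each changed offset using the standard PCI type 0 configuration header layout. The names `parse_dump`, `diff_dumps` and the `REGISTERS` table are illustrative and not part of pciconf; the sketch assumes `pciconf -r ppt0 0:0x3f` prints consecutive 32-bit words starting at offset 0x00.

```python
# Illustrative helper, not part of pciconf.
# Assumption: each dump is a run of 32-bit hex words starting at offset 0x00.

# Standard PCI type 0 header registers, keyed by byte offset of the dword.
REGISTERS = {
    0x00: "device ID / vendor ID",
    0x04: "status / command",
    0x08: "class code / revision ID",
    0x0C: "BIST / header type / latency timer / cache line size",
    0x28: "CardBus CIS pointer",
    0x2C: "subsystem ID / subsystem vendor ID",
    0x30: "expansion ROM base address",
    0x34: "capabilities pointer",
    0x38: "reserved",
    0x3C: "max latency / min grant / interrupt pin / interrupt line",
}
for bar in range(6):
    REGISTERS[0x10 + 4 * bar] = f"BAR{bar}"


def parse_dump(text):
    """Turn whitespace-separated hex words into a list of integers."""
    return [int(word, 16) for word in text.split()]


def diff_dumps(before, after):
    """Return (offset, register name, old value, new value) for each changed dword."""
    old_words, new_words = parse_dump(before), parse_dump(after)
    if len(old_words) != len(new_words):
        raise ValueError("dumps cover different ranges")
    return [
        (4 * i, REGISTERS.get(4 * i, "unknown"), old, new)
        for i, (old, new) in enumerate(zip(old_words, new_words))
        if old != new
    ]


# Dump taken on the host before the first guest start.
BEFORE = """
15ff8086 00100006 02000002 00800010
b600000c 00000000 00000000 b781800c
00000000 00000000 00000000 1b7b15d9
b9380000 00000040 00000000 00000144
"""

# Dump taken on the host after the first guest start.
AFTER = """
15ff8086 00100406 02000002 00800010
b600000c 00000000 00000000 b781800c
00000000 00000000 00000000 1b7b15d9
00000000 00000040 00000000 0000010b
"""

if __name__ == "__main__":
    for offset, name, old, new in diff_dumps(BEFORE, AFTER):
        print(f"0x{offset:02x} {name}: {old:08x} -> {new:08x}")
```

With the values from this report, three dwords change: offset 0x04 (the command register goes from 0x0006 to 0x0406, i.e. bit 10, Interrupt Disable, gets set), offset 0x30 (the expansion ROM base address is cleared), and offset 0x3c (the interrupt line goes from 0x44 to 0x0b).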


# Inside the guest

$ uname -a
OpenBSD test 7.0 GENERIC.MP#6 amd64

$ dmesg | grep ixl
ixl0 at pci0 dev 8 function 0 "Intel X710 10GBaseT" rev 0x02: port 0, FW 7.2.60285 API 1.9, msix, 4 queues, address 3c:ec:ef:21:3d:02

$ ifconfig ixl0
ixl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    lladdr 3c:ec:ef:21:3d:02
    index 2 priority 0 llprio 3
    media: Ethernet autoselect (1000baseT full-duplex)
    status: active
    inet 172.16.1.2 netmask 0xffffff00 broadcast 172.16.1.255

$ ping -c 1 172.16.1.1
PING 172.16.1.1 (172.16.1.1): 56 data bytes
64 bytes from 172.16.1.1: icmp_seq=0 ttl=255 time=0.594 ms

--- 172.16.1.1 ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 0.594/0.594/0.594/0.000 ms


# After reboot of the guest (from within the guest: `sudo reboot`)

$ dmesg | grep ixl
ixl0 at pci0 dev 8 function 0 "Intel X710 10GBaseT" rev 0x02: port 0, FW 7.2.60285 API 1.9, msix, 4 queues, address 3c:ec:ef:21:3d:02

$ ifconfig ixl0
ixl0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> mtu 1500
    lladdr 3c:ec:ef:21:3d:02
    index 2 priority 0 llprio 3
    media: Ethernet autoselect (1000baseT full-duplex)
    status: active

$ ping -c 1 172.16.1.1
PING 172.16.1.1 (172.16.1.1): 56 data bytes
64 bytes from 172.16.1.1: icmp_seq=0 ttl=255 time=1.674 ms

--- 172.16.1.1 ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 1.674/1.674/1.674/0.000 ms



# Guest started for the 2nd time (`shutdown -p now` from within the guest, then `vm start test` on the host)

# On the host (same values before and after the guest start)
$ sudo pciconf -r ppt0 0:0x3f
15ff8086 00100406 02000002 00800010
b600000c 00000000 00000000 b781800c
00000000 00000000 00000000 1b7b15d9
00000000 00000040 00000000 0000010b


# Inside the guest

$ dmesg | rg ixl
ixl0 at pci0 dev 8 function 0 "Intel X710 10GBaseT" rev 0x02: port 0, FW 0.0.00000 API 0.0, port address is not valid
ixl0: no switch config available

$ ifconfig ixl0
ixl0: no such interface
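
The visible symptom in the guest is the ixl attach line reporting `FW 0.0.00000 API 0.0` instead of the real firmware version. For anyone scripting a reproduction, the following Python sketch extracts the versions from such a dmesg line and flags the all-zero case. The function names and the regular expression are illustrative and assume the OpenBSD attach line format shown above; treating all-zero versions as "device not initialized" is my reading of the symptom, not something the driver documents.

```python
import re

# Assumption: the attach line contains "FW <version> API <version>" as shown above.
ATTACH_RE = re.compile(r"FW (?P<fw>[\d.]+) API (?P<api>[\d.]+)")


def firmware_info(dmesg_line):
    """Return (firmware version, API version) as strings, or None if absent."""
    match = ATTACH_RE.search(dmesg_line)
    if match is None:
        return None
    return match.group("fw"), match.group("api")


def looks_uninitialized(dmesg_line):
    """True when both reported versions consist only of zeros and dots."""
    info = firmware_info(dmesg_line)
    if info is None:
        return False
    fw, api = info
    return all(ch in "0." for ch in fw + api)


GOOD = ('ixl0 at pci0 dev 8 function 0 "Intel X710 10GBaseT" rev 0x02: '
        'port 0, FW 7.2.60285 API 1.9, msix, 4 queues, address 3c:ec:ef:21:3d:02')
BAD = ('ixl0 at pci0 dev 8 function 0 "Intel X710 10GBaseT" rev 0x02: '
       'port 0, FW 0.0.00000 API 0.0, port address is not valid')

if __name__ == "__main__":
    for line in (GOOD, BAD):
        print(firmware_info(line), looks_uninitialized(line))
```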


# The same happens after more guest stops/starts


$ sudo vm config test
cpu=4
memory=2G

disk0_type="virtio-blk"
disk0_name="disk.img"

network0_type="virtio-net"
network0_switch="public"

passthru0="68/0/0=8:0"
wired_memory="yes"

loader="uefi"
hostbridge="amd"
virt_console0="yes"
graphics="no"
uuid="98abd580-c123-11ec-bd08-3cecef1c8ff2"
network0_mac="58:9c:fc:08:35:3b"

$ cat /boot/loader.conf
cryptodev_load="YES"
zfs_load="YES"
vmm_load="YES"
nmdm_load="YES"
if_tap_load="YES"
if_bridge_load="YES"
hw.vmm.amdvi.enable="1"
pptdevs="68/0/0 68/0/1"

$ cat /etc/rc.conf
zfs_enable="YES"
cloned_interfaces="bridge0 tap0"
ifconfig_bridge0="inet 192.168.55.2/24 addm tap0"
kld_list="nmdm vmm"
vm_enable="YES"
vm_dir="zfs:zroot/vm"


$ cat vm-bhyve.log
--- First start
May 03 23:15:06: initialising
May 03 23:15:06:  [loader: uefi]
May 03 23:15:06:  [cpu: 4]
May 03 23:15:06:  [memory: 2G]
May 03 23:15:06:  [hostbridge: amd]
May 03 23:15:06:  [com ports: com1]
May 03 23:15:06:  [uuid: 98abd580-c123-11ec-bd08-3cecef1c8ff2]
May 03 23:15:06:  [utctime: yes]
May 03 23:15:06:  [debug mode: no]
May 03 23:15:06:  [primary disk: disk.img]
May 03 23:15:06:  [primary disk dev: file]
May 03 23:15:06: initialising network device tap1
May 03 23:15:06: adding tap1 -> vm-public (public addm)
May 03 23:15:06: bring up tap1 -> vm-public (public addm)
May 03 23:15:06: booting
May 03 23:15:06:  [bhyve options: -c 4 -m 2G -Hwl bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd -S -U 98abd580-c123-11ec-bd08-3cecef1c8ff2 -u -S]
May 03 23:15:06:  [bhyve devices: -s 0,amd_hostbridge -s 31,lpc -s 4:0,virtio-blk,/vm/test/disk.img -s 5:0,virtio-net,tap1,mac=58:9c:fc:08:35:3b -s 8:0,passthru,68/0/0 -s 7:0,virtio-console,0=/vm/test/vtcon.0]
May 03 23:15:06:  [bhyve console: -l com1,/dev/nmdm-test.1A]
May 03 23:15:06:  [bhyve iso device: -s 3:0,ahci-cd,/vm/.config/null.iso]
May 03 23:15:06: starting bhyve (run 1)
May 03 23:16:50: bhyve exited with status 1
May 03 23:16:50: destroying network device tap1
May 03 23:16:51: stopped

--- Second start
May 03 23:17:26: initialising
May 03 23:17:26:  [loader: uefi]
May 03 23:17:26:  [cpu: 4]
May 03 23:17:26:  [memory: 2G]
May 03 23:17:26:  [hostbridge: amd]
May 03 23:17:26:  [com ports: com1]
May 03 23:17:26:  [uuid: 98abd580-c123-11ec-bd08-3cecef1c8ff2]
May 03 23:17:26:  [utctime: yes]
May 03 23:17:26:  [debug mode: no]
May 03 23:17:26:  [primary disk: disk.img]
May 03 23:17:26:  [primary disk dev: file]
May 03 23:17:26: initialising network device tap1
May 03 23:17:26: adding tap1 -> vm-public (public addm)
May 03 23:17:26: bring up tap1 -> vm-public (public addm)
May 03 23:17:26: booting
May 03 23:17:26:  [bhyve options: -c 4 -m 2G -Hwl bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd -S -U 98abd580-c123-11ec-bd08-3cecef1c8ff2 -u -S]
May 03 23:17:26:  [bhyve devices: -s 0,amd_hostbridge -s 31,lpc -s 4:0,virtio-blk,/vm/test/disk.img -s 5:0,virtio-net,tap1,mac=58:9c:fc:08:35:3b -s 8:0,passthru,68/0/0 -s 7:0,virtio-console,0=/vm/test/vtcon.0]
May 03 23:17:26:  [bhyve console: -l com1,/dev/nmdm-test.1A]
May 03 23:17:26:  [bhyve iso device: -s 3:0,ahci-cd,/vm/.config/null.iso]
May 03 23:17:26: starting bhyve (run 1)
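
For reading the "bhyve exited with status 1" line in the first run: bhyve(8) documents exit status 0 as rebooted, 1 as powered off, 2 as halted, 3 as triple fault, and 4 as exited due to an error. A minimal Python sketch that pulls these events out of vm-bhyve.log follows; the function name and regular expression are illustrative and assume the log line format shown above.

```python
import re

# Exit statuses as listed in the EXIT STATUS section of bhyve(8).
EXIT_MEANINGS = {
    0: "rebooted",
    1: "powered off",
    2: "halted",
    3: "triple fault",
    4: "exited due to an error",
}

# Assumption: lines look like "May 03 23:16:50: bhyve exited with status 1".
EXIT_RE = re.compile(
    r"^(?P<stamp>\w{3} \d{2} \d{2}:\d{2}:\d{2}): "
    r"bhyve exited with status (?P<status>\d+)$"
)


def exit_events(log_text):
    """Return (timestamp, status, meaning) for every bhyve exit line in the log."""
    events = []
    for line in log_text.splitlines():
        match = EXIT_RE.match(line.strip())
        if match:
            status = int(match.group("status"))
            events.append(
                (match.group("stamp"), status, EXIT_MEANINGS.get(status, "unknown"))
            )
    return events


SAMPLE = """May 03 23:15:06: starting bhyve (run 1)
May 03 23:16:50: bhyve exited with status 1
May 03 23:16:50: destroying network device tap1"""

if __name__ == "__main__":
    for stamp, status, meaning in exit_events(SAMPLE):
        print(f"{stamp}: status {status} ({meaning})")
```

Status 1 here is consistent with the `shutdown -p now` issued inside the guest, so the first bhyve process ended with a normal guest power-off rather than an error.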