Bug 244363 - Assertion failed: error == 0, file pci_emul.c, line 517, function modify_bar_registration
Status: Closed Not A Bug
Alias: None
Product: Base System
Classification: Unclassified
Component: bhyve
Version: 12.0-STABLE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: freebsd-virtualization (Nobody)
 
Reported: 2020-02-24 10:19 UTC by Jorge Schrauwen
Modified: 2020-02-25 05:19 UTC

Description Jorge Schrauwen 2020-02-24 10:19:37 UTC
I'm running into an issue when using PCI passthrough with bhyve.

I am running into this on SmartOS, but when googling the problem I found a few FreeBSD users hitting the same problem. After chatting with Michael Dexter, I was advised to also file a bug here.


List of other people hitting the problem on FreeBSD:
- https://forums.freebsd.org/threads/vm-bhyve-windows-2012-r2-and-passthru.60832/
- https://www.ixsystems.com/community/threads/bhyve-pci-passthrough-errors.57284/
- https://forums.freebsd.org/threads/bhyve-passthru-issue-with-lsi-logic-sas2008-- pci-express-fusion-mpt-sas-2.65269/
- bug #211062, comment #8

SmartOS bug report:
- https://github.com/joyent/smartos-live/issues/901

Bhyve is dumping core when the following assert is tripped:
Assertion failed: error == 0, file pci_emul.c, line 517, function modify_bar_registration

This is only tripped when the guest OS is Windows (10). I do not have issues with a FreeBSD, illumos, or Linux guest, so something Windows is doing triggers it.

I also noticed I am only getting this on PCIe devices that have multiple BAR entries.

I captured the output below in a FreeBSD guest that got the PCIe devices via passthrough; you can see each has 2 BARs. (I found no easy way to capture this info on SmartOS, but this is probably easier for here anyway.)

```
ixl0@pci0:0:8:0:        class=0x020000 card=0x37d215d9 chip=0x37d28086 rev=0x09 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Connection X722 for 10GBASE-T'
    class      = network
    subclass   = ethernet
    bar   [10] = type Prefetchable Memory, range 64, base 0xc1000000, size 16777216, enabled
    bar   [1c] = type Prefetchable Memory, range 64, base 0xc2000000, size 32768, enabled
    cap 01[40] = powerspec 3  supports D0 D3  current D0
    cap 05[50] = MSI supports 1 message, 64 bit, vector masks
    cap 11[70] = MSI-X supports 8 messages, enabled
                 Table in map 0x1c[0x0], PBA in map 0x1c[0x1000]
    cap 10[a0] = PCI-Express 2 endpoint max data 256(512) FLR RO
                 link x1(x1) speed 2.5(2.5) ASPM disabled(L0s/L1)
    cap 03[e0] = VPD
    VPD ident  = 'Example VPD'
ixl1@pci0:0:8:1:        class=0x020000 card=0x37d215d9 chip=0x37d28086 rev=0x09 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Connection X722 for 10GBASE-T'
    class      = network
    subclass   = ethernet
    bar   [10] = type Prefetchable Memory, range 64, base 0xc3000000, size 16777216, enabled
    bar   [1c] = type Prefetchable Memory, range 64, base 0xc4000000, size 32768, enabled
    cap 01[40] = powerspec 3  supports D0 D3  current D0
    cap 05[50] = MSI supports 1 message, 64 bit, vector masks
    cap 11[70] = MSI-X supports 8 messages, enabled
                 Table in map 0x1c[0x0], PBA in map 0x1c[0x1000]
    cap 10[a0] = PCI-Express 2 endpoint max data 256(512) FLR RO
                 link x1(x1) speed 2.5(2.5) ASPM disabled(L0s/L1)
    cap 03[e0] = VPD
    VPD ident  = 'Example VPD'
```

Michael Zeller, who is also on the bhyve weekly call, helped me debug this a bit (on SmartOS).

We traced it to the call to unregister_mem(), which ends up seeing an ENOENT from `mmio_rb_lookup(&mmio_rb_root, memp->base, &entry);`.

mmio_rb_lookup is a macro that we (illumos) copied from FreeBSD, and it wasn't changed. So it would make sense that, if there is indeed a bug in there, both FreeBSD and SmartOS bhyve users would hit the same error.
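
To illustrate the failure mode described above, here is a minimal sketch. It is not bhyve's code: the flat table and every `toy_*` name are hypothetical stand-ins (bhyve keeps its MMIO ranges in a red-black tree), and the sketch says nothing about why the range is missing in the real case. It only shows how unregistering a base that is not currently registered produces ENOENT, and how a caller that asserts `error == 0` then aborts.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy registry; all names here are illustrative only. */
#define TOY_MAX_RANGES 8

struct toy_range {
	uint64_t base;
	uint64_t size;
	int      in_use;
};

static struct toy_range toy_ranges[TOY_MAX_RANGES];

/* Record a range; returns 0 on success, ENOSPC when the table is full. */
static int
toy_register_mem(uint64_t base, uint64_t size)
{
	for (size_t i = 0; i < TOY_MAX_RANGES; i++) {
		if (!toy_ranges[i].in_use) {
			toy_ranges[i].base = base;
			toy_ranges[i].size = size;
			toy_ranges[i].in_use = 1;
			return (0);
		}
	}
	return (ENOSPC);
}

/* Remove the range starting at base; ENOENT if none is registered there. */
static int
toy_unregister_mem(uint64_t base)
{
	for (size_t i = 0; i < TOY_MAX_RANGES; i++) {
		if (toy_ranges[i].in_use && toy_ranges[i].base == base) {
			toy_ranges[i].in_use = 0;
			return (0);
		}
	}
	return (ENOENT);
}

/*
 * Caller in the style of the failing code path: any error is treated as
 * fatal, so an ENOENT from the unregister step aborts the process.
 */
static void
toy_modify_bar(uint64_t base, uint64_t size, int registration)
{
	int error;

	if (registration)
		error = toy_register_mem(base, size);
	else
		error = toy_unregister_mem(base);
	assert(error == 0);
}
```

With this sketch, calling `toy_modify_bar(base, size, 0)` for a base that was never registered, or that was already unregistered, trips the assertion in the same way as the message quoted in this report.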
Comment 1 Peter Grehan 2020-02-24 11:07:25 UTC
This was fixed in FreeBSD with
   https://svnweb.freebsd.org/base?view=revision&revision=348779
Comment 2 Jorge Schrauwen 2020-02-24 13:05:04 UTC
Interesting, as we do have these changes in the SmartOS bhyve tree too...
I wonder if that somehow got mismerged. I have poked Michael Zeller on the SmartOS side again.
Comment 3 Jorge Schrauwen 2020-02-24 22:03:14 UTC
Thanks Peter,

I fixed it on SmartOS by applying most of the diff to pci_passthru.c; it's working now!