Bug 251046 - bhyve PCI passthrough does not work inside jail
Summary: bhyve PCI passthrough does not work inside jail
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 12.2-RELEASE
Hardware: amd64 Any
: --- Affects Many People
Assignee: Mark Johnston
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-11-11 11:58 UTC by neirac
Modified: 2021-11-29 23:15 UTC (History)
13 users (show)

See Also:


Attachments
Patch to make bhyve pci passthru work inside a jail (423 bytes, patch)
2020-11-11 11:58 UTC, neirac
no flags Details | Diff
devfs.rules use for testing (222 bytes, text/plain)
2020-11-11 11:58 UTC, neirac
no flags Details
jail.conf used for testing (532 bytes, text/plain)
2020-11-11 11:59 UTC, neirac
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description neirac 2020-11-11 11:58:02 UTC
Created attachment 219550 [details]
Patch to make bhyve pci passthru work inside a jail

on FreeBSD 12.2-RELEASE is possible to run bhyve inside a jail, but testing 
passthru revealed that this not work inside a jail.
The problem is that the jail needs r/w access to the following devices:

* /dev/mem
* /dev/io
* /dev/pci

/dev/pci is not a problem with a securelevel < 0. But the rest needs a patch.
I have attached a rough patch to make bhyve pci passthrough work inside a jail,I also have attached jail.conf and devfs.rules used for testing. 
This was tested in jail using vnet.
Comment 1 neirac 2020-11-11 11:58:45 UTC
Created attachment 219551 [details]
devfs.rules use for testing
Comment 2 neirac 2020-11-11 11:59:17 UTC
Created attachment 219552 [details]
jail.conf used for testing
Comment 3 Mark Johnston freebsd_committer freebsd_triage 2020-11-23 16:11:27 UTC
PRIV_IO access is not required only by /dev/io, it is also required for sysarch(I386_SET_IOPERM), which is otherwise available to jailed processes.  So the patch definitely should not be committed.  A better solution would be to extend pci(4) so that bhyve can use it to do everything required for PCI passthrough.  Even then I'm not sure why it's useful to jail the bhyve process - what does it buy you?
Comment 4 Shawn Webb 2020-11-23 17:35:02 UTC
(In reply to Mark Johnston from comment #3)
> Even then I'm not sure why it's useful to jail the bhyve process - what does it buy you?

It allows folks to have a production version of bhyve in the host, and to develop and test the userland components of bhyve in a jail.
Comment 5 Peter Wemm freebsd_committer freebsd_triage 2021-04-08 21:03:49 UTC
Perhaps I'm missing the point, but rather than punching a giant hole in the jail security model (ie: by giving unconstrained ring 0 / kernel privileges to jailed processes), would it not be better to run these development/testing bhyve userland components in a simple chroot environment?
Comment 6 neirac 2021-04-09 13:05:37 UTC
(In reply to Peter Wemm from comment #5)

Currently illumos is able to do pci-passthrough with bhyve running inside a zone, that gives you an extra layer of security,if there is escape from the hypervisor then the attacker will land on a jail and not the host system.
 
Here are relevant links on how is used on illumos :

https://movementarian.org/blog/posts/2018-10-26-pci-pass-through-support-with-bhyve-and-smartos/

https://www.cyber-tec.org/2019/05/29/using-bhyve-pci-passthrough-on-omnios/

I think it would be nice to have this feature on FreeBSD jails, as Mark stated  "better solution would be to extend pci(4) so that bhyve can use it to do everything required for PCI passthrough."
I would like to explore this option any pointer on how to start would be really good.
Comment 7 Anatoli 2021-05-30 12:38:53 UTC
Hi All,

> Even then I'm not sure why it's useful to jail the bhyve process - what does it buy you?

The idea to run bhyve inside jail is to provide an additional layer of security for potential vm-escape vulnerabilities in bhyve.

This is the way VMs are executed on Linux (protected by AppArmor and SEL) and Illumos.

Currently it's possible to run bhyve in jail, but not with PCI passthrough.

> A better solution would be to extend pci(4) so that bhyve can use it to do everything required for PCI passthrough.

Mark, could you please give us a hint on what should be done to extend pci(4) so jail changes are not needed? We are willing to implement this, but need some guidance.

One more security improvement that bhyve needs is to run it without root, but that's another story for another report.
Comment 8 Mark Johnston freebsd_committer freebsd_triage 2021-07-26 16:58:17 UTC
Sorry for the delayed follow up.  I wrote some patches to remove the need for /dev/io:

https://reviews.freebsd.org/D31307
https://reviews.freebsd.org/D31308

Testing would be appreciated.  This does not remove the dependency on /dev/mem yet.

I am very skeptical that jailing bhyve with PCI passthrough enabled provides any meaningful security.  /dev/pci allows a jailed root to access all PCI(e) devices in the system.  Jails can be a useful deployment mechanism though, so I think we should better support their integration with bhyve.
Comment 9 Bjoern A. Zeeb freebsd_committer freebsd_triage 2021-07-26 18:34:00 UTC
The /dev/mem ones could probably be PCIOCBARMMAP if that could be/is locked down enough?  But I assume all the checks needed are in place (now) somewhere for the IO ioctl?
Comment 10 Mark Johnston freebsd_committer freebsd_triage 2021-07-27 13:44:42 UTC
(In reply to Bjoern A. Zeeb from comment #9)
> The /dev/mem ones could probably be PCIOCBARMMAP if that could be/is locked down enough?

Yes, it seems possible.

> But I assume all the checks needed are in place (now) somewhere for the IO ioctl?

I'm not sure if I understand the question.  The new ioctl limits accesses to the specified BAR and verifies that the accesses is within bounds.  The /dev/io interface permits access to any system I/O port.
Comment 11 commit-hook freebsd_committer freebsd_triage 2021-08-14 15:01:37 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=42375556e5b2e68746d999b43d124040b6affb91

commit 42375556e5b2e68746d999b43d124040b6affb91
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2021-08-14 14:42:34 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-08-14 14:59:04 +0000

    bhyve: Use pci(4) to access I/O port BARs

    This removes the dependency on /dev/io.

    PR:             251046
    Reviewed by:    jhb
    MFC after:      2 weeks
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D31308

 usr.sbin/bhyve/pci_passthru.c | 65 ++++++++++++++++++-------------------------
 1 file changed, 27 insertions(+), 38 deletions(-)
Comment 12 Mark Johnston freebsd_committer freebsd_triage 2021-08-14 15:20:58 UTC
Review for removal of /dev/mem: https://reviews.freebsd.org/D31534
Comment 13 Anatoli 2021-08-16 00:23:12 UTC
(In reply to Mark Johnston from comment #12)

Mark thanks for your commit.

If I understand it correctly, bhyve should also be modified to not require PRIV_KMEM_WRITE when doing PPT, right?
Comment 14 Mark Johnston freebsd_committer freebsd_triage 2021-08-16 00:30:41 UTC
(In reply to Anatoli from comment #13)
Right, that's handled by the revision linked in comment 12.
Comment 15 Anatoli 2021-08-25 00:05:27 UTC
Mark, All,

> --- Comment #3 from Mark Johnston <markj@FreeBSD.org> ---
> PRIV_IO access is not required only by /dev/io, it is also required for
> sysarch(I386_SET_IOPERM), which is otherwise available to jailed processes. So
> the patch definitely should not be committed.  A better solution would be to
> extend pci(4) so that bhyve can use it to do everything required for PCI
> passthrough.  Even then I'm not sure why it's useful to jail the bhyve process
> - what does it buy you?

In light of the recently patched VM-escape vulnerability in bhyve
(FreeBSD-SA-21:13.bhyve fixing the CVE-2021-29631), I'd like to highlight the
benefits of running bhyve under a non-root user and inside a jail by default.

If it were the case, this vulnerability, instead of a complete host takeover
would just have a DoS impact on the malicious VM, which is perfectly fine IMO.

That's why it's extremely important to make bhyve work correctly under all
situations (including PPT) inside jail so we could make it run inside jail by
default.


> --- Comment #8 from Mark Johnston <markj@FreeBSD.org> ---
> I am very skeptical that jailing bhyve with PCI passthrough enabled provides
> any meaningful security.  /dev/pci allows a jailed root to access all PCI(e)
> devices in the system. Jails can be a useful deployment mechanism though, so I
> think we should better support their integration with bhyve.

With respect to this, isn't it possible to restrict the bhyve process (maybe
self-restricting via Capsicum) to just the masked PCI addresses or to the PCI
addresses specified via the args so to limit the impact of a bhyve compromise to
just the intended device(s)?

Or, as you already proposed, to extend pci(4) so that bhyve can use it to do
everything required for PPT?

Regards,
Anatoli
Comment 16 commit-hook freebsd_committer freebsd_triage 2021-08-29 16:40:29 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=c53f23984220bfd34c198966874c369006ea3246

commit c53f23984220bfd34c198966874c369006ea3246
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2021-08-14 14:42:34 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-08-29 16:39:53 +0000

    bhyve: Use pci(4) to access I/O port BARs

    This removes the dependency on /dev/io.

    PR:             251046
    Reviewed by:    jhb
    Sponsored by:   The FreeBSD Foundation

    (cherry picked from commit 42375556e5b2e68746d999b43d124040b6affb91)

 usr.sbin/bhyve/pci_passthru.c | 65 ++++++++++++++++++-------------------------
 1 file changed, 27 insertions(+), 38 deletions(-)
Comment 17 commit-hook freebsd_committer freebsd_triage 2021-11-15 18:12:30 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=e002d882ac2094047a8d5a8bef9252e5006b5828

commit e002d882ac2094047a8d5a8bef9252e5006b5828
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2021-10-09 15:36:19 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-11-15 18:11:52 +0000

    bhyve: Map the MSI-X table unconditionally for passthrough

    It is possible for the PBA to reside in the same page as the MSI-X
    table.  And, while devices are not supposed to do this, at least some
    Intel wifi devices place registers in a page shared with the MSI-X
    table.  To handle the first case we currently map the PBA page using
    /dev/mem, and the second case is not handled.

    Kill two birds with one stone: map the MSI-X table BAR using the
    PCIOCBARMMAP ioctl instead of /dev/mem, and map the entire table so that
    accesses beyond the bounds of the table can be emulated.  Regions of the
    BAR not containing the table are left unmapped.

    PR:             251046
    Reviewed by:    bz, grehan, jhb
    Sponsored by:   The FreeBSD Foundation

    (cherry picked from commit 7fa2335347362378322a4d27cb40f6e6cd5dd0fb)

 usr.sbin/bhyve/pci_emul.h     |   4 +-
 usr.sbin/bhyve/pci_passthru.c | 186 +++++++++++++++++-------------------------
 2 files changed, 76 insertions(+), 114 deletions(-)
Comment 18 Mark Johnston freebsd_committer freebsd_triage 2021-11-29 23:15:03 UTC
I've removed the dependency on /dev/mem and /dev/io in bhyve on main and stable/13.  See https://cgit.freebsd.org/src/commit/?id=7fa2335347362378322a4d27cb40f6e6cd5dd0fb , I failed to tag this PR in the commit.