Created attachment 187086
/etc/rc.d/pciptdetach: Work around RAM corruption at guest shutdown

Various panics occur after shutting down a bhyve guest with PCIe passthrough NICs if certain conditions are met. This can lead to completely destroyed zpools, as I found out after some dozen less severe crashes...

Quoting jhb@:
<quote>
I suspect what is happening is that the PCI devices are still issuing DMAs after the guest has been shutdown which end up trashing other parts of host memory. This may somewhat be my fault as I made a change which moves the device back into the host domain after FLR during guest shutdown. I should perhaps leave the device disabled in the DMAR table instead if the FLR doesn't succeed. (We could also add some other forms of reset for devices not supporting FLR.)
</quote>

Since I don't have the skills to help fix the root cause, I wrote a small workaround in the form of an rc(8) script (to be copied to /etc/rc.d). It should protect against accidental crashes and data loss by bringing the PciPassThrough devices down before shutting down, which prevents DMA writes from the card after it is moved back into the host domain.

-harry
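The attachment itself is not reproduced in this thread. As a rough illustration only, a helper of this kind might turn a pptdevs-style list ("bus/slot/func ...") into PCI selectors and detach each device with devctl(8). The function names, the DRY_RUN variable, and the assumption of PCI domain 0 are hypothetical and are not taken from the attached script:

```shell
#!/bin/sh
# Hypothetical sketch, NOT the attached /etc/rc.d/pciptdetach script.

# Convert a pptdevs-style list ("bus/slot/func ...") into selectors of
# the form pci0:bus:slot:func. PCI domain 0 is an assumption.
ppt_selectors() {
    for dev in $1; do
        echo "pci0:$(echo "$dev" | tr '/' ':')"
    done
}

# Detach every listed pass-through device. DRY_RUN defaults to 1, so the
# commands are only printed unless DRY_RUN=0 is set explicitly.
ppt_detach_all() {
    for sel in $(ppt_selectors "$1"); do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "devctl detach $sel"
        else
            devctl detach "$sel"
        fi
    done
}
```

For example, `ppt_detach_all "2/0/0 3/0/0"` in dry-run mode only prints the devctl commands it would run, which makes it possible to check the selectors before touching any device.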
Quick update: jhb@ seems to have fixed the root cause. I'm currently testing the fix and have already reported to him that I can no longer reproduce the issue. I'm confident that the fix will be committed very soon, so the attached script will become obsolete very soon too.
A commit references this bug:

Author: jhb
Date: Fri Oct 27 14:57:15 UTC 2017
New revision: 325039
URL: https://svnweb.freebsd.org/changeset/base/325039

Log:
  Rework pass through changes in r305485 to be safer.

  Specifically, devices that do not support PCI-e FLR and were not gracefully shutdown by the guest OS could continue to issue DMA requests after the VM was terminated. The changes in r305485 meant that those DMA requests were completed against the host's memory which could result in random memory corruption. Instead, leave ppt devices that are not attached to a VM disabled in the IOMMU and only restore the devices to the host domain if the ppt(4) driver is detached from a device.

  As an added safety belt, disable busmastering for a pass-through device when before adding it to the host domain during ppt(4) detach.

  PR: 222937
  Tested by: Harry Schmalzbauer <freebsd@omnilan.de>
  Reviewed by: grehan
  MFC after: 1 week
  Differential Revision: https://reviews.freebsd.org/D12661

Changes:
  head/sys/amd64/vmm/io/iommu.c
  head/sys/amd64/vmm/io/ppt.c
Thanks a lot, John! Since all my tests were done on stable/11, I strongly vote for an MFC, which the commit log already schedules after 1 week; this is just to respond to the last comment... As reported, brave bhyve users have their data in danger ;-)

-harry
Harald: could you please test the following change to confirm that it doesn't introduce any regressions? https://github.com/mattmacy/networking/commit/5a12497038b11b55986efc81ffb2f211dec6077a -M
A commit references this bug:

Author: jhb
Date: Thu Nov 16 18:22:03 UTC 2017
New revision: 325900
URL: https://svnweb.freebsd.org/changeset/base/325900

Log:
  MFC 325039: Rework pass through changes in r305485 to be safer.

  Specifically, devices that do not support PCI-e FLR and were not gracefully shutdown by the guest OS could continue to issue DMA requests after the VM was terminated. The changes in r305485 meant that those DMA requests were completed against the host's memory which could result in random memory corruption. Instead, leave ppt devices that are not attached to a VM disabled in the IOMMU and only restore the devices to the host domain if the ppt(4) driver is detached from a device.

  As an added safety belt, disable busmastering for a pass-through device when before adding it to the host domain during ppt(4) detach.

  PR: 222937

Changes:
_U  stable/10/
  stable/10/sys/amd64/vmm/io/iommu.c
  stable/10/sys/amd64/vmm/io/ppt.c
_U  stable/11/
  stable/11/sys/amd64/vmm/io/iommu.c
  stable/11/sys/amd64/vmm/io/ppt.c