Created attachment 201198 [details]
Bad boot due to unhandled Spectre v2 MSR

bhyve in FreeBSD 12.0-RELEASE can't run certain Linux guests on certain AMD processors due to an unhandled MSR related to the Spectre v2 mitigation present in modern Linux kernels. Use of the -w switch lets at least certain guests run properly.

The MSR is 0x49, and the bhyve log shows the guest attempting to set it to 0x1. I've attached logs from a single vCPU configuration attempting to run Debian 9.6.0.

debian9-1vCPU.log is from a run without -w
debian9-1vCPU-ignore-unimplemented-msr.log is from a run with -w

This LKML thread has more information on how it relates to Spectre v2 mitigation: https://lkml.org/lkml/2019/1/3/540

I don't understand the interaction between the Linux kernel and the CPU as presented to the guest by bhyve, with regard to the microcode. It's possible my host system needs a microcode update delivered by the BIOS or FreeBSD somehow, but I'm out of my depth there.

Guests that support the Linux kernel's spectre_v2_user=off boot parameter boot OK with it set, without needing -w.

Let me know if I can do more testing. I'm running an AMD Ryzen Threadripper 2990WX with AGESA firmware 1.1.0.1a as part of the BIOS.
Created attachment 201199 [details] Good boot log after using -w switch
This seems to be a good summary: https://lkml.org/lkml/2019/1/9/448

We might be passing through CPUID bits to the guest that we should not, at least not without adding that MSR to our emulation list. I'm not sure how we handle Spectre/Meltdown representations to guests on Intel.

I don't think guests should be able to set these MSRs, and they probably shouldn't do software mitigation -- it's up to the host to mitigate correctly. So maybe we should set whatever bit claims immunity to Spectre/Meltdown in the guest CPUID.
A commit references this bug:

Author: cem
Date: Thu Jan 17 19:44:48 UTC 2019
New revision: 343120
URL: https://svnweb.freebsd.org/changeset/base/343120

Log:
  Add definitions for AMD Spectre/Meltdown CPUID information

  No functional change, aside from printing recognized bits in CPU identification.

  The bits are documented in 111006-B "Indirect Branch Control Extension"[1] and 124441 "Speculative Store Bypass Disable."[2]

  Notably missing (left as future work):
   * Integration with hw.spec_store_bypass_disable and hw_ssb_active flag, which are currently Intel-specific
   * Integration with hw_ibrs_active global flag, which are currently Intel-specific
   * SSB_NO integration in hw_ssb_recalculate()
   * Bhyve integration (PR 235010)

  [1]: https://developer.amd.com/wp-content/resources/111006-B_AMD64TechnologyIndirectBranchControlExtenstion_WP_7-18Update_FNL.pdf
  [2]: https://developer.amd.com/wp-content/resources/124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf

  PR: 235010 (related, but does not fix)
  MFC after: a week

Changes:
  head/sys/x86/include/specialreg.h
  head/sys/x86/x86/identcpu.c
(In reply to Conrad Meyer from comment #2)

You're correct that bhyve does very little to hide the host's CPUID bits from the guest, and this has caused us these types of problems. It only gets worse with the addition of mitigation bits.

What we need is a general mechanism that would allow both masking and setting of any of the CPUID bits, something along the lines of (HOST & mask) | force for each of the CPUID values. This would even give us the ability to change processor model and type. IIRC this is how VMware implements the "create least common denominator" CPUID across a cluster of servers so that you can do live migration.
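To make the (HOST & mask) | force idea concrete, here is a minimal sketch in C. It is only an illustration under stated assumptions: the function name and parameters are hypothetical and nothing like this exists in vmm(4) today.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of vmm(4): compute the value of one CPUID
 * output register as the guest would see it.
 *
 *   host  - value the host CPU returned for this register
 *   mask  - 1 bits are passed through from the host, 0 bits are hidden
 *   force - 1 bits are always set for the guest, whatever the host reports
 */
static uint32_t
cpuid_apply_override(uint32_t host, uint32_t mask, uint32_t force)
{
	return ((host & mask) | force);
}
```

For example, with mask = ~0x0000D000 and force = 0, a host value of 0x0000D005 would be presented to the guest as 0x00000005. An administrator-facing version would need one mask/force pair per leaf, subleaf, and register, which is part of why this is a larger effort.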
A commit references this bug:

Author: cem
Date: Fri Jan 18 23:54:51 UTC 2019
New revision: 343166
URL: https://svnweb.freebsd.org/changeset/base/343166

Log:
  vmm(4): Mask Spectre feature bits on AMD hosts

  For parity with Intel hosts, which already mask out the CPUID feature bits that indicate the presence of the SPEC_CTRL MSR, do the same on AMD.

  Eventually we may want to have a better support story for guests, but for now, limit the damage of incorrectly indicating an MSR we do not yet support.

  Eventually, we may want a generic CPUID override system for administrators, or for minimum supported feature set in heterogenous environments with failover. That is a much larger scope effort than this bug fix.

  PR: 235010
  Reported by: Rys Sommefeldt <rys AT sommefeldt.com>
  Sponsored by: Dell EMC Isilon

Changes:
  head/sys/amd64/vmm/x86.c
Thanks for the report and MSR access log, Rys!

I think I found the reason this works on Intel (the CPUID feature bits where SPEC_CTRL would be indicated are cleared) and not on AMD (there, we pass through those feature bits from the host). Both Intel and AMD share the same SPEC_CTRL MSR, and we do not implement it on either platform.

Please try the committed change; I believe it should fix the issue.
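For anyone following along, the kind of masking described above can be sketched as follows. This is not the committed code from r343166; the macro and function names are made up for illustration, and the bit positions are the ones AMD documents for CPUID Fn8000_0008 EBX in the whitepapers referenced by r343120.

```c
#include <stdint.h>

/*
 * Illustrative names only. Bit positions in CPUID Fn8000_0008 EBX as
 * documented by AMD for the speculation-control features.
 */
#define	SKETCH_AMD_IBPB		(1u << 12)	/* prediction barrier command */
#define	SKETCH_AMD_IBRS		(1u << 14)	/* indirect branch restricted speculation */
#define	SKETCH_AMD_STIBP	(1u << 15)	/* single thread indirect branch predictors */
#define	SKETCH_AMD_SSBD		(1u << 24)	/* speculative store bypass disable */

/*
 * Clear the feature bits that advertise speculation-control MSRs the
 * hypervisor does not emulate, so the guest never tries to write them.
 * All other host bits are left untouched in this sketch.
 */
static uint32_t
sketch_mask_amd_spec_bits(uint32_t host_ebx)
{
	return (host_ebx & ~(SKETCH_AMD_IBPB | SKETCH_AMD_IBRS |
	    SKETCH_AMD_STIBP | SKETCH_AMD_SSBD));
}
```

With these bits cleared, a Linux guest should not detect the features and so should not attempt the write to MSR 0x49 seen in the attached log.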
I patched my 12.0-RELEASE kernel with 343120 and 343166, and modern Linux guests now boot without needing -w in bhyve (ignore_bad_msr="yes" in vm-bhyve), in both uniprocessor and multiprocessor vCPU guest configs on my platform.

Thanks, Conrad! I'll file a new bug if anything else shows up, and reference this one if needed.
A commit references this bug:

Author: jhb
Date: Fri Jul 12 22:31:15 UTC 2019
New revision: 349958
URL: https://svnweb.freebsd.org/changeset/base/349958

Log:
  MFC 339911,339936,343075,343166,348592: Various AMD CPU-specific fixes.

  339911:
  Emulate machine check related MSR_EXTFEATURES to allow guest OSes to boot on AMD FX Series.

  339936:
  Merge cases with upper block. This is a cosmetic change only to simplify code.

  343075:
  vmm(4): Take steps towards multicore bhyve AMD support

  vmm's CPUID emulation presented Intel topology information to the guest, but disabled AMD topology information and in some cases passed through garbage. I.e., CPUID leaves 0x8000_001[de] were passed through to the guest, but guest CPUs can migrate between host threads, so the information presented was not consistent. This could easily be observed with 'cpucontrol -i 0xfoo /dev/cpuctl0'.

  Slightly improve this situation by enabling the AMD topology feature flag and presenting at least the CPUID fields used by FreeBSD itself to probe topology on more modern AMD64 hardware (Family 15h+). Older stuff is probably less interesting. I have not been able to empirically confirm it is sufficient, but it should not regress anything either.

  343166:
  vmm(4): Mask Spectre feature bits on AMD hosts

  For parity with Intel hosts, which already mask out the CPUID feature bits that indicate the presence of the SPEC_CTRL MSR, do the same on AMD.

  Eventually we may want to have a better support story for guests, but for now, limit the damage of incorrectly indicating an MSR we do not yet support.

  Eventually, we may want a generic CPUID override system for administrators, or for minimum supported feature set in heterogenous environments with failover. That is a much larger scope effort than this bug fix.

  348592:
  Emulate the AMD MSR_LS_CFG MSR used for various Ryzen errata. Writes are ignored and reads always return zero.

  PR: 224476, 235010

Changes:
_U  stable/11/
  stable/11/sys/amd64/vmm/amd/svm_msr.c
  stable/11/sys/amd64/vmm/x86.c
  stable/11/sys/amd64/vmm/x86.h
  stable/11/sys/x86/x86/mp_x86.c
  stable/11/usr.sbin/bhyve/xmsr.c
_U  stable/12/
  stable/12/sys/amd64/vmm/amd/svm_msr.c
  stable/12/sys/amd64/vmm/x86.c
  stable/12/sys/amd64/vmm/x86.h
  stable/12/sys/x86/x86/mp_x86.c
  stable/12/usr.sbin/bhyve/xmsr.c