Bug 235010 - bhyve: Linux guest crash due to unhandled MSR
Summary: bhyve: Linux guest crash due to unhandled MSR
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.0-RELEASE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: freebsd-virtualization (Nobody)
URL:
Keywords: bhyve
Depends on:
Blocks:
 
Reported: 2019-01-16 20:46 UTC by Rys Sommefeldt
Modified: 2019-07-12 22:32 UTC
CC List: 4 users

See Also:


Attachments
Bad boot due to unhandled Spectre v2 MSR (160 bytes, text/plain)
2019-01-16 20:46 UTC, Rys Sommefeldt
Good boot log after using -w switch (47.90 KB, text/plain)
2019-01-16 20:47 UTC, Rys Sommefeldt

Description Rys Sommefeldt 2019-01-16 20:46:17 UTC
Created attachment 201198
Bad boot due to unhandled Spectre v2 MSR

bhyve in FreeBSD 12.0-RELEASE can't run certain Linux guests on certain AMD processors due to an unhandled MSR related to the Spectre v2 mitigation present in modern Linux kernels.

Use of the -w switch lets at least certain guests run properly.

The MSR is 0x49, and the bhyve log shows the guest attempting to set it to 0x1. I've attached logs from a single vCPU configuration attempting to run Debian 9.6.0.
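
For readers decoding the logged write, here is a minimal sketch in C. It assumes the architectural MSR numbering shared by Intel and AMD, where 0x48 is SPEC_CTRL and 0x49 is PRED_CMD, and where bit 0 of PRED_CMD requests an indirect branch prediction barrier (IBPB). The enum and the classify_wrmsr() helper are hypothetical names made up for illustration; they are not part of bhyve or vmm.

```c
#include <stdint.h>

/* Architectural MSR numbers (assumed here; check your CPU vendor manual). */
#define MSR_IA32_SPEC_CTRL	0x48u
#define MSR_IA32_PRED_CMD	0x49u
#define PRED_CMD_IBPB		(1ull << 0)	/* indirect branch prediction barrier */

/* Hypothetical classification of a guest WRMSR, for illustration only. */
enum wrmsr_kind {
	WRMSR_OTHER,
	WRMSR_SPEC_CTRL,
	WRMSR_PRED_CMD_IBPB,
	WRMSR_PRED_CMD_NOP
};

static enum wrmsr_kind
classify_wrmsr(uint32_t msr, uint64_t val)
{
	if (msr == MSR_IA32_PRED_CMD)
		return ((val & PRED_CMD_IBPB) != 0 ?
		    WRMSR_PRED_CMD_IBPB : WRMSR_PRED_CMD_NOP);
	if (msr == MSR_IA32_SPEC_CTRL)
		return (WRMSR_SPEC_CTRL);
	return (WRMSR_OTHER);
}
```

Under that assumption, classify_wrmsr(0x49, 0x1) yields WRMSR_PRED_CMD_IBPB, which matches a guest issuing an IBPB as part of its Spectre v2 mitigation.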

debian9-1vCPU.log is from a run without -w
debian9-1vCPU-ignore-unimplemented-msr.log is from a run with -w

This LKML thread has more information on how it relates to Spectre v2 mitigation:

https://lkml.org/lkml/2019/1/3/540

I don't understand the interaction between the Linux kernel and the CPU as presented to the guest by bhyve with regard to the microcode. It's possible my host system needs a microcode update delivered by the BIOS or by FreeBSD, but I'm out of my depth there.

I'm able to get guests that support the Linux kernel's spectre_v2_user=off kernel boot param to boot OK, without needing -w.

Let me know if I can do more testing. I'm running an AMD Ryzen Threadripper 2990WX with AGESA firmware 1.1.0.1a as part of the BIOS.
Comment 1 Rys Sommefeldt 2019-01-16 20:47:28 UTC
Created attachment 201199
Good boot log after using -w switch
Comment 2 Conrad Meyer freebsd_committer freebsd_triage 2019-01-16 23:35:37 UTC
This seems to be a good summary:

https://lkml.org/lkml/2019/1/9/448

We might be passing through CPUID bits that we should not be to the guest, at least not without adding that MSR to our emulation list.

I'm not sure how we handle spectre/meltdown representations to guests on Intel.

I don't think guests should be able to set these MSRs and they probably shouldn't do software mitigation -- it's up to the host to mitigate correctly.  So maybe we should set whatever bit claims immunity to spectre/meltdown in guest CPUID.
Comment 3 commit-hook freebsd_committer freebsd_triage 2019-01-17 19:45:46 UTC
A commit references this bug:

Author: cem
Date: Thu Jan 17 19:44:48 UTC 2019
New revision: 343120
URL: https://svnweb.freebsd.org/changeset/base/343120

Log:
  Add definitions for AMD Spectre/Meltdown CPUID information

  No functional change, aside from printing recognized bits in CPU
  identification.

  The bits are documented in 111006-B "Indirect Branch Control Extension"[1] and
  124441 "Speculative Store Bypass Disable."[2]

  Notably missing (left as future work):
    * Integration with hw.spec_store_bypass_disable and hw_ssb_active flag,
      which are currently Intel-specific
    * Integration with hw_ibrs_active global flag, which are currently
      Intel-specific
    * SSB_NO integration in hw_ssb_recalculate()
    * Bhyve integration (PR 235010)

  [1]:
  https://developer.amd.com/wp-content/resources/111006-B_AMD64TechnologyIndirectBranchControlExtenstion_WP_7-18Update_FNL.pdf

  [2]:
  https://developer.amd.com/wp-content/resources/124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf

  PR:		235010 (related, but does not fix)
  MFC after:	a week

Changes:
  head/sys/x86/include/specialreg.h
  head/sys/x86/x86/identcpu.c
Comment 4 Rodney W. Grimes freebsd_committer freebsd_triage 2019-01-18 15:06:40 UTC
(In reply to Conrad Meyer from comment #2)
You're correct that bhyve does very little to hide the host's CPUID bits from the guest, and this has caused us these types of problems.  It only gets worse with the addition of mitigation bits.

What we need is a general mechanism to deal with this that would allow both masking and setting of any of the CPUID bits, something along the lines of (HOST & mask) | force for each of the cpuid values.

This would even give us the ability to change processor model and type.

IIRC this is how VMware implements the "create least common denominator" CPUID across a cluster of servers so that you can do live migration.
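
The (HOST & mask) | force idea above can be sketched in C as follows. This is only an illustration of the proposal, not existing bhyve code: struct cpuid_override and cpuid_apply_override() are hypothetical names, and the four-element arrays stand for the EAX, EBX, ECX and EDX values of a single CPUID leaf.

```c
#include <stdint.h>

/*
 * Hypothetical per-leaf override: a bit set in mask[] lets the host
 * value through, a bit set in force[] is always presented as 1.
 */
struct cpuid_override {
	uint32_t mask[4];	/* EAX, EBX, ECX, EDX pass-through bits */
	uint32_t force[4];	/* EAX, EBX, ECX, EDX forced-on bits */
};

/* Compute guest = (host & mask) | force for each of the four registers. */
static void
cpuid_apply_override(const uint32_t host[4],
    const struct cpuid_override *ov, uint32_t guest[4])
{
	int i;

	for (i = 0; i < 4; i++)
		guest[i] = (host[i] & ov->mask[i]) | ov->force[i];
}
```

With mask set to all ones and force set to zero the host value passes through unchanged; with mask set to zero the register is fully determined by force, which is what would allow presenting a different processor model and type.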
Comment 5 commit-hook freebsd_committer freebsd_triage 2019-01-18 23:55:38 UTC
A commit references this bug:

Author: cem
Date: Fri Jan 18 23:54:51 UTC 2019
New revision: 343166
URL: https://svnweb.freebsd.org/changeset/base/343166

Log:
  vmm(4): Mask Spectre feature bits on AMD hosts

  For parity with Intel hosts, which already mask out the CPUID feature
  bits that indicate the presence of the SPEC_CTRL MSR, do the same on
  AMD.

  Eventually we may want to have a better support story for guests, but
  for now, limit the damage of incorrectly indicating an MSR we do not yet
  support.

  Eventually, we may want a generic CPUID override system for
  administrators, or for minimum supported feature set in heterogenous
  environments with failover.  That is a much larger scope effort than
  this bug fix.

  PR:		235010
  Reported by:	Rys Sommefeldt <rys AT sommefeldt.com>
  Sponsored by:	Dell EMC Isilon

Changes:
  head/sys/amd64/vmm/x86.c
Comment 6 Conrad Meyer freebsd_committer freebsd_triage 2019-01-18 23:56:47 UTC
Thanks for the report and MSR access log, Rys!  I think I found the reason this works on Intel (the CPUID feature bits where SPEC_CTRL would be indicated are cleared) and not AMD (on AMD, we pass through those feature bits from the host).  Both Intel and AMD share the same SPEC_CTRL MSR and we do not implement it on either platform.  Please try the committed change; I believe it should fix the issue.
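
As a rough illustration of what masking those feature bits means on AMD, here is a sketch in C. It assumes the bit positions AMD documents for CPUID leaf 0x8000_0008, register EBX (IBPB bit 12, IBRS bit 14, STIBP bit 15, SSBD bit 24). The macro names and mask_amd_spec_bits() are hypothetical, the list covers only a subset of the speculation-related bits, and the committed change in sys/amd64/vmm/x86.c may structure this differently.

```c
#include <stdint.h>

/* CPUID 0x8000_0008 EBX bits (assumed from AMD's whitepapers). */
#define AMD_EBX_IBPB	(1u << 12)
#define AMD_EBX_IBRS	(1u << 14)
#define AMD_EBX_STIBP	(1u << 15)
#define AMD_EBX_SSBD	(1u << 24)

/*
 * Hypothetical helper: hide the bits that advertise speculation-control
 * MSRs the hypervisor does not emulate, leaving all other bits alone.
 */
static uint32_t
mask_amd_spec_bits(uint32_t host_ebx)
{
	return (host_ebx &
	    ~(AMD_EBX_IBPB | AMD_EBX_IBRS | AMD_EBX_STIBP | AMD_EBX_SSBD));
}
```

A guest that no longer sees these bits should not attempt the corresponding MSR writes, so it boots without needing -w.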
Comment 7 Rys Sommefeldt 2019-01-20 10:23:53 UTC
I patched my 12.0-RELEASE kernel with 343120 and 343166, and modern Linux guests now boot without needing -w in bhyve (ignore_bad_msr="yes" in vm-bhyve), in both uniprocessor and multiprocessor vCPU guest configs on my platform.

Thanks, Conrad! I'll file a new bug if anything else shows up, and reference this one if needed.
Comment 8 commit-hook freebsd_committer freebsd_triage 2019-07-12 22:32:06 UTC
A commit references this bug:

Author: jhb
Date: Fri Jul 12 22:31:15 UTC 2019
New revision: 349958
URL: https://svnweb.freebsd.org/changeset/base/349958

Log:
  MFC 339911,339936,343075,343166,348592: Various AMD CPU-specific fixes.

  339911:
  Emulate machine check related MSR_EXTFEATURES to allow guest OSes to
  boot on AMD FX Series.

  339936:
  Merge cases with upper block.
  This is a cosmetic change only to simplify code.

  343075:
  vmm(4): Take steps towards multicore bhyve AMD support

  vmm's CPUID emulation presented Intel topology information to the guest, but
  disabled AMD topology information and in some cases passed through garbage.
  I.e., CPUID leaves 0x8000_001[de] were passed through to the guest, but
  guest CPUs can migrate between host threads, so the information presented
  was not consistent.  This could easily be observed with 'cpucontrol -i 0xfoo
  /dev/cpuctl0'.

  Slightly improve this situation by enabling the AMD topology feature flag
  and presenting at least the CPUID fields used by FreeBSD itself to probe
  topology on more modern AMD64 hardware (Family 15h+).  Older stuff is
  probably less interesting.  I have not been able to empirically confirm it
  is sufficient, but it should not regress anything either.

  343166:
  vmm(4): Mask Spectre feature bits on AMD hosts

  For parity with Intel hosts, which already mask out the CPUID feature
  bits that indicate the presence of the SPEC_CTRL MSR, do the same on
  AMD.

  Eventually we may want to have a better support story for guests, but
  for now, limit the damage of incorrectly indicating an MSR we do not yet
  support.

  Eventually, we may want a generic CPUID override system for
  administrators, or for minimum supported feature set in heterogenous
  environments with failover.  That is a much larger scope effort than
  this bug fix.

  348592:
  Emulate the AMD MSR_LS_CFG MSR used for various Ryzen errata.

  Writes are ignored and reads always return zero.

  PR:		224476, 235010

Changes:
_U  stable/11/
  stable/11/sys/amd64/vmm/amd/svm_msr.c
  stable/11/sys/amd64/vmm/x86.c
  stable/11/sys/amd64/vmm/x86.h
  stable/11/sys/x86/x86/mp_x86.c
  stable/11/usr.sbin/bhyve/xmsr.c
_U  stable/12/
  stable/12/sys/amd64/vmm/amd/svm_msr.c
  stable/12/sys/amd64/vmm/x86.c
  stable/12/sys/amd64/vmm/x86.h
  stable/12/sys/x86/x86/mp_x86.c
  stable/12/usr.sbin/bhyve/xmsr.c