| Summary: | [kvm] FreeBSD 10 crashes as KVM guest on GNU/Linux on AMD family 10h CPUs | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | Base System | Reporter: | Simon Matter <simon.matter> | ||||||||
| Component: | amd64 | Assignee: | Alan Cox <alc> | ||||||||
| Status: | Closed FIXED | ||||||||||
| Severity: | Affects Only Me | ||||||||||
| Priority: | Normal | ||||||||||
| Version: | 10.0-RELEASE | ||||||||||
| Hardware: | Any | ||||||||||
| OS: | Any | ||||||||||
| Attachments: |
|
||||||||||
|
Description
Simon Matter
2014-01-23 23:10:01 UTC
Hi,
After thinking about it again, it seems the proposed solution may not be enough. KVM, at least, allows guests to be migrated from an Intel to an AMD processor. That means that when running as a VM guest, the "AMD Erratum 383" workaround must always be enabled. Otherwise, after migration to an affected AMD Family 10h processor, the guest could trigger AMD Erratum 383. I've tried to implement this, and the attached patch fixes the problem for me. It would be nice if someone with more experience than me could have a look at it.
Thanks,
Simon
As noted by John Baldwin, the change to mca.c is not needed. The attached patch is what I'm using now with success.
BTW: setting vm.pmap.pg_ps_enabled="0" in loader.conf also mitigates the issue, but I guess it's not the optimal solution.
Regards,
Simon
Can you please verify that the attached patch addresses the problem for you? Aside from addressing the crash, the objective of this patch is to avoid enabling the workaround in perpetuity on all past Intel and future AMD/Intel cores on account of one broken AMD core. The systems that I've seen for VM migration look at the reported processor feature sets and only migrate among machines with like feature sets. So, if the feature set includes at least one feature that is not supported by AMD Family 10h or earlier AMD cores, then we shouldn't need to enable the workaround.
Responsible Changed
From-To: freebsd-amd64->alc
Take over this PR.
> Can you please verify that the attached patch addresses the problem for
> you? Aside from addressing the crash, the objective of this patch is
> avoid enabling the workaround for perpetuity on all past Intel and
> future AMD/Intel cores on account of one broken AMD core. The systems
> that I've seen for VM migration look at the reported processor feature
> sets and only migrate among machines with like feature sets. So, if the
> feature set includes at least one feature that is not supported by AMD
> Family 10h or earlier AMD cores, then we shouldn't need to enable the
> workaround.
I can confirm that the patch works on my test system by enabling the AMD
erratum 383 workaround. But, AFAIK it means that the workaround will be
enabled for most KVM setups. That's because, by default, KVM uses a low
CPU feature set and a low cpu family (6) to be able to migrate a VM in
almost every case. Therefore it would make sense to have a sysctl option
to override the auto detected setting so that those who know what they are
doing can force the state of the erratum 383 workaround.
Regards,
Simon
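To illustrate the point about the low CPU family: a minimal userland sketch of a family-only test, assuming the usual layout of CPUID leaf 1 EAX (base family in bits 8-11, extended family in bits 20-27). The helper names and the two sample CPUID values are made up for illustration; the decode adds the two fields unconditionally, which as far as I know is what the kernel's CPUID_TO_FAMILY does. This is not the kernel code itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit fields of CPUID leaf 1 EAX (architectural layout). */
#define FAMILY_MASK     0x00000f00u
#define EXT_FAMILY_MASK 0x0ff00000u

/* Effective family = base family + extended family. */
static unsigned
cpuid_family(uint32_t cpu_id)
{
        return (((cpu_id & FAMILY_MASK) >> 8) +
            ((cpu_id & EXT_FAMILY_MASK) >> 20));
}

/*
 * Family-only test (hypothetical helper): true only when the
 * *reported* vendor is AMD and the *reported* family is 10h.
 */
static bool
family_check_enables_workaround(bool is_guest, bool vendor_is_amd,
    uint32_t cpu_id)
{
        return (is_guest && vendor_is_amd &&
            cpuid_family(cpu_id) == 0x10);
}
```

With an illustrative Family 10h value such as 0x00100f43 the test fires, but with a family-6 value such as 0x00000623, as a hypervisor might report for migration compatibility, it does not, even if the physical processor underneath is an affected Family 10h part.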
> Can you please verify that the attached patch addresses the problem for
> you? Aside from addressing the crash, the objective of this patch is
> avoid enabling the workaround for perpetuity on all past Intel and
> future AMD/Intel cores on account of one broken AMD core. The systems
> that I've seen for VM migration look at the reported processor feature
> sets and only migrate among machines with like feature sets. So, if the
> feature set includes at least one feature that is not supported by AMD
> Family 10h or earlier AMD cores, then we shouldn't need to enable the
> workaround.
What about keeping the whole CPU detection as it is and just make
erratum383 a tunable sysctl so that those who run an affected
configuration can enable it? I've tried it like below and put
hw.mca.erratum383="1" into loader.conf.
--- sys/x86/x86/mca.c.orig 2014-01-16 21:35:03.000000000 +0100
+++ sys/x86/x86/mca.c 2014-02-18 11:34:07.189148894 +0100
@@ -100,9 +100,10 @@
 SYSCTL_INT(_hw_mca, OID_AUTO, amd10h_L1TP, CTLFLAG_RDTUN, &amd10h_L1TP, 0,
     "Administrative toggle for logging of level one TLB parity (L1TP) errors");
 
-int workaround_erratum383;
-SYSCTL_INT(_hw_mca, OID_AUTO, erratum383, CTLFLAG_RD, &workaround_erratum383, 0,
-    "Is the workaround for Erratum 383 on AMD Family 10h processors enabled?");
+int workaround_erratum383 = 0;
+TUNABLE_INT("hw.mca.erratum383", &workaround_erratum383);
+SYSCTL_INT(_hw_mca, OID_AUTO, erratum383, CTLFLAG_RDTUN, &workaround_erratum383, 0,
+    "Administrative toggle for enabling workaround for Erratum 383 on AMD Family 10h processors");
 
 static STAILQ_HEAD(, mca_internal) mca_freelist;
 static int mca_freecount;
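For readers who want the override idea in isolation: a small userland sketch of the precedence rule asked for earlier in this thread, where an administrator-supplied value, if present and well-formed, wins over the auto-detected state. It models only that rule, not the exact behaviour of the patch above. The function name is hypothetical, and the setting is passed in as a plain string standing in for a loader.conf value.

```c
#include <stdlib.h>

/*
 * Return the effective setting: the administrator's value if the
 * (hypothetical) setting string is present and parses as 0 or 1,
 * otherwise the auto-detected value.
 */
static int
effective_workaround(const char *setting, int autodetected)
{
        char *end;
        long v;

        if (setting == NULL || *setting == '\0')
                return (autodetected);
        v = strtol(setting, &end, 10);
        if (*end != '\0' || (v != 0 && v != 1))
                return (autodetected);  /* ignore malformed input */
        return ((int)v);
}
```

Under this rule a setting of "0" disables the workaround even when detection would enable it, and "1" enables it even when detection would not. Whether the default should be "off unless forced on" or "on unless forced off" is exactly the design question raised in the reply that follows.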
On 02/18/2014 07:33, Simon Matter wrote:
>> Can you please verify that the attached patch addresses the problem for
>> you? Aside from addressing the crash, the objective of this patch is
>> avoid enabling the workaround for perpetuity on all past Intel and
>> future AMD/Intel cores on account of one broken AMD core. The systems
>> that I've seen for VM migration look at the reported processor feature
>> sets and only migrate among machines with like feature sets. So, if the
>> feature set includes at least one feature that is not supported by AMD
>> Family 10h or earlier AMD cores, then we shouldn't need to enable the
>> workaround.
> What about keeping the whole CPU detection as it is and just make
> erratum383 a tunable sysctl so that those who run an affected
> configuration can enable it? I've tried it like below and put
> hw.mca.erratum383="1" into loader.conf.
I would prefer to do the opposite, that is, automatically enable the
workaround if the kernel can't prove that the underlying processor is
unaffected and provide a tunable for blocking the activation of the
workaround in pmap.c.
> --- sys/x86/x86/mca.c.orig 2014-01-16 21:35:03.000000000 +0100
> +++ sys/x86/x86/mca.c 2014-02-18 11:34:07.189148894 +0100
> @@ -100,9 +100,10 @@
>  SYSCTL_INT(_hw_mca, OID_AUTO, amd10h_L1TP, CTLFLAG_RDTUN, &amd10h_L1TP, 0,
>      "Administrative toggle for logging of level one TLB parity (L1TP) errors");
>
> -int workaround_erratum383;
> -SYSCTL_INT(_hw_mca, OID_AUTO, erratum383, CTLFLAG_RD, &workaround_erratum383, 0,
> -    "Is the workaround for Erratum 383 on AMD Family 10h processors enabled?");
> +int workaround_erratum383 = 0;
> +TUNABLE_INT("hw.mca.erratum383", &workaround_erratum383);
> +SYSCTL_INT(_hw_mca, OID_AUTO, erratum383, CTLFLAG_RDTUN, &workaround_erratum383, 0,
> +    "Administrative toggle for enabling workaround for Erratum 383 on AMD Family 10h processors");
>
>  static STAILQ_HEAD(, mca_internal) mca_freelist;
>  static int mca_freecount;
On 02/17/2014 17:25, Simon Matter wrote:
>> Can you please verify that the attached patch addresses the problem for
>> you? Aside from addressing the crash, the objective of this patch is
>> avoid enabling the workaround for perpetuity on all past Intel and
>> future AMD/Intel cores on account of one broken AMD core. The systems
>> that I've seen for VM migration look at the reported processor feature
>> sets and only migrate among machines with like feature sets. So, if the
>> feature set includes at least one feature that is not supported by AMD
>> Family 10h or earlier AMD cores, then we shouldn't need to enable the
>> workaround.
> I can confirm that the patch works on my test system by enabling the AMD
> erratum 383 workaround. But, AFAIK it means that the workaround will be
> enabled for most KVM setups. That's because, by default, KVM uses a low
> CPU feature set and a low cpu family (6) to be able to migrate a VM in
> almost every case. Therefore it would make sense to have a sysctl option
> to override the auto detected setting so that those who know what they are
> doing can force the state of the erratum 383 workaround.
>
If you're restricted to Family 6 features, then you're losing some
features that will have more impact on performance than the overhead of
this workaround. For example, in just the pmap, you're losing support
for 1 GB pages, which are used to implement the direct map, and you're
losing the population count instruction, which sees moderate use on one
of the pmap data structures. So, I don't think it's worth expending too
much effort on avoiding the workaround on older processors.
Author: alc
Date: Sat Feb 22 18:53:42 2014
New Revision: 262338
URL: http://svnweb.freebsd.org/changeset/base/262338
Log:
  When the kernel is running in a virtual machine, it cannot rely upon the
  processor family to determine if the workaround for AMD Family 10h Erratum
  383 should be enabled. To enable virtual machine migration among a
  heterogeneous collection of physical machines, the hypervisor may have
  been configured to report an older processor family with a reduced feature
  set. Effectively, the reported processor family and its features are like
  a "least common denominator" for the collection of machines.
  Therefore, when the kernel is running in a virtual machine, instead of
  relying upon the processor family, we now test for features that prove
  that the underlying processor is not affected by the erratum. (The
  features that we test for are unlikely to ever be emulated in software on
  an affected physical processor.)
  PR: 186061
  Tested by: Simon Matter
  Discussed with: jhb, neel
  MFC after: 2 weeks
Modified:
  head/sys/amd64/amd64/pmap.c
  head/sys/i386/i386/pmap.c
Modified: head/sys/amd64/amd64/pmap.c
==============================================================================
--- head/sys/amd64/amd64/pmap.c Sat Feb 22 17:51:10 2014 (r262337)
+++ head/sys/amd64/amd64/pmap.c Sat Feb 22 18:53:42 2014 (r262338)
@@ -1005,12 +1005,18 @@ pmap_init(void)
         }
 
         /*
-         * If the kernel is running in a virtual machine on an AMD Family 10h
-         * processor, then it must assume that MCA is enabled by the virtual
-         * machine monitor.
-         */
-        if (vm_guest == VM_GUEST_VM && cpu_vendor_id == CPU_VENDOR_AMD &&
-            CPUID_TO_FAMILY(cpu_id) == 0x10)
+         * If the kernel is running on a virtual machine, then it must assume
+         * that MCA is enabled by the hypervisor. Moreover, the kernel must
+         * be prepared for the hypervisor changing the vendor and family that
+         * are reported by CPUID. Consequently, the workaround for AMD Family
+         * 10h Erratum 383 is enabled if the processor's feature set does not
+         * include at least one feature that is only supported by older Intel
+         * or newer AMD processors.
+         */
+        if (vm_guest == VM_GUEST_VM && (cpu_feature & CPUID_SS) == 0 &&
+            (cpu_feature2 & (CPUID2_SSSE3 | CPUID2_SSE41 | CPUID2_AESNI |
+            CPUID2_AVX | CPUID2_XSAVE)) == 0 && (amd_feature2 & (AMDID2_XOP |
+            AMDID2_FMA4)) == 0)
                 workaround_erratum383 = 1;
 
         /*
Modified: head/sys/i386/i386/pmap.c
==============================================================================
--- head/sys/i386/i386/pmap.c Sat Feb 22 17:51:10 2014 (r262337)
+++ head/sys/i386/i386/pmap.c Sat Feb 22 18:53:42 2014 (r262338)
@@ -750,12 +750,18 @@ pmap_init(void)
         pv_entry_high_water = 9 * (pv_entry_max / 10);
 
         /*
-         * If the kernel is running in a virtual machine on an AMD Family 10h
-         * processor, then it must assume that MCA is enabled by the virtual
-         * machine monitor.
-         */
-        if (vm_guest == VM_GUEST_VM && cpu_vendor_id == CPU_VENDOR_AMD &&
-            CPUID_TO_FAMILY(cpu_id) == 0x10)
+         * If the kernel is running on a virtual machine, then it must assume
+         * that MCA is enabled by the hypervisor. Moreover, the kernel must
+         * be prepared for the hypervisor changing the vendor and family that
+         * are reported by CPUID. Consequently, the workaround for AMD Family
+         * 10h Erratum 383 is enabled if the processor's feature set does not
+         * include at least one feature that is only supported by older Intel
+         * or newer AMD processors.
+         */
+        if (vm_guest == VM_GUEST_VM && (cpu_feature & CPUID_SS) == 0 &&
+            (cpu_feature2 & (CPUID2_SSSE3 | CPUID2_SSE41 | CPUID2_AESNI |
+            CPUID2_AVX | CPUID2_XSAVE)) == 0 && (amd_feature2 & (AMDID2_XOP |
+            AMDID2_FMA4)) == 0)
                 workaround_erratum383 = 1;
 
         /*
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Is there any reason this did not get MFC'd after 2 weeks? I'd really appreciate seeing the fix in 10-STABLE, maybe even as a bugfix before 10.1-R via freebsd-update.
Author: alc
Date: Wed May 7 00:32:49 2014
New Revision: 265476
URL: http://svnweb.freebsd.org/changeset/base/265476
Log:
  MFC r262338
  When the kernel is running in a virtual machine, it cannot rely upon the
  processor family to determine if the workaround for AMD Family 10h Erratum
  383 should be enabled. To enable virtual machine migration among a
  heterogeneous collection of physical machines, the hypervisor may have
  been configured to report an older processor family with a reduced feature
  set. Effectively, the reported processor family and its features are like
  a "least common denominator" for the collection of machines.
  Therefore, when the kernel is running in a virtual machine, instead of
  relying upon the processor family, we now test for features that prove
  that the underlying processor is not affected by the erratum. (The
  features that we test for are unlikely to ever be emulated in software on
  an affected physical processor.)
  PR: 186061
Modified:
  stable/10/sys/amd64/amd64/pmap.c
  stable/10/sys/i386/i386/pmap.c
Directory Properties:
  stable/10/ (props changed)
Modified: stable/10/sys/amd64/amd64/pmap.c
==============================================================================
--- stable/10/sys/amd64/amd64/pmap.c Tue May 6 23:28:48 2014 (r265475)
+++ stable/10/sys/amd64/amd64/pmap.c Wed May 7 00:32:49 2014 (r265476)
@@ -1008,12 +1008,18 @@ pmap_init(void)
         }
 
         /*
-         * If the kernel is running in a virtual machine on an AMD Family 10h
-         * processor, then it must assume that MCA is enabled by the virtual
-         * machine monitor.
-         */
-        if (vm_guest == VM_GUEST_VM && cpu_vendor_id == CPU_VENDOR_AMD &&
-            CPUID_TO_FAMILY(cpu_id) == 0x10)
+         * If the kernel is running on a virtual machine, then it must assume
+         * that MCA is enabled by the hypervisor. Moreover, the kernel must
+         * be prepared for the hypervisor changing the vendor and family that
+         * are reported by CPUID. Consequently, the workaround for AMD Family
+         * 10h Erratum 383 is enabled if the processor's feature set does not
+         * include at least one feature that is only supported by older Intel
+         * or newer AMD processors.
+         */
+        if (vm_guest == VM_GUEST_VM && (cpu_feature & CPUID_SS) == 0 &&
+            (cpu_feature2 & (CPUID2_SSSE3 | CPUID2_SSE41 | CPUID2_AESNI |
+            CPUID2_AVX | CPUID2_XSAVE)) == 0 && (amd_feature2 & (AMDID2_XOP |
+            AMDID2_FMA4)) == 0)
                 workaround_erratum383 = 1;
 
         /*
Modified: stable/10/sys/i386/i386/pmap.c
==============================================================================
--- stable/10/sys/i386/i386/pmap.c Tue May 6 23:28:48 2014 (r265475)
+++ stable/10/sys/i386/i386/pmap.c Wed May 7 00:32:49 2014 (r265476)
@@ -752,12 +752,18 @@ pmap_init(void)
         pv_entry_high_water = 9 * (pv_entry_max / 10);
 
         /*
-         * If the kernel is running in a virtual machine on an AMD Family 10h
-         * processor, then it must assume that MCA is enabled by the virtual
-         * machine monitor.
-         */
-        if (vm_guest == VM_GUEST_VM && cpu_vendor_id == CPU_VENDOR_AMD &&
-            CPUID_TO_FAMILY(cpu_id) == 0x10)
+         * If the kernel is running on a virtual machine, then it must assume
+         * that MCA is enabled by the hypervisor. Moreover, the kernel must
+         * be prepared for the hypervisor changing the vendor and family that
+         * are reported by CPUID. Consequently, the workaround for AMD Family
+         * 10h Erratum 383 is enabled if the processor's feature set does not
+         * include at least one feature that is only supported by older Intel
+         * or newer AMD processors.
+         */
+        if (vm_guest == VM_GUEST_VM && (cpu_feature & CPUID_SS) == 0 &&
+            (cpu_feature2 & (CPUID2_SSSE3 | CPUID2_SSE41 | CPUID2_AESNI |
+            CPUID2_AVX | CPUID2_XSAVE)) == 0 && (amd_feature2 & (AMDID2_XOP |
+            AMDID2_FMA4)) == 0)
                 workaround_erratum383 = 1;
 
         /*
Author: alc
Date: Wed May 7 15:52:41 2014
New Revision: 265554
URL: http://svnweb.freebsd.org/changeset/base/265554
Log:
  MFC r262338
  When the kernel is running in a virtual machine, it cannot rely upon the
  processor family to determine if the workaround for AMD Family 10h Erratum
  383 should be enabled. To enable virtual machine migration among a
  heterogeneous collection of physical machines, the hypervisor may have
  been configured to report an older processor family with a reduced feature
  set. Effectively, the reported processor family and its features are like
  a "least common denominator" for the collection of machines.
  Therefore, when the kernel is running in a virtual machine, instead of
  relying upon the processor family, we now test for features that prove
  that the underlying processor is not affected by the erratum. (The
  features that we test for are unlikely to ever be emulated in software on
  an affected physical processor.)
  PR: 186061
Modified:
  stable/9/sys/amd64/amd64/pmap.c
  stable/9/sys/i386/i386/pmap.c
Directory Properties:
  stable/9/sys/ (props changed)
Modified: stable/9/sys/amd64/amd64/pmap.c
==============================================================================
--- stable/9/sys/amd64/amd64/pmap.c Wed May 7 15:34:04 2014 (r265553)
+++ stable/9/sys/amd64/amd64/pmap.c Wed May 7 15:52:41 2014 (r265554)
@@ -797,12 +797,18 @@ pmap_init(void)
         }
 
         /*
-         * If the kernel is running in a virtual machine on an AMD Family 10h
-         * processor, then it must assume that MCA is enabled by the virtual
-         * machine monitor.
-         */
-        if (vm_guest == VM_GUEST_VM && cpu_vendor_id == CPU_VENDOR_AMD &&
-            CPUID_TO_FAMILY(cpu_id) == 0x10)
+         * If the kernel is running on a virtual machine, then it must assume
+         * that MCA is enabled by the hypervisor. Moreover, the kernel must
+         * be prepared for the hypervisor changing the vendor and family that
+         * are reported by CPUID. Consequently, the workaround for AMD Family
+         * 10h Erratum 383 is enabled if the processor's feature set does not
+         * include at least one feature that is only supported by older Intel
+         * or newer AMD processors.
+         */
+        if (vm_guest == VM_GUEST_VM && (cpu_feature & CPUID_SS) == 0 &&
+            (cpu_feature2 & (CPUID2_SSSE3 | CPUID2_SSE41 | CPUID2_AESNI |
+            CPUID2_AVX | CPUID2_XSAVE)) == 0 && (amd_feature2 & (AMDID2_XOP |
+            AMDID2_FMA4)) == 0)
                 workaround_erratum383 = 1;
 
         /*
Modified: stable/9/sys/i386/i386/pmap.c
==============================================================================
--- stable/9/sys/i386/i386/pmap.c Wed May 7 15:34:04 2014 (r265553)
+++ stable/9/sys/i386/i386/pmap.c Wed May 7 15:52:41 2014 (r265554)
@@ -768,12 +768,18 @@ pmap_init(void)
         pv_entry_high_water = 9 * (pv_entry_max / 10);
 
         /*
-         * If the kernel is running in a virtual machine on an AMD Family 10h
-         * processor, then it must assume that MCA is enabled by the virtual
-         * machine monitor.
-         */
-        if (vm_guest == VM_GUEST_VM && cpu_vendor_id == CPU_VENDOR_AMD &&
-            CPUID_TO_FAMILY(cpu_id) == 0x10)
+         * If the kernel is running on a virtual machine, then it must assume
+         * that MCA is enabled by the hypervisor. Moreover, the kernel must
+         * be prepared for the hypervisor changing the vendor and family that
+         * are reported by CPUID. Consequently, the workaround for AMD Family
+         * 10h Erratum 383 is enabled if the processor's feature set does not
+         * include at least one feature that is only supported by older Intel
+         * or newer AMD processors.
+         */
+        if (vm_guest == VM_GUEST_VM && (cpu_feature & CPUID_SS) == 0 &&
+            (cpu_feature2 & (CPUID2_SSSE3 | CPUID2_SSE41 | CPUID2_AESNI |
+            CPUID2_AVX | CPUID2_XSAVE)) == 0 && (amd_feature2 & (AMDID2_XOP |
+            AMDID2_FMA4)) == 0)
                 workaround_erratum383 = 1;
 
         /*
Author: alc
Date: Wed May 7 16:28:36 2014
New Revision: 265556
URL: http://svnweb.freebsd.org/changeset/base/265556
Log:
  MFC r262338
  When the kernel is running in a virtual machine, it cannot rely upon the
  processor family to determine if the workaround for AMD Family 10h Erratum
  383 should be enabled. To enable virtual machine migration among a
  heterogeneous collection of physical machines, the hypervisor may have
  been configured to report an older processor family with a reduced feature
  set. Effectively, the reported processor family and its features are like
  a "least common denominator" for the collection of machines.
  Therefore, when the kernel is running in a virtual machine, instead of
  relying upon the processor family, we now test for features that prove
  that the underlying processor is not affected by the erratum. (The
  features that we test for are unlikely to ever be emulated in software on
  an affected physical processor.)
  PR: 186061
Modified:
  stable/8/sys/amd64/amd64/pmap.c
  stable/8/sys/i386/i386/pmap.c
Directory Properties:
  stable/8/sys/ (props changed)
  stable/8/sys/amd64/ (props changed)
  stable/8/sys/i386/ (props changed)
Modified: stable/8/sys/amd64/amd64/pmap.c
==============================================================================
--- stable/8/sys/amd64/amd64/pmap.c Wed May 7 16:16:49 2014 (r265555)
+++ stable/8/sys/amd64/amd64/pmap.c Wed May 7 16:28:36 2014 (r265556)
@@ -706,12 +706,18 @@ pmap_init(void)
         }
 
         /*
-         * If the kernel is running in a virtual machine on an AMD Family 10h
-         * processor, then it must assume that MCA is enabled by the virtual
-         * machine monitor.
-         */
-        if (vm_guest == VM_GUEST_VM && cpu_vendor_id == CPU_VENDOR_AMD &&
-            CPUID_TO_FAMILY(cpu_id) == 0x10)
+         * If the kernel is running on a virtual machine, then it must assume
+         * that MCA is enabled by the hypervisor. Moreover, the kernel must
+         * be prepared for the hypervisor changing the vendor and family that
+         * are reported by CPUID. Consequently, the workaround for AMD Family
+         * 10h Erratum 383 is enabled if the processor's feature set does not
+         * include at least one feature that is only supported by older Intel
+         * or newer AMD processors.
+         */
+        if (vm_guest == VM_GUEST_VM && (cpu_feature & CPUID_SS) == 0 &&
+            (cpu_feature2 & (CPUID2_SSSE3 | CPUID2_SSE41 | CPUID2_AESNI |
+            CPUID2_AVX | CPUID2_XSAVE)) == 0 && (amd_feature2 & (AMDID2_XOP |
+            AMDID2_FMA4)) == 0)
                 workaround_erratum383 = 1;
 
         /*
Modified: stable/8/sys/i386/i386/pmap.c
==============================================================================
--- stable/8/sys/i386/i386/pmap.c Wed May 7 16:16:49 2014 (r265555)
+++ stable/8/sys/i386/i386/pmap.c Wed May 7 16:28:36 2014 (r265556)
@@ -757,12 +757,18 @@ pmap_init(void)
         pv_entry_high_water = 9 * (pv_entry_max / 10);
 
         /*
-         * If the kernel is running in a virtual machine on an AMD Family 10h
-         * processor, then it must assume that MCA is enabled by the virtual
-         * machine monitor.
-         */
-        if (vm_guest == VM_GUEST_VM && cpu_vendor_id == CPU_VENDOR_AMD &&
-            CPUID_TO_FAMILY(cpu_id) == 0x10)
+         * If the kernel is running on a virtual machine, then it must assume
+         * that MCA is enabled by the hypervisor. Moreover, the kernel must
+         * be prepared for the hypervisor changing the vendor and family that
+         * are reported by CPUID. Consequently, the workaround for AMD Family
+         * 10h Erratum 383 is enabled if the processor's feature set does not
+         * include at least one feature that is only supported by older Intel
+         * or newer AMD processors.
+         */
+        if (vm_guest == VM_GUEST_VM && (cpu_feature & CPUID_SS) == 0 &&
+            (cpu_feature2 & (CPUID2_SSSE3 | CPUID2_SSE41 | CPUID2_AESNI |
+            CPUID2_AVX | CPUID2_XSAVE)) == 0 && (amd_feature2 & (AMDID2_XOP |
+            AMDID2_FMA4)) == 0)
                 workaround_erratum383 = 1;
 
         /*
|
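The committed test can be tried outside the kernel. Below is a minimal userland sketch of the same predicate, assuming the architectural CPUID bit positions (leaf 1 EDX/ECX and leaf 0x80000001 ECX). The macro and function names are made up for this example and are not the kernel's. Bare-metal detection of Family 10h is outside the sketch, so it simply returns false when not running as a guest.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.1:EDX (assumed bit position: self snoop is bit 27) */
#define F_SS      0x08000000u
/* CPUID.1:ECX (assumed bit positions) */
#define F2_SSSE3  0x00000200u   /* bit 9 */
#define F2_SSE41  0x00080000u   /* bit 19 */
#define F2_AESNI  0x02000000u   /* bit 25 */
#define F2_XSAVE  0x04000000u   /* bit 26 */
#define F2_AVX    0x10000000u   /* bit 28 */
/* CPUID.80000001h:ECX (assumed bit positions) */
#define A2_XOP    0x00000800u   /* bit 11 */
#define A2_FMA4   0x00010000u   /* bit 16 */

/*
 * Hypothetical helper: a guest needs the workaround unless at least
 * one reported feature shows the host cannot be an affected
 * AMD Family 10h processor.
 */
static bool
guest_needs_erratum383(bool is_guest, uint32_t feature,
    uint32_t feature2, uint32_t amd_feature2)
{
        if (!is_guest)
                return (false);
        return ((feature & F_SS) == 0 &&
            (feature2 & (F2_SSSE3 | F2_SSE41 | F2_AESNI | F2_AVX |
            F2_XSAVE)) == 0 &&
            (amd_feature2 & (A2_XOP | A2_FMA4)) == 0);
}
```

A guest that reports none of the listed features gets the workaround, while a single listed feature (for example SSSE3 or FMA4) is enough to skip it. Features that Family 10h does support, such as POPCNT (bit 23 of CPUID.1:ECX), are deliberately not in the mask and so do not suppress the workaround.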