Summary: | cpucontrol uses unsafe procedure to detect current microcode version | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | Base System | Reporter: | Henrique de Moraes Holschuh <hmh+freebsd> | ||||||
Component: | bin | Assignee: | freebsd-bugs (Nobody) <bugs> | ||||||
Status: | Closed FIXED | ||||||||
Severity: | Affects Many People | CC: | chris, dan+freebsd.org, eadler, emaste, gonzo, mhorne063, moon300web, op, sblachmann | ||||||
Priority: | --- | ||||||||
Version: | CURRENT | ||||||||
Hardware: | Any | ||||||||
OS: | Any | ||||||||
Attachments: |
|
Description
Henrique de Moraes Holschuh
2014-08-07 19:07:15 UTC
(In reply to Henrique de Moraes Holschuh from comment #0) > the impact of getting garbage from MSR_BIOS_SIGN includes ... > ... incorrectly refusing to upgrade microcode. This is confirmed. As the cpucontrol program was paid work sponsored by the FreeBSD foundation, it seems to be sacrosanct in spite of being spaghetti. Maybe this is the reason apparently nobody dares to touch it? Or even replace its identification and updating code core with a better program like cpupdate? (see below) Anyway, the fix for this issue would be to add this between line 117 and 118 in intel.c: msrargs.msr = MSR_BIOS_SIGN; msrargs.data = 0; error = ioctl( cpufd, CPUCTL_WRMSR, &msrargs); // note: cpucontrol's error messages are stupid and // do not help the user find what is wrong. if (error < 0) { WARN(0, "ioctl(%s)", dev); goto fail; } The other major issue, usage of undefined reserved bits, can lead to similar problems. (Hint: see lines 248 and 254 - there is missing some bitwise ANDing) As my opinion is that this spaghetti code should be scrapped instead of being fixed, I am not eager to take the effort to look into the Intel programmers' manual to tell you what would belong there. I wrote cpupdate, which does not have these issues, and works with the new composite Intel microcode file format. This will be the standard format after Intel dropped the legacy microcode.dat and .fw format. If you want me to rework and tidy up cpupdate so it follows FreeBSD kernel coding guidelines and add some functions to make it replace the buggy/obsolete sections of cpucontrol, for including it in FreeBSD or HardenedBSD base, I would be willing to invest time into that. Links: https://github.com/kernschmelze/cpupdate https://forums.freebsd.org/threads/introducing-cpupdate-a-microcode-tool-for-freebsd.64588 https://groups.google.com/forum/#!topic/mpc.lists.freebsd.hackers/JTtw6TNQqng This procedure is not handled by either the cpucontrol or cpupdate utilities but instead by the underlying driver cpuctl. Inspection of the cpuctl code shows that this is being handled properly in the update_intel() function. sys/dev/cpuctl/cpuctl.c starting at line 366: critical_enter(); rdmsr_safe(MSR_BIOS_SIGN, &rev0); /* Get current microcode revision. */ /* * Perform update. */ wrmsr_safe(MSR_BIOS_UPDT_TRIG, (uintptr_t)(ptr)); wrmsr_safe(MSR_BIOS_SIGN, 0); /* * Serialize instruction flow. */ do_cpuid(0, tmp); critical_exit(); rdmsr_safe(MSR_BIOS_SIGN, &rev1); /* Get new microcode revision. */ The section of the SDM mentioned defines a procedure to safely determine the signature after a microcode update but it doesn't seem to be necessary to perform it each time the MSR is queried. (In reply to Mitchell Horne from comment #2) The PR is *not* about the updater code in the cpuctl module. It is about the missing initialization in microprocessor detection code in cpucontrol, which is executed to check whether an update is necessary at all. Basically it is like using uninitialized variable. Please re-read the PR carefully, to not misunderstand the problem. Well, no one has provided the patch, so I will try one. Let's allow me to summarize first. --- ------------ --- part of --- Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3A: --- System Programming Guide, Part 1 --- paragraph 9.11.7.1 Determining the Signature --- ------------ CPUID returns a value in a model specific register in addition to its usual register return values. The semantics of CPUID cause it to deposit an update ID value in the 64-bit model-specific register at address 08BH (IA32_BIOS_SIGN_ID). If no update is present in the processor, the value in the MSR remains unmodified. The BIOS must pre-load a zero into the MSR before executing CPUID. If a read of the MSR at 8BH still returns zero after executing CPUID, this indicates that no update is present. --- ------------ It's clear description - IA32_BIOS_SIGN_ID needs to be initialized to zero before CPUID or it's value can't be trusted after CPUID. ===== CPUCONTROL ==================================== --- ------------ --- releng/11.1/usr.sbin/cpucontrol/intel.c 245491 2013-01-16 05:00:51Z eadler --- intel_update() --- ------------ Such function calls CPUID (line 116) with no prior initialization of IA32_BIOS_SIGN_ID, then read MSR_BIOS_SIGN (line 133) considering returned value "revision". Such revision is later compared with the version of patch in attempt to decide the particular patch is newer than the one installed in CPU. This implementation is INCORRECT. On unpatched procesor, the value of IA32_BIOS_SIGN_ID remain unchanged, thus the 'revision' is undefined random value. Because of it, the patch candidate may be considered old. In such case, CPU remain in virgin state, with no attempt to update. I would like to note that the comment on sys/dev/cpuctl/cpuctl.c line 259 is unrelated to the issue we are speaking on. It's probably just copy&paste echo of line 214 Suggested patch: initialize IA32_BIOS_SIGN_ID to 0 before CPUID; patch attached Note - IA32_BIOS_SIGN_ID shall be initialized just before CPUID and read just after it - to be on safe site. Or we are in risk someone else will modify IA32_BIOS_SIGN_ID in the mean time. Userland code can't do it reliably. So the proposed patch need to be considered workaround. Final solution require new CPUCTL_x that will do WRMSR+CPUID+RDMSR sequence enclosed in critical_enter() and critical_exit() ===== CPUCTL_UPDATE ==================================== Despite this PR is not related to CPUCTL_UPDATE, I would like to make short notice claiming the implementation is broken as well and Mitchell Horne is not true. But this broken implementation doesn't cause the eligible update is skipped. --- ------------ --- releng/11.1/sys/dev/cpuctl/cpuctl.c 330908 2018-03-14 04:00:00Z gordon --- ------------ CPUCTL_UPDATE, e.g. cpuctl_do_update() starts with CPUID (line 311) with no prior initialization of IA32_BIOS_SIGN_ID. Then it reads such MSR (line 365) into rev0 variable. On virgin CPU it will receive random garbage. Then the code do update (line 370) then it try to read IA32_BIOS_SIGN_ID into rev1 value. Unfortunately it uses CPUID with EAX=0 instead of (see example 9-9 in Intel's SDK) CPUID with EAX=1. So even the rev1 content is undefined. Finally, it return 0 if rev1 > 0, otherwise it returns EEXIT. Garbage in rev0 and rev1 may cause EEXIST error even in the case the update has succeeded. Suggested patch: we can't initialize IA32_BIOS_SIGN_ID before CPUID (line 311) as it's called in Intel/AMD common code and must not touch Intel-only MSR here. Thus we need initialize initialize IA32_BIOS_SIGN_ID then call CPUID again in update_intel(), CPUID with EAX=1 needs to be called to read rev1, and related RDMSR should be part of critical section; patch attached Created attachment 191850 [details] Patch for cpucontrol to read Intel CPU revision correctly See comment #4 for details Created attachment 191851 [details] Patch for cpuctl to read Intel CPU revision correctly See comment #4 for details > As the cpucontrol program was paid work sponsored by the FreeBSD foundation,
This is untrue; the Foundation sponsored changes to cpucontrol as part of other work but the program already existed.
I will take a look at this.
Ed, this would be great. May I point you at another issue: in the current implementation, the eval_cpu_features system call output from cpuctl.ko goes to the console, spamming a half screenful of information - for each and every core again. This can be a *lot* of dmesg spam. I would be willing to address this too when merging-in Dan's suggested changes. Warner suggested that cpupdate being put on phabricator, but I haven't heard from him for a while. He seems very busy and told me that my initial reply got into his spam folder somehow. Could you maybe cc him and ask him whether he got my email from April 24th? What about putting cpupdate, the updated cpuctl and cpucontrol on phabricator then, so these can be reviewed together? A commit references this bug: Author: emaste Date: Sat May 12 15:34:35 UTC 2018 New revision: 333569 URL: https://svnweb.freebsd.org/changeset/base/333569 Log: cpucontrol: improve Intel microcode revision check According to the Intel SDM (Volme 3, 9.11.7) the BIOS signature MSR should be zeroed before executing cpuid (although in practice it does not seem to matter). PR: 192487 Submitted by: Dan Lukes Reported by: Henrique de Moraes Holschuh MFC after: 3 days Changes: head/usr.sbin/cpucontrol/intel.c A commit references this bug: Author: emaste Date: Wed Oct 10 15:54:01 UTC 2018 New revision: 339287 URL: https://svnweb.freebsd.org/changeset/base/339287 Log: MFC r333569: cpucontrol: improve Intel microcode revision check According to the Intel SDM (Volme 3, 9.11.7) the BIOS signature MSR should be zeroed before executing cpuid (although in practice it does not seem to matter). PR: 192487 Submitted by: Dan Lukes Reported by: Henrique de Moraes Holschuh Changes: _U stable/11/ stable/11/usr.sbin/cpucontrol/intel.c There is a commit referencing this PR, but it's still not closed and has been inactive for some time. Closing the PR as fixed but feel free to re-open it if the issue hasn't been completely resolved. Thanks ererf I'm updating my laptop's microcode as follows pkg install devcpu-data-intel vi /etc/loader.conf cpu_microcode_load="YES" cpu_microcode_name="/boot/firmware/06-8e-0c" (06-8e-0c downloaded from https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files/tree/main/intel-ucode) This seems to work: dmesg | grep -C 4 microcode VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID TSC: P-state invariant, performance statistics real memory = 8589934592 (8192 MB) avail memory = 7844245504 (7480 MB) CPU microcode: updated from 0xea to 0xec <---- System runs BIOS version 1.75, version 0xea, which is updated Event timer "LAPIC" quality 600 ACPI APIC Table: <LENOVO TP-N2I > FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs FreeBSD/SMP: 1 package(s) x 4 core(s) x 2 hardware threads cpupdate -i Found CPU(s) from Intel Core 0 to 7: CPUID: 806ec Fam 06 Mod 8e Step 0c Flag 80 uCode 000000ec If cpucontrol is run manually it seems to try and load an incorrect microcode file: cpucontrol -vv -u /dev/cpuctl0 cpucontrol: found cpu type 0 family 0x6 model 0xe stepping 0xc. .. /usr/local/share/cpucontrol/06-86-05.01: updating cpu /dev/cpuctl0 from rev 0xec to rev 0xb00000f... failed. <---- incorrect file! cpucontrol: ioctl(): File exists .. cpucontrol: skipping /usr/local/share/cpucontrol/06-8e-0c.94 of rev 0xea: up to date .. cpucontrol: skipping /usr/local/share/cpucontrol/06-8e-0c of rev 0xec: up to date There still seems to be something wrong in the detection of Intel CPU revision on FreeBSD 13.0 Running: FreeBSD 13.0-RELEASE-p8 on Lenovo T590 kernel: CPU: Intel(R) Core(TM) i5-8365U CPU @ 1.60GHz (1896.11-MHz K8-class CPU) kernel: Origin="GenuineIntel" Id=0x806ec Family=0x6 Model=0x8e Stepping=12 (In reply to moon300web from comment #13) Please submit a separate PR |