Bug 192487

Summary: cpucontrol uses unsafe procedure to detect current microcode version
Product: Base System Reporter: Henrique de Moraes Holschuh <hmh+freebsd>
Component: binAssignee: freebsd-bugs (Nobody) <bugs>
Status: Closed FIXED    
Severity: Affects Many People CC: chris, dan+freebsd.org, eadler, emaste, gonzo, mhorne063, moon300web, op, sblachmann
Priority: ---    
Version: CURRENT   
Hardware: Any   
OS: Any   
Attachments:
Description Flags
Patch for cpucontrol to read Intel CPU revision correctly
none
Patch for cpuctl to read Intel CPU revision correctly none

Description Henrique de Moraes Holschuh 2014-08-07 19:07:15 UTC
The Intel SDM, vol 3A, section 9.11.7.1, defines a procedure that should be followed to ensure a valid response from the System Processor when it is queried about the running microcode revision.

The userspace cpucontrol utility fails to follow this procedure. cpucontrol fails to pre-init MSR 0x8B (MSR_BIOS_SIGN) with zero and issue a cpuid(1), before it can get valid data from that same MSR.

Please refer to example 9-9 in the Intel SDM vol 3A page 9-36.

I have no idea how important this procedure is on current Intel processors, but the impact of getting garbage from MSR_BIOS_SIGN includes either incorrectly downgrading, or incorrectly refusing to upgrade microcode.

The bug is in file: usr.sbin/cpucontrol/intel.c
Comment 1 Stefan B. 2018-03-02 02:46:35 UTC
(In reply to Henrique de Moraes Holschuh from comment #0)

> the impact of getting garbage from MSR_BIOS_SIGN includes ...
> ... incorrectly refusing to upgrade microcode.

This is confirmed.

As the cpucontrol program was paid work sponsored by the FreeBSD foundation, it seems to be sacrosanct in spite of being spaghetti.
Maybe this is the reason apparently nobody dares to touch it?
Or even replace its identification and updating code core with a better program like cpupdate? (see below)

Anyway, the fix for this issue would be to add this between line 117 and 118 in intel.c:

   msrargs.msr = MSR_BIOS_SIGN;
   msrargs.data = 0;
   error = ioctl( cpufd, CPUCTL_WRMSR, &msrargs);
   // note: cpucontrol's error messages are stupid and 
   //       do not help the user find what is wrong.
   if (error < 0) {
     WARN(0, "ioctl(%s)", dev);
     goto fail;
   }

The other major issue, usage of undefined reserved bits, can lead to similar problems.
(Hint: see lines 248 and 254 - there is missing some bitwise ANDing)
As my opinion is that this spaghetti code should be scrapped instead of being fixed, I am not eager to take the effort to look into the Intel programmers' manual to tell you what would belong there.

I wrote cpupdate, which does not have these issues, and works with the new composite Intel microcode file format. This will be the standard format after Intel dropped the legacy microcode.dat and .fw format.

If you want me to rework and tidy up cpupdate so it follows FreeBSD kernel coding guidelines and add some functions to make it replace the buggy/obsolete sections of cpucontrol, for including it in FreeBSD or HardenedBSD base, I would be willing to invest time into that.


Links:
https://github.com/kernschmelze/cpupdate
https://forums.freebsd.org/threads/introducing-cpupdate-a-microcode-tool-for-freebsd.64588
https://groups.google.com/forum/#!topic/mpc.lists.freebsd.hackers/JTtw6TNQqng
Comment 2 Mitchell Horne 2018-03-02 16:07:48 UTC
This procedure is not handled by either the cpucontrol or cpupdate utilities but instead by the underlying driver cpuctl. Inspection of the cpuctl code shows that this is being handled properly in the update_intel() function.

sys/dev/cpuctl/cpuctl.c starting at line 366:

  critical_enter();
  rdmsr_safe(MSR_BIOS_SIGN, &rev0); /* Get current microcode revision. */

  /*
   * Perform update.
   */
  wrmsr_safe(MSR_BIOS_UPDT_TRIG, (uintptr_t)(ptr));
  wrmsr_safe(MSR_BIOS_SIGN, 0);

  /*
   * Serialize instruction flow.	
   */
  do_cpuid(0, tmp);
  critical_exit();
  rdmsr_safe(MSR_BIOS_SIGN, &rev1); /* Get new microcode revision. */

The section of the SDM mentioned defines a procedure to safely determine the signature after a microcode update but it doesn't seem to be necessary to perform it each time the MSR is queried.
Comment 3 Stefan B. 2018-03-03 21:40:19 UTC
(In reply to Mitchell Horne from comment #2)

The PR is *not* about the updater code in the cpuctl module.

It is about the missing initialization in microprocessor detection code in cpucontrol, which is executed to check whether an update is necessary at all.
Basically it is like using uninitialized variable.

Please re-read the PR carefully, to not misunderstand the problem.
Comment 4 Dan Lukes 2018-03-26 20:54:57 UTC
Well, no one has provided the patch, so I will try one. Let's allow me to summarize first.

 --- ------------
 --- part of 
 --- Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3A:
 --- System Programming Guide, Part 1
 --- paragraph 9.11.7.1 Determining the Signature
 --- ------------
CPUID returns a value in a model specific register in addition to its usual register return values. The semantics of CPUID cause it to deposit an update ID value in the 64-bit model-specific register at address 08BH (IA32_BIOS_SIGN_ID). If no update is present in the processor, the value in the MSR remains unmodified. The BIOS must pre-load a zero into the MSR before executing CPUID. If a read of the MSR at 8BH still returns zero after executing CPUID, this indicates that no update is present.
 --- ------------

It's clear description - IA32_BIOS_SIGN_ID needs to be initialized to zero before CPUID or it's value can't be trusted after CPUID. 

 ===== CPUCONTROL ====================================

 --- ------------
 --- releng/11.1/usr.sbin/cpucontrol/intel.c 245491 2013-01-16 05:00:51Z eadler
 --- intel_update()
 --- ------------

Such function calls CPUID (line 116) with no prior initialization of IA32_BIOS_SIGN_ID, then read MSR_BIOS_SIGN (line 133) considering returned value "revision". Such revision is later compared with the version of patch in attempt to decide the particular patch is newer than the one installed in CPU.

This implementation is INCORRECT. On unpatched procesor, the value of IA32_BIOS_SIGN_ID remain unchanged, thus the 'revision' is undefined random value. Because of it, the patch candidate may be considered old. In such case, CPU remain in virgin state, with no attempt to update.

I would like to note that the comment on sys/dev/cpuctl/cpuctl.c line 259 is unrelated to the issue we are speaking on. It's probably just copy&paste echo of line 214

Suggested patch: initialize IA32_BIOS_SIGN_ID to 0 before CPUID; patch attached
 
Note - IA32_BIOS_SIGN_ID shall be initialized just before CPUID and read just after it - to be on safe site. Or we are in risk someone else will modify IA32_BIOS_SIGN_ID in the mean time. Userland code can't do it reliably. So the proposed patch need to be considered workaround. Final solution require new CPUCTL_x that will do WRMSR+CPUID+RDMSR sequence enclosed in critical_enter() and critical_exit()  

 ===== CPUCTL_UPDATE ====================================
Despite this PR is not related to CPUCTL_UPDATE, I would like to make short notice claiming the implementation is broken as well and Mitchell Horne is not true. But this broken implementation doesn't cause the eligible update is skipped.

 --- ------------
 --- releng/11.1/sys/dev/cpuctl/cpuctl.c 330908 2018-03-14 04:00:00Z gordon
 --- ------------

CPUCTL_UPDATE, e.g. cpuctl_do_update() starts with CPUID (line 311) with no prior initialization of IA32_BIOS_SIGN_ID. Then it reads such MSR (line 365) into rev0 variable. On virgin CPU it will receive random garbage.  Then the code do update (line 370) then it try to read IA32_BIOS_SIGN_ID into rev1 value. Unfortunately it uses CPUID with EAX=0 instead of (see example 9-9 in Intel's SDK) CPUID with EAX=1. So even the rev1 content is undefined.  Finally, it return 0 if rev1 > 0, otherwise it returns EEXIT. 

Garbage in rev0 and rev1 may cause EEXIST error even in the case the update has succeeded.

Suggested patch: we can't initialize IA32_BIOS_SIGN_ID before CPUID (line 311) as it's called in Intel/AMD common code and must not touch Intel-only MSR here. Thus we need initialize initialize IA32_BIOS_SIGN_ID then call CPUID again in update_intel(), CPUID with EAX=1 needs to be called to read rev1, and related RDMSR should be part of critical section; patch attached
Comment 5 Dan Lukes 2018-03-26 20:58:03 UTC
Created attachment 191850 [details]
Patch for cpucontrol to read Intel CPU revision correctly

See comment #4 for details
Comment 6 Dan Lukes 2018-03-26 20:59:24 UTC
Created attachment 191851 [details]
Patch for cpuctl to read Intel CPU revision correctly

See comment #4 for details
Comment 7 Ed Maste freebsd_committer freebsd_triage 2018-05-10 19:44:15 UTC
> As the cpucontrol program was paid work sponsored by the FreeBSD foundation,

This is untrue; the Foundation sponsored changes to cpucontrol as part of other work but the program already existed.

I will take a look at this.
Comment 8 Stefan B. 2018-05-10 21:22:27 UTC
Ed, this would be great.

May I point you at another issue: in the current implementation, the eval_cpu_features system call output from cpuctl.ko goes to the console, spamming a half screenful of information - for each and every core again. This can be a *lot* of dmesg spam.

I would be willing to address this too when merging-in Dan's suggested changes.

Warner suggested that cpupdate being put on phabricator, but I haven't heard from him for a while. He seems very busy and told me that my initial reply got into his spam folder somehow. Could you maybe cc him and ask him whether he got my email from April 24th?

What about putting cpupdate, the updated cpuctl and cpucontrol on phabricator then, so these can be reviewed together?
Comment 9 commit-hook freebsd_committer freebsd_triage 2018-05-12 15:35:24 UTC
A commit references this bug:

Author: emaste
Date: Sat May 12 15:34:35 UTC 2018
New revision: 333569
URL: https://svnweb.freebsd.org/changeset/base/333569

Log:
  cpucontrol: improve Intel microcode revision check

  According to the Intel SDM (Volme 3, 9.11.7) the BIOS signature MSR
  should be zeroed before executing cpuid (although in practice it does
  not seem to matter).

  PR:		192487
  Submitted by:	Dan Lukes
  Reported by:	Henrique de Moraes Holschuh
  MFC after:	3 days

Changes:
  head/usr.sbin/cpucontrol/intel.c
Comment 10 commit-hook freebsd_committer freebsd_triage 2018-10-10 15:54:54 UTC
A commit references this bug:

Author: emaste
Date: Wed Oct 10 15:54:01 UTC 2018
New revision: 339287
URL: https://svnweb.freebsd.org/changeset/base/339287

Log:
  MFC r333569: cpucontrol: improve Intel microcode revision check

  According to the Intel SDM (Volme 3, 9.11.7) the BIOS signature MSR
  should be zeroed before executing cpuid (although in practice it does
  not seem to matter).

  PR:		192487
  Submitted by:	Dan Lukes
  Reported by:	Henrique de Moraes Holschuh

Changes:
_U  stable/11/
  stable/11/usr.sbin/cpucontrol/intel.c
Comment 11 Oleksandr Tymoshenko freebsd_committer freebsd_triage 2019-01-21 09:45:15 UTC
There is a commit referencing this PR, but it's still not closed and has been inactive for some time. Closing the PR as fixed but feel free to re-open it if the issue hasn't been completely resolved.

Thanks
Comment 12 moon300web 2022-03-27 10:24:21 UTC
ererf
Comment 13 moon300web 2022-03-27 10:37:11 UTC
I'm updating my laptop's microcode as follows
pkg install devcpu-data-intel
vi /etc/loader.conf
  cpu_microcode_load="YES"
  cpu_microcode_name="/boot/firmware/06-8e-0c"
(06-8e-0c downloaded from https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files/tree/main/intel-ucode)

This seems to work:
dmesg | grep -C 4 microcode
    VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
    TSC: P-state invariant, performance statistics
  real memory  = 8589934592 (8192 MB)
  avail memory = 7844245504 (7480 MB)
  CPU microcode: updated from 0xea to 0xec    <---- System runs BIOS version 1.75, version 0xea, which is updated
  Event timer "LAPIC" quality 600
  ACPI APIC Table: <LENOVO TP-N2I  >
  FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
  FreeBSD/SMP: 1 package(s) x 4 core(s) x 2 hardware threads

cpupdate -i 
  Found CPU(s) from Intel
  Core 0 to 7: CPUID: 806ec  Fam 06  Mod 8e  Step 0c  Flag 80 uCode 000000ec

If cpucontrol is run manually it seems to try and load an incorrect microcode file:
cpucontrol -vv -u /dev/cpuctl0 
  cpucontrol: found cpu type 0 family 0x6 model 0xe stepping 0xc.
  ..
  /usr/local/share/cpucontrol/06-86-05.01: updating cpu /dev/cpuctl0 from rev 0xec to rev 0xb00000f... failed.  <---- incorrect file!
  cpucontrol: ioctl(): File exists
  ..
  cpucontrol: skipping /usr/local/share/cpucontrol/06-8e-0c.94 of rev 0xea: up to date
  ..
  cpucontrol: skipping /usr/local/share/cpucontrol/06-8e-0c of rev 0xec: up to date

There still seems to be something wrong in the detection of Intel CPU revision on FreeBSD 13.0

Running:
FreeBSD 13.0-RELEASE-p8
on Lenovo T590
kernel: CPU: Intel(R) Core(TM) i5-8365U CPU @ 1.60GHz (1896.11-MHz K8-class CPU)
kernel:   Origin="GenuineIntel"  Id=0x806ec  Family=0x6  Model=0x8e  Stepping=12
Comment 14 Ed Maste freebsd_committer freebsd_triage 2022-03-28 20:13:18 UTC
(In reply to moon300web from comment #13)
Please submit a separate PR