Bug 227866 - Enabled IBRS causes hang on resume.
Summary: Enabled IBRS causes hang on resume.
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 11.1-RELEASE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Konstantin Belousov
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-04-30 09:25 UTC by Anonymized Account
Modified: 2020-05-06 05:52 UTC
CC List: 3 users

See Also:
eugen: maintainer-feedback+


Attachments
Kernel config. (7.06 KB, text/plain)
2018-04-30 10:23 UTC, Anonymized Account
dmesg.boot (60.06 KB, text/plain)
2018-04-30 10:30 UTC, Anonymized Account
introduce rcorder support for resume (620 bytes, patch)
2018-04-30 13:18 UTC, Eugene Grosbein

Description Anonymized Account freebsd_committer freebsd_triage 2018-04-30 09:25:24 UTC
I started getting a black screen and a tight loop, and sometimes a reboot, when resuming with cpupdate_enable="YES" on a machine that had been suspending and resuming fine for 4 years.

Simply running "cpupdate -uw" manually causes no problem.

Most likely this started happening after I updated in the middle of April, but it may have happened before that and I just shrugged it off as "meh, bad hardware".
Comment 1 Eugene Grosbein freebsd_committer freebsd_triage 2018-04-30 09:33:53 UTC
Please tell us which version of the port you have.
Then try adding cpupdate_irbs_enable="NO" to /etc/rc.conf, re-enable cpupdate_enable="YES", and see whether this fixes the problem.
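
A minimal sketch of the /etc/rc.conf state this test amounts to. The variable names are taken from this thread; note that the port at this point spells the knob cpupdate_irbs_enable, and it was renamed to cpupdate_ibrs_enable by the port commit recorded in comment 21.

```sh
# /etc/rc.conf fragment (sketch): keep the microcode update enabled,
# but tell the cpupdate rc script not to activate IBRS afterwards.
cpupdate_enable="YES"
cpupdate_irbs_enable="NO"   # spelled cpupdate_ibrs_enable in later port versions
```

With these settings, hw.ibrs_disable is expected to read 1 and hw.ibrs_active to read 0, as reported in comment 3.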
Comment 2 Anonymized Account freebsd_committer freebsd_triage 2018-04-30 09:38:01 UTC
I have g20180323, but I have now manually changed the distversions of cpupdate and cpm to the latest Git to see if the problem goes away, and it does not.

Yes, I think this started with commit r466680,

and I have these sysctls with the rc script enabled:

hw.ibrs_disable: 0
hw.ibrs_active: 1

and they are inverted when it is off (as expected?).

Will try your suggestion now.
Comment 3 Anonymized Account freebsd_committer freebsd_triage 2018-04-30 09:42:15 UTC
Yes, it works ok, and now I have

hw.ibrs_disable: 1
hw.ibrs_active: 0
Comment 4 Eugene Grosbein freebsd_committer freebsd_triage 2018-04-30 09:48:24 UTC
Activation of IBRS is expected after successful loading of CPU microcode with this utility, unless it is disabled with the cpupdate_irbs_enable knob.

Adding Konstantin Belousov, as he may be interested in this case of unstable FreeBSD suspend/resume when IBRS is enabled.

Michael, please describe your hardware.
Comment 5 Anonymized Account freebsd_committer freebsd_triage 2018-04-30 09:53:26 UTC
Lenovo B570e

hw.model: Intel(R) Celeron(R) CPU B800 @ 1.50GHz
hw.machine: amd64
hw.ncpu: 2
Comment 6 Eugene Grosbein freebsd_committer freebsd_triage 2018-04-30 09:54:45 UTC
And the exact FreeBSD version, including revision number.
Comment 7 Anonymized Account freebsd_committer freebsd_triage 2018-04-30 09:58:43 UTC
FreeBSD freebird 11.1-RELEASE-p9 FreeBSD 11.1-RELEASE-p9 #0: Tue Apr 17 01:02:31 CEST 2018     root@freebird:/usr/src/sys/amd64/compile/FREEBIRD  amd64
Comment 8 Eugene Grosbein freebsd_committer freebsd_triage 2018-04-30 10:12:49 UTC
Please attach your kernel config file and /var/run/dmesg.boot for verbose boot.
Comment 9 Anonymized Account freebsd_committer freebsd_triage 2018-04-30 10:23:16 UTC
Created attachment 192925 [details]
Kernel config.
Comment 10 Anonymized Account freebsd_committer freebsd_triage 2018-04-30 10:30:24 UTC
Created attachment 192926 [details]
dmesg.boot
Comment 11 Konstantin Belousov freebsd_committer freebsd_triage 2018-04-30 10:31:28 UTC
Try the patch from https://reviews.freebsd.org/D15236
Comment 12 Eugene Grosbein freebsd_committer freebsd_triage 2018-04-30 10:44:57 UTC
While testing the patch, don't forget to comment out cpupdate_irbs_enable and add "service cpupdate start" to your resume command sequence to reload the microcode and re-activate IBRS after resume.
Comment 13 Anonymized Account freebsd_committer freebsd_triage 2018-04-30 10:52:20 UTC
Do I have to add "service cpupdate restart" without the patch too? Why is this not documented in pkg-message.in?
Comment 14 Eugene Grosbein freebsd_committer freebsd_triage 2018-04-30 10:57:11 UTC
> Do I have to add "service cpupdate restart" without the patch too?

Yes. I'll add it to pkg-message with next update of the cpupdate.
Comment 15 Anonymized Account freebsd_committer freebsd_triage 2018-04-30 11:25:19 UTC
Also, the rc var is named cpupdate_irbs_enable whereas the name of the thing in question is IBRS, is that a typo?
Comment 16 Anonymized Account freebsd_committer freebsd_triage 2018-04-30 11:33:15 UTC
Still hangs with patch.
Comment 17 Eugene Grosbein freebsd_committer freebsd_triage 2018-04-30 11:34:49 UTC
(In reply to Michael Danilov from comment #15)

Yes, nice catch! I'll fix that too with next update.
Comment 18 Anonymized Account freebsd_committer freebsd_triage 2018-04-30 11:44:14 UTC
Is there a way to rebuild only the parts of the kernel that have changed? I'm kind of out of patience to wait an hour for kernel builds.
Comment 19 Eugene Grosbein freebsd_committer freebsd_triage 2018-04-30 11:52:30 UTC
Sure. 

Use "make NO_KERNELCLEAN=yes" if you have not updated the whole source tree since the last build. Use "make NO_KERNELDEPEND=yes" if you have not changed the kernel configuration file since the last build. Use "make MODULES_WITH_WORLD=yes" or "make NO_MODULES=yes" to skip building modules, but make sure you have copied /boot/kernel/*.ko to /boot/modules/ before installing a kernel built without modules, and keep MODULES_WITH_WORLD/NO_MODULES for installkernel too.

These knobs may be combined: make NO_KERNELCLEAN=yes NO_KERNELDEPEND=yes MODULES_WITH_WORLD=yes buildkernel && make MODULES_WITH_WORLD installkernel
Comment 20 Eugene Grosbein freebsd_committer freebsd_triage 2018-04-30 11:54:20 UTC
(In reply to Eugene Grosbein from comment #19)

Small correction, the last line should be:

make NO_KERNELCLEAN=yes NO_KERNELDEPEND=yes MODULES_WITH_WORLD=yes buildkernel && make MODULES_WITH_WORLD=yes installkernel
Comment 21 commit-hook freebsd_committer freebsd_triage 2018-04-30 12:33:54 UTC
A commit references this bug:

Author: eugen
Date: Mon Apr 30 12:33:05 UTC 2018
New revision: 468692
URL: https://svnweb.freebsd.org/changeset/ports/468692

Log:
  Minor updates to sysutils/cpupdate:

  - fix typo in cpupdate_ibrs_enable previously named cpupdate_irbs_enable;
  - catch up with upstream README.md update that does not state anymore
    that it is work in progress but mention it is for Intel only still;
  - catch up with platomav/CPUMicrocodes MCE DB r65 update for completeness
    despite it has only AMD updates comparing previous r64;
  - update pkg-message with note that suspend/resume sequence
    clears microcode update;
  - add new keyword "resume" to startup script to ease its invocation
    on resume by means of rcorder(8).

  PR:		227866
  Reported by:	Michael Danilov <mike.d.ft402@gmail.com>

Changes:
  head/sysutils/cpupdate/Makefile
  head/sysutils/cpupdate/distinfo
  head/sysutils/cpupdate/files/cpupdate.in
  head/sysutils/cpupdate/files/pkg-message.in
  head/sysutils/cpupdate/pkg-descr
Comment 22 Konstantin Belousov freebsd_committer freebsd_triage 2018-04-30 12:54:02 UTC
(In reply to Michael Danilov from comment #16)
I updated the review with Diff 41998.  Try it.
Comment 23 Eugene Grosbein freebsd_committer freebsd_triage 2018-04-30 13:18:27 UTC
Created attachment 192930 [details]
introduce rcorder support for resume

Please also test "full automation" for cpupdate's resume support:

- update ports tree and rebuild/reinstall sysutils/cpupdate to get version cpupdate-g20180324_1
- apply attached patch to the script /etc/rc.resume:

cd /etc
patch < /path/to/patch

- test suspend/resume, it should re-load microcode automatically.
Comment 24 Anonymized Account freebsd_committer freebsd_triage 2018-04-30 14:38:24 UTC
Patching file sys/x86/include/x86_var.h using Plan A...
Hunk #1 failed at 83.
1 out of 1 hunks failed--saving rejects to sys/x86/include/x86_var.h.rej

But I pasted in the needed declarations manually, going to compile.
Comment 25 Anonymized Account freebsd_committer freebsd_triage 2018-04-30 15:24:45 UTC
(In reply to Konstantin Belousov from comment #22)

Hi, tried it, still hanging.

Why shouldn't setting the sysctl hw.ibrs_disable before suspend work as a test?
Comment 26 Anonymized Account freebsd_committer freebsd_triage 2018-04-30 15:26:59 UTC
In fact, I have just tried a workaround by doing "sysctl hw.ibrs_disable=1" before suspend and =0 after, and it works.
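
A rough sketch of how this workaround could be scripted. The comment does not say how the sysctl was toggled, so calling these helpers from /etc/rc.suspend and /etc/rc.resume is an assumption, and the function names and the SYSCTL override are hypothetical and exist only for illustration.

```sh
#!/bin/sh
# Sketch of the hw.ibrs_disable workaround described above.
# SYSCTL is overridable so the helpers can be dry-run with a stub command.
SYSCTL="${SYSCTL:-sysctl}"

# Hypothetical helper: call before suspending
# (for example from /etc/rc.suspend) to turn IBRS off.
ibrs_before_suspend() {
    "$SYSCTL" hw.ibrs_disable=1
}

# Hypothetical helper: call after resume
# (for example from /etc/rc.resume) to allow IBRS again.
ibrs_after_resume() {
    "$SYSCTL" hw.ibrs_disable=0
}
```

Since resume clears the loaded microcode, re-enabling IBRS is only likely to take effect once the microcode has been reloaded, for example with "service cpupdate start" as suggested in comment 12.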
Comment 27 Konstantin Belousov freebsd_committer freebsd_triage 2018-04-30 18:08:18 UTC
(In reply to Michael Danilov from comment #26)
Try Diff 42009
Comment 28 Anonymized Account freebsd_committer freebsd_triage 2018-04-30 19:12:54 UTC
(In reply to Konstantin Belousov from comment #27)

Seems to work! \o/
Comment 29 Eugene Grosbein freebsd_committer freebsd_triage 2018-04-30 20:13:07 UTC
(In reply to Michael Danilov from comment #28)

Did you have a chance to test my patch for /etc/rc.resume, too?
Comment 30 commit-hook freebsd_committer freebsd_triage 2018-04-30 20:19:36 UTC
A commit references this bug:

Author: kib
Date: Mon Apr 30 20:18:32 UTC 2018
New revision: 333125
URL: https://svnweb.freebsd.org/changeset/base/333125

Log:
  Turn off IBRS on suspend.

  Resume starts CPU from the init state, which clears any loaded
  microcode updates.  As result, IBRS MSRs are no longer available,
  until the microcode is reloaded.

  I have to forcibly clear cpu_stdext_feature3, which assumes that CPUID
  leaf 7 reg %ebx does not report anything except Meltdown/Spectre bugs
  bits.  If future CPUs add new bits there, hw_ibrs_recalculate() and
  identify_cpu1()/identify_cpu2() need to be adjusted for that.

  Submitted and tested by:	Michael Danilov <mike.d.ft402@gmail.com>
  PR:	227866
  Sponsored by:	The FreeBSD Foundation
  MFC after:	1 week
  Differential revision:	https://reviews.freebsd.org/D15236

Changes:
  head/sys/x86/acpica/acpi_wakeup.c
  head/sys/x86/include/x86_var.h
Comment 31 Anonymized Account freebsd_committer freebsd_triage 2018-04-30 20:28:06 UTC
(In reply to Eugene Grosbein from comment #29)

Ah yes, sorry, I thought I had replied.

Yes, it seems to work (at least it does not print any errors).
Comment 32 Anonymized Account freebsd_committer freebsd_triage 2018-05-19 11:28:53 UTC
11.1-RELEASE-p10 still hangs and I cannot compile it with the patch.
Comment 33 Anonymized Account freebsd_committer freebsd_triage 2018-05-19 11:34:36 UTC
Sorry, it built on the second attempt somehow.
Comment 34 Anonymized Account freebsd_committer freebsd_triage 2018-05-19 12:00:02 UTC
...yeah right, as I expected. Now it hangs even with the patch.

TBH I'm losing any motivation to use this OS. Just to keep it running has taken way more of my time than I ever meant since switching to it around 5 years ago. I wanna use my tools, not fight them with "temporary" crutches that I then have to locally maintain forever!

And if using it on just two laptops is so much trouble, I think I can imagine why businesses are bailing out. Of course server bugs are different but the general treatment is the same. For example, I doubt the panic on disconnecting a mounted drive being written to is ever going away.
Comment 35 Anonymized Account freebsd_committer freebsd_triage 2018-05-19 12:27:22 UTC
Setting cpupdate_ibrs_enable="NO" still works...
Comment 36 Ed Maste freebsd_committer freebsd_triage 2018-05-19 14:42:34 UTC
> Now it hangs even with the patch.

Sorry about that. Over the next day or two, fixes should go in and be merged to stable branches, followed by errata updates a little later.  I'll give it a test on head, stable/11, and the 11.2 snapshot, and let you know when the tree is ready for testing if you are so inclined. Once a patch or patches for 11.1-REL are ready, they'll be tested and shared too.
Comment 37 Ed Maste freebsd_committer freebsd_triage 2018-05-19 14:48:32 UTC
(In reply to Ed Maste from comment #36)
Ah, I see this report is specifically for the issue in 11.1; we'll take a close look at all of the changes and make sure they're merged as appropriate.
Comment 38 commit-hook freebsd_committer freebsd_triage 2018-11-26 13:23:59 UTC
A commit references this bug:

Author: eugen
Date: Mon Nov 26 13:23:11 UTC 2018
New revision: 340965
URL: https://svnweb.freebsd.org/changeset/base/340965

Log:
  MFC r339818: rcorder(8)

    Add support for /etc/rc.resume, so it calls
    "rcorder -k resume" and runs scripts containing "KEYWORD: resume"
    with single "resume" argument.

    Working example is the port sysutils/cpupdate that defines
    extra_commands="resume" to reload CPU microcode cleared
    by suspend/resume sequence.

    This change does nothing for a system having no scripts with
    KEYWORD: resume.

  PR:			227866
  Differential Revision:	https://reviews.freebsd.org/D15247

Changes:
_U  stable/12/
  stable/12/libexec/rc/rc.resume
  stable/12/sbin/rcorder/rcorder.8
  stable/12/share/man/man8/rc.8
  stable/12/usr.sbin/acpi/acpiconf/acpiconf.8
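
The mechanism this commit log describes can be sketched roughly as below. This is not the code committed to rc.resume; it only illustrates the idea of selecting scripts with "rcorder -k resume" and running each with the single argument "resume". The function name and the RCORDER override are hypothetical.

```sh
#!/bin/sh
# Sketch only: run every rc script tagged "KEYWORD: resume".
# RCORDER is overridable so the loop can be exercised without rcorder(8).
RCORDER="${RCORDER:-rcorder}"

# Hypothetical helper. Arguments: the rc.d script files to consider.
run_resume_scripts() {
    # "rcorder -k resume" lists, in dependency order, only the files
    # that carry the "resume" keyword.
    for script in $("$RCORDER" -k resume "$@" 2>/dev/null); do
        # Each selected script gets the single argument "resume".
        "$script" resume
    done
}
```

A call such as run_resume_scripts /etc/rc.d/* /usr/local/etc/rc.d/* would then reach the sysutils/cpupdate script, which per the log defines extra_commands="resume". On a system with no such scripts the loop body never runs, which matches the log's note that the change does nothing there.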
Comment 39 commit-hook freebsd_committer freebsd_triage 2018-11-26 13:30:06 UTC
A commit references this bug:

Author: eugen
Date: Mon Nov 26 13:30:01 UTC 2018
New revision: 340966
URL: https://svnweb.freebsd.org/changeset/base/340966

Log:
  MFC r339818: rcorder(8):

    Add support for /etc/rc.resume, so it calls
    "rcorder -k resume" and runs scripts containing "KEYWORD: resume"
    with single "resume" argument.

    Working example is the port sysutils/cpupdate that defines
    extra_commands="resume" to reload CPU microcode cleared
    by suspend/resume sequence.

    This change does nothing for a system having no scripts with
    KEYWORD: resume.

  PR:			227866
  Differential Revision:	https://reviews.freebsd.org/D15247

Changes:
_U  stable/11/
  stable/11/etc/rc.resume
  stable/11/sbin/rcorder/rcorder.8
  stable/11/share/man/man8/rc.8
  stable/11/usr.sbin/acpi/acpiconf/acpiconf.8
Comment 40 commit-hook freebsd_committer freebsd_triage 2018-11-26 13:37:14 UTC
A commit references this bug:

Author: eugen
Date: Mon Nov 26 13:36:31 UTC 2018
New revision: 340967
URL: https://svnweb.freebsd.org/changeset/base/340967

Log:
  MFC r339818: rcorder(8):

    Add support for /etc/rc.resume, so it calls
    "rcorder -k resume" and runs scripts containing "KEYWORD: resume"
    with single "resume" argument.

    Working example is the port sysutils/cpupdate that defines
    extra_commands="resume" to reload CPU microcode cleared
    by suspend/resume sequence.

    This change does nothing for a system having no scripts with
    KEYWORD: resume.

  PR:			227866
  Differential Revision:	https://reviews.freebsd.org/D15247

Changes:
_U  stable/10/
  stable/10/etc/rc.resume
  stable/10/sbin/rcorder/rcorder.8
  stable/10/share/man/man8/rc.8
  stable/10/usr.sbin/acpi/acpiconf/acpiconf.8
Comment 41 Eugene Grosbein freebsd_committer freebsd_triage 2018-11-26 13:45:37 UTC
Michael, is this problem solved now?
Comment 42 Anonymized Account freebsd_committer freebsd_triage 2018-11-26 13:57:36 UTC
Thanks, I will try copying the changes to my 11.2-RELEASE machine to see if it works.
Comment 43 Eugene Grosbein freebsd_committer freebsd_triage 2020-05-06 05:52:26 UTC
Believed to be fixed in all supported branches.