Bug 259878 - [hyper-v] Kernel hangs at boot after printing Hyper-V features
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.0-STABLE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: Mark Johnston
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-11-16 11:55 UTC by Thomas Eberhardt
Modified: 2021-11-22 13:48 UTC
Description Thomas Eberhardt 2021-11-16 11:55:13 UTC
After commit a2ca269b3810 ("hyperv: Register hyperv_timecounter later during boot"), my Gen2 Hyper-V VM with 6 CPUs (on Windows 11 Pro 22000.318) hangs at boot after printing the Hyper-V features but before printing the CPU-ID features, and it consumes 100% of a single CPU core.

After looking at the commit comments and code I changed

SYSINIT(hyperv_tc_init, SI_SUB_DRIVERS, SI_ORDER_FIRST, hyperv_tc_init, NULL);

to

SYSINIT(hyperv_tc_init, SI_SUB_LOCK + 1, SI_ORDER_FIRST, hyperv_tc_init, NULL);

in sys/dev/hyperv/vmbus/hyperv.c.

I'm not a kernel programmer, so I don't know whether this is the right thing to do, but this change fixed the problem for me.
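
The effect of moving the SYSINIT can be illustrated with a small userland model of sysinit ordering. This is only a sketch, not kernel code: it assumes that sysinits run sorted by subsystem ID and then by order value. The IDs 0x1a40000 (where hyperv_init runs) and 0x2100000 (where cpu_startup runs) appear in the verbose boot log later in this report; the values for SI_SUB_LOCK and SI_SUB_DRIVERS are assumptions recalled from sys/sys/kernel.h, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Subsystem IDs. 0x1a40000 and 0x2100000 are visible in the boot log in this
 * report; the other two are ASSUMED values recalled from sys/sys/kernel.h. */
enum {
    SI_SUB_HYPERVISOR = 0x1a40000,
    SI_SUB_LOCK       = 0x1b00000, /* assumed */
    SI_SUB_CPU        = 0x2100000,
    SI_SUB_DRIVERS    = 0x3100000  /* assumed */
};

/* Hypothetical stand-in for a sysinit table entry. */
struct sysinit_entry {
    const char *name;
    unsigned subsystem;
    unsigned order; /* 0 is a placeholder for SI_ORDER_FIRST */
};

/* Sort by subsystem first, then by order within a subsystem. */
static int sysinit_cmp(const void *a, const void *b)
{
    const struct sysinit_entry *x = a, *y = b;

    if (x->subsystem != y->subsystem)
        return (x->subsystem < y->subsystem) ? -1 : 1;
    if (x->order != y->order)
        return (x->order < y->order) ? -1 : 1;
    return 0;
}

/* Returns 1 if hyperv_tc_init, registered at tc_subsystem, would run
 * before cpu_startup, and 0 otherwise. */
static int tc_registered_before_cpu_startup(unsigned tc_subsystem)
{
    struct sysinit_entry table[] = {
        { "cpu_startup",    SI_SUB_CPU,        0 },
        { "hyperv_tc_init", tc_subsystem,      0 },
        { "hyperv_init",    SI_SUB_HYPERVISOR, 0 },
    };
    size_t n = sizeof(table) / sizeof(table[0]);

    /* A tie with cpu_startup would make the relative order ambiguous. */
    assert(tc_subsystem != SI_SUB_CPU);

    qsort(table, n, sizeof(table[0]), sysinit_cmp);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].name, "hyperv_tc_init") == 0)
            return 1;
        if (strcmp(table[i].name, "cpu_startup") == 0)
            return 0;
    }
    return 0;
}
```

Under these assumptions, tc_registered_before_cpu_startup(SI_SUB_DRIVERS) returns 0, while SI_SUB_LOCK + 1 and SI_SUB_HYPERVISOR both return 1. That matches the observation that the hang appears only when the timecounter is registered as late as SI_SUB_DRIVERS.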
Comment 1 Mark Johnston freebsd_committer freebsd_triage 2021-11-16 13:37:12 UTC
Thanks for the report.  This should have been fixed by this commit on stable/13: https://cgit.freebsd.org/src/commit/?id=a2ca269b38105a250cb1273290ccc6a4b200388d

Can you please try updating to this commit or later?
Comment 2 Thomas Eberhardt 2021-11-16 14:05:31 UTC
(In reply to Mark Johnston from comment #1)

That is the commit I was referencing. The boot hangs started with it.
Comment 3 Mark Johnston freebsd_committer freebsd_triage 2021-11-16 14:13:01 UTC
(In reply to Thomas Eberhardt from comment #2)
Sorry, I've had some coffee now.

Could you please try booting with debug.verbose_sysinit=1 set from the loader, and show the last few lines of output before the hang?  I would also like to see the CPU feature flags.
Comment 4 Thomas Eberhardt 2021-11-16 15:43:56 UTC
(In reply to Mark Johnston from comment #3)
Ok. I built a full debug stable/13 GENERIC kernel and learned how to add a serial console to a Hyper-V VM to capture the boot log.

boot log:
GDB: no debug ports present
KDB: debugger backends: ddb
KDB: current backend: ddb
---<<BOOT>>---
Copyright (c) 1992-2021 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 13.0-STABLE #0 stable/13-n248070-7d95b0f32832-dirty: Tue Nov 16 16:18:10 CET 2021
    root@jones.ocp.lan:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
FreeBSD clang version 12.0.1 (git@github.com:llvm/llvm-project.git llvmorg-12.0.1-0-gfed41342a82f)
WARNING: WITNESS option enabled, expect reduced performance.
subsystem ffffff
   parse_acpi_tables(0)... SRAT: Ignoring memory at addr 0x188000000
SRAT: Ignoring memory at addr 0x1000000000
SRAT: Ignoring memory at addr 0x10000000000
SRAT: Ignoring memory at addr 0x20000000000
[lots of output deleted]
   sd_mkdir_show_add(0)... done.
   sd_mkdir_list_show_add(0)... done.
   sd_allocdirect_show_add(0)... done.
   sd_allocindir_show_add(0)... done.
   ffs_show_add(0)... done.
   witness_show_add(0)... done.
   badstacks_show_add(0)... done.
   vpath_show_add(0)... done.
subsystem 2100000
   cpu_startup(0)...

and there it hangs. The only thing dirty in this build is the GENERIC kernel config with the added debug options. I also tried booting the VM with only 1 CPU instead of 6, but it hangs at the same place.
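
For longer captures, the last sysinit that started but never finished can be found mechanically. The following is a minimal sketch that assumes the line format seen in the log above: a sysinit prints "name(N)..." when it starts and "done." when it returns, possibly on a later line. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the index of the last line on which a sysinit started
 * ("name(N)...") without a later "done.", or -1 if every sysinit
 * that started also finished. */
static int last_unfinished_sysinit(const char *const *lines, int n)
{
    int pending = -1;

    assert(lines != NULL || n == 0);

    for (int i = 0; i < n; i++) {
        /* A new sysinit begins on this line. */
        if (strstr(lines[i], ")...") != NULL)
            pending = i;
        /* "done." closes the pending sysinit; it may share the line
         * with the start marker or follow several lines later. */
        if (pending >= 0 && strstr(lines[i], "done.") != NULL)
            pending = -1;
    }
    return pending;
}
```

Applied to the tail of the log above, this would point at the cpu_startup(0)... line, since no "done." follows it.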


CPU-ID output from a build with my patch:

CPU: Intel(R) Core(TM) i7-9700T CPU @ 2.00GHz (1992.01-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x906ed  Family=0x6  Model=0x9e  Stepping=13
  Features=0x1f83fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2,SS,HTT>
  Features2=0xfeda3203<SSE3,PCLMULQDQ,SSSE3,FMA,CX16,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND,HV>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x121<LAHF,ABM,Prefetch>
  Structured Extended Features=0x9c2fb9<FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,NFPUSG,RDSEED,ADX,SMAP,CLFLUSHOPT>
  Structured Extended Features3=0xbc000400<MD_CLEAR,IBPB,STIBP,L1DFL,ARCH_CAP,SSBD>
  XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
  IA32_ARCH_CAPS=0xa7<RDCL_NO,IBRS_ALL,RSBA,MDS_NO,TSX_CTRL>
Comment 5 Thomas Eberhardt 2021-11-16 15:52:41 UTC
(In reply to Thomas Eberhardt from comment #4)
Perhaps I should also add the hyperv_init output:

[...]
subsystem 1a40000
   xen_hvm_sysinit(0)... done.
   hyperv_init(0)... Hyper-V Version: 10.0.22000 [SP0]
  Features=0x2e7f<VPRUNTIME,TMREFCNT,SYNIC,SYNTM,APIC,HYPERCALL,VPINDEX,REFTSC,IDLE,TMFREQ>
  PM Features=0x20 [C2]
  Features3=0xe0bed7b2<DEBUG,XMMHC,IDLE,NUMA,TMFREQ,SYNCMC,CRASH,NPIEP>
done.
subsystem 1ac0000
[...]
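
The Features=0x2e7f value above can be decoded by hand to confirm which capabilities the hypervisor advertises, in particular TMREFCNT, which presumably is the reference counter behind the MSR-based timecounter. The sketch below assumes a bit-to-name mapping recalled from the kernel's format string; it is cross-checked only against the names printed in the log line above. The names for bits 7, 8, and 12 (RESET, STATS, DEBUG) do not appear in this report and are assumptions, as are the table and function names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* ASSUMED bit-to-name mapping for the Hyper-V "Features" word.
 * Bits without an entry are left unnamed and skipped. */
static const char *const hv_feature_names[16] = {
    [0] = "VPRUNTIME", [1] = "TMREFCNT", [2] = "SYNIC", [3] = "SYNTM",
    [4] = "APIC", [5] = "HYPERCALL", [6] = "VPINDEX",
    [7] = "RESET",  /* assumed, not shown in the report */
    [8] = "STATS",  /* assumed, not shown in the report */
    [9] = "REFTSC", [10] = "IDLE", [11] = "TMFREQ",
    [12] = "DEBUG", /* assumed, not shown in the report */
};

/* Writes a comma-separated list of the named bits set in mask into buf. */
static void hv_decode_features(uint32_t mask, char *buf, size_t buflen)
{
    size_t used = 0;

    assert(buf != NULL && buflen > 0);
    buf[0] = '\0';

    for (unsigned bit = 0; bit < 16; bit++) {
        const char *name = hv_feature_names[bit];

        if ((mask & (UINT32_C(1) << bit)) == 0 || name == NULL)
            continue;
        int w = snprintf(buf + used, buflen - used, "%s%s",
            used ? "," : "", name);
        if (w < 0 || (size_t)w >= buflen - used)
            break; /* output truncated; stop decoding */
        used += (size_t)w;
    }
}
```

With this mapping, decoding 0x2e7f reproduces the list printed by the kernel; bit 13 is set in the value but has no name in the table, so it is skipped.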
Comment 6 Mark Johnston freebsd_committer freebsd_triage 2021-11-16 20:47:44 UTC
Thanks.  It seems DELAY() is broken.  Please try this patch: https://reviews.freebsd.org/D33014
Comment 7 Thomas Eberhardt 2021-11-16 21:48:57 UTC
(In reply to Mark Johnston from comment #6)
The kernel boots with the patch from D33014. Thanks.
Comment 8 commit-hook freebsd_committer freebsd_triage 2021-11-19 22:34:51 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=ed6a9452be01c1b7805d0a7311211b8cf381a9dd

commit ed6a9452be01c1b7805d0a7311211b8cf381a9dd
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2021-11-19 22:30:05 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-11-19 22:30:05 +0000

    hyperv: Register the MSR-based timecounter during SI_SUB_HYPERVISOR

    This reverts commit 9ef7df022a46 ("hyperv: Register hyperv_timecounter
    later during boot") and adds a comment explaining why the timecounter
    needs to be registered as early as it is.

    PR:             259878
    Fixes:  9ef7df022a46 ("hyperv: Register hyperv_timecounter later during boot")
    Reviewed by:    kib
    MFC after:      3 days
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D33014

 sys/dev/hyperv/vmbus/hyperv.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)
Comment 9 commit-hook freebsd_committer freebsd_triage 2021-11-19 22:34:52 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=3339950117bedb5f880f6c08982dcc5dd43f9c34

commit 3339950117bedb5f880f6c08982dcc5dd43f9c34
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2021-11-19 22:29:28 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-11-19 22:29:28 +0000

    timecounter: Initialize tc_lock earlier

    Hyper-V wants to register its MSR-based timecounter during
    SI_SUB_HYPERVISOR, before SI_SUB_LOCK, since an emulated 8254 may not be
    available for DELAY().  So we cannot use MTX_SYSINIT to initialize the
    timecounter lock.

    PR:             259878
    Reviewed by:    kib
    MFC after:      3 days
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D33014

 sys/kern/kern_tc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
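
The commit message above ties the hang to DELAY() having no usable time source. As a rough userland model (not the kernel's DELAY() implementation), a busy-wait delay polls a counter until enough ticks have passed. If the counter never advances, which is assumed here to be what a missing emulated 8254 looks like, the loop never exits; that would be consistent with the reported 100% use of one core. All names below are hypothetical, and the poll limit exists only so that the model terminates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t (*counter_fn)(void *ctx);

/* Busy-waits until `ticks` counter ticks have elapsed.
 * Returns 0 on success, or -1 if the counter did not advance far
 * enough within max_polls reads. A real busy-wait has no such
 * limit and would simply spin forever. */
static int model_delay(counter_fn read_counter, void *ctx, uint64_t ticks,
    uint64_t max_polls)
{
    assert(read_counter != NULL);

    uint64_t start = read_counter(ctx);
    for (uint64_t polls = 0; polls < max_polls; polls++) {
        if (read_counter(ctx) - start >= ticks)
            return 0;
    }
    return -1;
}

/* A working time source: advances by one on every read. */
static uint64_t ticking_counter(void *ctx)
{
    uint64_t *c = ctx;

    return (*c)++;
}

/* A broken time source: always reads the same value. */
static uint64_t stuck_counter(void *ctx)
{
    (void)ctx;
    return 0;
}
```

In this model, a delay driven by ticking_counter returns 0, while the same delay driven by stuck_counter exhausts its poll budget and returns -1. Registering the Hyper-V timecounter during SI_SUB_HYPERVISOR presumably gives DELAY() a working source before cpu_startup needs one.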
Comment 10 commit-hook freebsd_committer freebsd_triage 2021-11-22 13:46:33 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=686b143f37c501c79c0ddbbcb55ce852cc0bc846

commit 686b143f37c501c79c0ddbbcb55ce852cc0bc846
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2021-11-19 22:29:28 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-11-22 13:44:49 +0000

    timecounter: Initialize tc_lock earlier

    Hyper-V wants to register its MSR-based timecounter during
    SI_SUB_HYPERVISOR, before SI_SUB_LOCK, since an emulated 8254 may not be
    available for DELAY().  So we cannot use MTX_SYSINIT to initialize the
    timecounter lock.

    PR:             259878
    Reviewed by:    kib
    Sponsored by:   The FreeBSD Foundation

    (cherry picked from commit 3339950117bedb5f880f6c08982dcc5dd43f9c34)

 sys/kern/kern_tc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
Comment 11 commit-hook freebsd_committer freebsd_triage 2021-11-22 13:46:34 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=200d4732020742eb17f51b6e1c553cfac878a559

commit 200d4732020742eb17f51b6e1c553cfac878a559
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2021-11-19 22:30:05 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-11-22 13:45:34 +0000

    hyperv: Register the MSR-based timecounter during SI_SUB_HYPERVISOR

    This reverts commit 9ef7df022a46 ("hyperv: Register hyperv_timecounter
    later during boot") and adds a comment explaining why the timecounter
    needs to be registered as early as it is.

    PR:             259878
    Fixes:  9ef7df022a46 ("hyperv: Register hyperv_timecounter later during boot")
    Reviewed by:    kib
    Sponsored by:   The FreeBSD Foundation

    (cherry picked from commit ed6a9452be01c1b7805d0a7311211b8cf381a9dd)

 sys/dev/hyperv/vmbus/hyperv.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)
Comment 12 Mark Johnston freebsd_committer freebsd_triage 2021-11-22 13:48:42 UTC
Thank you for the report and for bisecting down to a single change.