Similar to the issues described in https://forums.freebsd.org/threads/36761/ and https://forums.freebsd.org/threads/47409/, I am unable to boot an 11.0-RELEASE guest on KVM unless I disable some CPU features in KVM. Specifically, I have to set -cpu Opteron_G3 instead of -cpu Opteron_G5 or -cpu host in order to boot the VM. If needed, I can provide access to such a VM for testing and debugging.
The important points from the forum threads are that the following also help with the problem:
- setting hw.use_xsave=0
- upgrading the host system
So this looks like a possible bug in the XSAVE implementation in older KVM code.
Also forgot to mention: The same VM runs 10.3 without problems with both -cpu Opteron_G5 and -cpu host in place.
I can confirm this boot failure on a CentOS 7 host running both FreeBSD-11.0-RC3-amd64 and pfSense-CE-2.4.0-DEVELOPMENT-amd64-latest (FreeBSD 11) as guests. Changing the CPU type to Westmere allows the VMs to boot and provides support for the AES-NI instructions provided by the host CPU, as described in: https://forums.freebsd.org/threads/36761/#post-204537
The host CPU is an AMD A8-7670K.
pfSense 2.3.2 (FreeBSD 10.3) boots fine either as Opteron_G5 or via the host CPU copy facility built into virt-manager.
Adding "hw.use_xsave=0" to /boot/loader.conf.local makes no difference.
It seems that PRs https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213333 and https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=214242 are similar or possibly identical to this issue. However, all three PRs just hang there as 'New'.
It's a bit difficult to follow, as it appears there are several different issues reported here; specifically, one of the forum links is for FreeBSD 9 and one is marked as solved. Can you please add detail beyond "unable to boot"?
- does the loader run correctly?
- does the kernel panic? hang? is there any output at all?
- if you perform a verbose boot from the loader, do you get any more information?
Created attachment 181655 [details] Screenshot of hung boot of FreeBSD-11.0-RELEASE-amd64.qcow2 as it appears when CPU is set to "copy host CPU configuration", or to Opteron G5. Boot failure of downloadable KVM image FreeBSD-11.0-RELEASE-amd64.qcow2
Comment on attachment 181655 [details] Screenshot of hung boot of FreeBSD-11.0-RELEASE-amd64.qcow2 as it appears when CPU is set to "copy host CPU configuration", or to Opteron G5. Adding boot_verbose="YES" to /boot/loader.conf makes no difference. The VM hangs at the point shown, and the progress spinner is static as pictured.
(In reply to Ed Maste from comment #5) Hi Ed, I am sorry if I was not clear enough in my wording. What I was trying to say was:
- There have been (possibly related) problems with FreeBSD as a KVM guest on recent Opterons before.
- Examples of that are the forum links mentioned.
- At least 10.3-RELEASE does not show any problems when running as a KVM guest on recent Opterons.
- Starting with 11.0, however, it does not work anymore.
- Thus, I chose the phrase 'possible kernel regression'.
Thanks to Bennett for adding the screenshot; this is the exact same behavior I also see.
Is it possible you can invoke QEMU with the -s option, and then attach gdb to it to see what instruction the FreeBSD kernel is executing? (e.g., attach gdb and run "info registers")
(In reply to Ed Maste from comment #9) I attached gdb from host system and got this output:
(gdb) target remote :1234
Remote debugging using :1234
0x0000cb70 in ?? ()
(gdb) info registers
eax 0xf78b 63371
ecx 0xcb66 52070
edx 0xdf00 57088
ebx 0xdf00 57088
esp 0xf7a0 0xf7a0
ebp 0x0 0x0
esi 0xf860 63584
edi 0x0 0
eip 0xcb70 0xcb70
eflags 0x6 [ PF ]
cs 0xf000 61440
ss 0xdf00 57088
ds 0xdf00 57088
es 0xdf00 57088
fs 0x0 0
gs 0x0 0
Also, (I don't know if that is helpful), this is the output of qemu-integrated monitor:
# info registers
RAX=0000000000000015 RBX=000000004b4d564b RCX=00000000c0011005 RDX=0000000000001022
RSI=ffffffff816af46d RDI=f000e987f000fea5 RBP=ffffffff81a210b0 RSP=ffffffff800bff80
R8 =0000000000000001 R9 =0000000000000000 R10=ffffffff816ad760 R11=0000000000000000
R12=0000000000004000 R13=ffffffff81e23f90 R14=00000000021b0000 R15=ffffffff821a8000
RIP=ffffffff80f842f6 RFL=00010046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0028 0000000000000000 ffffffff 00a09300 DPL=0 DS [-WA]
CS =0020 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0028 0000000000000000 ffffffff 00a09300 DPL=0 DS [-WA]
DS =0028 0000000000000000 ffffffff 00a09300 DPL=0 DS [-WA]
FS =0028 0000000000000000 ffffffff 00a09300 DPL=0 DS [-WA]
GS =0028 0000000000000000 ffffffff 00a09300 DPL=0 DS [-WA]
LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
TR =0038 0000000000005f98 00002067 00008b00 DPL=0 TSS64-busy
GDT= ffffffff81e2a7d0 00000067
IDT= ffffffff81d7d540 00000fff
CR0=80000011 CR2=0000000000000000 CR3=0000000000058000 CR4=00000020
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000500
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
XMM08=00000000000000000000000000000000 XMM09=00000000000000000000000000000000
XMM10=00000000000000000000000000000000 XMM11=00000000000000000000000000000000
XMM12=00000000000000000000000000000000 XMM13=00000000000000000000000000000000
XMM14=00000000000000000000000000000000 XMM15=00000000000000000000000000000000
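As a reading aid for the QEMU monitor dump above, here is a minimal Python sketch that pulls a few registers out of such a dump and decodes the architecturally defined bits. The helper names (parse_registers, set_bits, CR4_BITS, EFER_BITS) are made up for this illustration. Reading RCX as an MSR index is only an assumption: the dump does not show which instruction sits at RIP. If it were a rdmsr or wrmsr, RCX=0xc0011005 would select an AMD-specific MSR in the 0xC001_xxxx range.

```python
import re

# Architecturally defined bits, limited to the ones relevant to this dump.
CR4_BITS = {5: "PAE", 9: "OSFXSR", 10: "OSXMMEXCPT", 18: "OSXSAVE"}
EFER_BITS = {0: "SCE", 8: "LME", 10: "LMA", 11: "NXE"}


def parse_registers(dump):
    """Map register names to integers for every NAME=hex token in the dump."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]*)\s*=\s*([0-9a-fA-F]+)\b", dump)
    return {name: int(value, 16) for name, value in pairs}


def set_bits(value, names):
    """Return the labels of all known bits that are set in value."""
    return [label for bit, label in sorted(names.items()) if (value >> bit) & 1]


# A few lines copied from the monitor output in the comment above.
SAMPLE = """
RAX=0000000000000015 RBX=000000004b4d564b RCX=00000000c0011005 RDX=0000000000001022
CR0=80000011 CR2=0000000000000000 CR3=0000000000058000 CR4=00000020
EFER=0000000000000500
"""

regs = parse_registers(SAMPLE)
print("CR4 :", set_bits(regs["CR4"], CR4_BITS))
print("EFER:", set_bits(regs["EFER"], EFER_BITS))
print("RCX : %#x" % regs["RCX"])
print("RBX as ASCII:", regs["RBX"].to_bytes(4, "little").decode("ascii"))
```

On this sample, CR4 has only PAE set, so OSXSAVE is not yet enabled at the point of the hang, and EFER shows long mode active (LME and LMA). RBX holds the ASCII bytes "KVMK", which matches the start of KVM's CPUID hypervisor signature.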
This is what I get for each of the two processes when set to Opteron G5:
(gdb) info registers
rax 0xfffffffffffffdfc -516
rbx 0x563d88e5e0b0 94822289760432
rcx 0x7f484c6ecdfd 139948496702973
rdx 0x29d 669
rsi 0xa 10
rdi 0x563d88e33700 94822289585920
rbp 0x7fff05e27bc4 0x7fff05e27bc4
rsp 0x7fff05e27bb0 0x7fff05e27bb0
r8 0x0 0
r9 0x0 0
r10 0x1 1
r11 0x293 659
r12 0x29d 669
r13 0x563d88238040 94822277021760
r14 0x563d87678f16 94822264704790
r15 0x563d88237e00 94822277021184
rip 0x7f484c6ecdfd 0x7f484c6ecdfd <poll+45>
eflags 0x293 [ CF AF SF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb) info registers
rax 0xfffffffffffffffc -4
rbx 0x563d89346000 94822294904832
rcx 0x7f484c6ee507 139948496708871
rdx 0x0 0
rsi 0xae80 44672
rdi 0x13 19
rbp 0x7f4856f20000 0x7f4856f20000
rsp 0x7f4841765a98 0x7f4841765a98
r8 0x0 0
r9 0xae7 2791
r10 0x8 8
r11 0x246 582
r12 0x563d8766c680 94822264653440
r13 0x563d89346000 94822294904832
r14 0x7f4841766700 139948312651520
r15 0x563d89346110 94822294905104
rip 0x7f484c6ee507 0x7f484c6ee507 <ioctl+7>
eflags 0x246 [ PF ZF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
Hi Ed, is there anything else I can do to help get this fixed?
(In reply to Bennett from comment #11) Hi Bennett, which two processes did you examine with GDB? It looks like these are the result of running GDB on the host qemu process?
(In reply to Martin Waschbüsch from comment #2) I'm trying to find a convenient way to debug this further, however while I look into that, are you able to try booting FreeBSD 11.1 and FreeBSD 12 snapshots in the same environment?
(In reply to Ed Maste from comment #14) Sure, Ed. I will do that asap and post results here.
Ed, unfortunately, both 11.1-BETA3 and 12.0-CURRENT-amd64-20170626-r320360 showed the same behavior as the current stable 11.0-RELEASE.
(In reply to Ed Maste from comment #14) Hi Ed, just checking if there is anything else I can do to help diagnose this. As mentioned by mail, I am able to provide access to a KVM-based virtual machine that shows this behavior (the hypervisor runs on a dual Opteron 6378 box):
- KVM console on the host (QEMU monitor)
- NoVNC console for the VM itself
- Ability to choose the ISO image to boot from
Please let me know if this helps.
Thx, Martin
PS: I upped the version to 11.1-RELEASE as that still shows the described problem.
It seems that 10.4-RELEASE is also affected.
(In reply to Martin Waschbüsch from comment #18) I can confirm that FreeBSD 10.4-release (and FreeBSD 10.4-stable) is also affected. For reference, a FreeBSD 10.3-release guest boots fine with
virt-install --connect qemu:///system --name kvm1 --memory 512 --cpu host --vcpus 1 --cdrom /home/tingo/dl/bsd/fbsd/10.3/FreeBSD-10.3-RELEASE-amd64-disc1.iso --os-variant=freebsd10.3 --disk size=6 --virt-type=kvm --network=default --console pty,target_type=virtio
For a FreeBSD 10.4 guest, I need --cpu kvm64 or --cpu Opteron_G3 to get it to boot. If I use --cpu host, --cpu Opteron_G5 or --cpu Opteron_G4, it just hangs after loading the kernel, just as the screenshot shows.
The host machine is running Debian 9.1:
tingo@kg-vm4:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 9.1 (stretch)
Release: 9.1
Codename: stretch
tingo@kg-vm4:~$ uname -a
Linux kg-vm4 4.9.0-3-amd64 #1 SMP Debian 4.9.30-2+deb9u5 (2017-09-19) x86_64 GNU/Linux
The host CPU is an A10-6700T:
tingo@kg-vm4:~$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: AuthenticAMD
CPU family: 21
Model: 19
Model name: AMD A10-6700T APU with Radeon(tm) HD Graphics
Stepping: 1
CPU MHz: 1400.000
CPU max MHz: 2500.0000
CPU min MHz: 1400.0000
BogoMIPS: 4990.55
Virtualization: AMD-V
L1d cache: 16K
L1i cache: 64K
L2 cache: 2048K
NUMA node0 CPU(s): 0-3
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb cpb hw_pstate vmmcall bmi1 arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
FWIW, setting hw.use_xsave=0 in the guest before booting doesn't help; it still hangs. Also, if I install a FreeBSD 10.3-release guest and upgrade it to FreeBSD 10.4-stable (source, make world procedure), it hangs too (unsurprising, but still).
I don't know if this is of any interest, but I want to help track down this issue. I started enabling and disabling several features in libvirt's machine config to find out when it starts to hang. Here are some observations worth mentioning:
This works:
<cpu mode='custom' match='exact' check='partial'>
  <model fallback='allow'>Opteron_G3</model>
  <feature policy='optional' name='aes'/>
  <feature policy='optional' name='pclmuldq'/>
  <feature policy='optional' name='fma4'/>
  <feature policy='optional' name='avx'/>
  <feature policy='optional' name='ssse3'/>
  <feature policy='optional' name='sse4.2'/>
  <feature policy='optional' name='xop'/>
  <feature policy='optional' name='f16c'/>
  <feature policy='optional' name='pdpe1gb'/>
  <feature policy='optional' name='fma'/>
  <feature policy='optional' name='tbm'/>
  <feature policy='optional' name='sse4.1'/>
  <feature policy='optional' name='3dnowprefetch'/>
</cpu>
This does not work:
<cpu mode='custom' match='exact' check='partial'>
  <model fallback='allow'>Opteron_G3</model>
  <feature policy='optional' name='aes'/>
  <feature policy='optional' name='pclmuldq'/>
  <feature policy='optional' name='fma4'/>
  <feature policy='optional' name='avx'/>
  <feature policy='optional' name='ssse3'/>
  <feature policy='optional' name='sse4.2'/>
  <feature policy='optional' name='xop'/>
  <feature policy='optional' name='f16c'/>
  <feature policy='optional' name='pdpe1gb'/>
  <feature policy='optional' name='fma'/>
  <feature policy='optional' name='tbm'/>
  <feature policy='optional' name='sse4.1'/>
  <feature policy='optional' name='3dnowprefetch'/>
  <feature policy='optional' name='xsave'/>
</cpu>
All these features are the differences between Opteron_G3 and Opteron_G5. Oddly enough, enabling xsave gives a kernel panic.
panic: CPU0 does not support X87 or SSE: 0
(See screenshot attached. Note this is HardenedBSD, but it also affects FBSD 11.1-RELEASE.)
So I started to build up my config, starting with model=qemu64 and enabling all features listed for Opteron_G5 in /usr/share/libvirt/cpu_map.xml:
<cpu mode='custom' match='exact' check='partial'>
  <model fallback='allow'>qemu64</model>
  <feature policy='optional' name='3dnowprefetch'/>
  <feature policy='optional' name='abm'/>
  <feature policy='optional' name='aes'/>
  <feature policy='optional' name='apic'/>
  <feature policy='optional' name='avx'/>
  <feature policy='optional' name='clflush'/>
  <feature policy='optional' name='cmov'/>
  <feature policy='optional' name='cx16'/>
  <feature policy='optional' name='cx8'/>
  <feature policy='optional' name='de'/>
  <feature policy='optional' name='f16c'/>
  <feature policy='optional' name='fma'/>
  <feature policy='optional' name='fma4'/>
  <feature policy='optional' name='fpu'/>
  <feature policy='optional' name='fxsr'/>
  <feature policy='optional' name='lahf_lm'/>
  <feature policy='optional' name='lm'/>
  <feature policy='optional' name='mca'/>
  <feature policy='optional' name='mce'/>
  <feature policy='optional' name='misalignsse'/>
  <feature policy='optional' name='mmx'/>
  <feature policy='optional' name='msr'/>
  <feature policy='optional' name='mtrr'/>
  <feature policy='optional' name='nx'/>
  <feature policy='optional' name='pae'/>
  <feature policy='optional' name='pat'/>
  <feature policy='optional' name='pclmuldq'/>
  <feature policy='optional' name='pdpe1gb'/>
  <feature policy='optional' name='pge'/>
  <feature policy='optional' name='pni'/>
  <feature policy='optional' name='popcnt'/>
  <feature policy='optional' name='pse'/>
  <feature policy='optional' name='pse36'/>
  <feature policy='optional' name='rdtscp'/>
  <feature policy='optional' name='sep'/>
  <feature policy='optional' name='sse'/>
  <feature policy='optional' name='sse2'/>
  <feature policy='optional' name='sse4.1'/>
  <feature policy='optional' name='sse4.2'/>
  <feature policy='optional' name='sse4a'/>
  <feature policy='optional' name='ssse3'/>
  <feature policy='optional' name='svm'/>
  <feature policy='optional' name='syscall'/>
  <feature policy='optional' name='tbm'/>
  <feature policy='optional' name='tsc'/>
  <feature policy='optional' name='xop'/>
  <feature policy='optional' name='xsave'/>
</cpu>
/var/run/dmesg.boot:
CPU: QEMU Virtual CPU version 2.5+ (2300.30-MHz K8-class CPU)
  Origin="AuthenticAMD" Id=0x6d3 Family=0x6 Model=0xd Stepping=3
  Features=0x783fbfd<FPU,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2>
  Features2=0xbeb83203<SSE3,PCLMULQDQ,SSSE3,FMA,CX16,SSE4.1,SSE4.2,x2APIC,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,HV>
  AMD Features=0x24100800<SYSCALL,NX,Page1GB,LM>
  AMD Features2=0x2109e1<LAHF,ABM,SSE4A,MAS,Prefetch,XOP,FMA4,TBM>
I hope this helps someone.
Oliver
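To cross-check the Features2 line from dmesg.boot above, the hex mask can be decoded by hand. The following Python sketch does that for CPUID leaf 1, ECX. The function name decode_features2 and the table CPUID2_NAMES are made up for this illustration, and the table lists only the bits needed here (any other set bit would show up as "bitN").

```python
# Names for CPUID leaf 1, ECX bits, spelled the way the dmesg line prints them.
# Only the bits needed for this example are listed.
CPUID2_NAMES = {
    0: "SSE3", 1: "PCLMULQDQ", 9: "SSSE3", 12: "FMA", 13: "CX16",
    19: "SSE4.1", 20: "SSE4.2", 21: "x2APIC", 23: "POPCNT", 25: "AESNI",
    26: "XSAVE", 27: "OSXSAVE", 28: "AVX", 29: "F16C", 31: "HV",
}


def decode_features2(mask):
    """Return the names of all bits set in a 32-bit Features2 mask."""
    return [CPUID2_NAMES.get(bit, "bit%d" % bit)
            for bit in range(32) if (mask >> bit) & 1]


features = decode_features2(0xBEB83203)
print(",".join(features))
print("hypervisor bit set:", "HV" in features)
print("XSAVE enabled by the OS:", "OSXSAVE" in features)
```

The decoded list matches the names in the dmesg line. Both XSAVE and OSXSAVE are set, so in the qemu64-based configuration the guest kernel booted with XSAVE in use, and the HV bit shows that the guest can see it is running under a hypervisor.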
Created attachment 187097 [details] Screenshot of kernel panic when enabling xsave
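The two Opteron_G3 configurations in comment #20 are long enough that the difference is easy to miss by eye. A minimal Python sketch for comparing two libvirt <cpu> elements follows. The helper names (feature_names, build_cpu_xml) are made up for this illustration, and the XML is rebuilt from the feature lists quoted in that comment rather than read from a real domain definition.

```python
import xml.etree.ElementTree as ET


def feature_names(xml_text):
    """Return the set of <feature name=...> values in a libvirt <cpu> element."""
    return {f.get("name") for f in ET.fromstring(xml_text).iter("feature")}


def build_cpu_xml(features):
    """Rebuild a <cpu> element like the ones quoted in comment #20."""
    body = "".join("<feature policy='optional' name='%s'/>" % n for n in features)
    return ("<cpu mode='custom' match='exact' check='partial'>"
            "<model fallback='allow'>Opteron_G3</model>%s</cpu>" % body)


# Features present in both configurations from comment #20.
COMMON = ["aes", "pclmuldq", "fma4", "avx", "ssse3", "sse4.2", "xop",
          "f16c", "pdpe1gb", "fma", "tbm", "sse4.1", "3dnowprefetch"]

working = build_cpu_xml(COMMON)
failing = build_cpu_xml(COMMON + ["xsave"])

print(sorted(feature_names(failing) - feature_names(working)))
```

The only feature in the failing configuration that is absent from the working one is xsave, which is the feature named in the panic screenshot.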
(In reply to Oliver Böttcher from comment #20) Sorry, I lost track of my own comment. I forgot to mention that the configuration with model=qemu64 and all features enabled booted.
(In reply to Oliver Böttcher from comment #22) Hi Oliver, thank you for adding these details. When you say that "enabling all features and model=qemu64 booted", does that include xsave? Also, do you happen to know what other differences, apart from cpu flags, there are between --model=qemu64 and --model=Opteron_G5? Thanks! Martin
(In reply to Martin Waschbüsch from comment #23)
> When you say that "enabling all features and model=qemu64 booted", does that include xsave?
Yes. The long snippet exposes all features of Opteron_G5, including xsave. The only difference I see is that the CPU is reported as QEMU Virtual CPU version 2.5+. My wild guess is that FreeBSD applies some adjustments based on the reported CPU model, such as workarounds for bugs, which are necessary on bare metal but crash in virtualized environments that do not have these bugs.
Thank you all for continuing to chase this down. I expect that using QEMU64 as the CPU type and adding the additional features is the best stopgap fix until this bug is actually resolved. I have been reading through the Linux kernel mailing list as they work out the patches for Meltdown and Spectre, and I suspect that misidentifying the CPU as an Intel Westmere device to gain AES support will cause problems once the new mitigation code goes live. There still might be a performance hit for FreeBSD guests with this solution in the future, as the QEMU64 CPU exposed by KVM will probably not show the new MSRs of a microcode-patched Opteron 4 or Opteron 5 (K8) CPU. This will cause the guest to default to the expensive "retpoline" mitigation for Spectre, rather than the more efficient microcode-patched, AMD-specific "MFence" mitigation that is in the works.
Created attachment 191361 [details] Kernel patch to fix regression. I've bisected the kernel regression and tracked it to r297857: https://svnweb.freebsd.org/base?view=revision&revision=297857 I've also attached a patch that "fixes" the regression. I'm sure this isn't the proper fix, but at least now I can upgrade from FreeBSD 10.3. With this patch, the 10.4 kernel now boots just fine (it did not before). I am currently building the 11.1 release from source with this patch and will report back when done.
FreeBSD 11.1 also boots fine with this patch.
Created attachment 191445 [details] updated patch Could you please test this slightly updated patch?
(In reply to Andriy Gapon from comment #28) Actually, I think that checking against VM_GUEST_NONE is better than checking for CPUID2_HV, so please disregard comment #28.
A commit references this bug:
Author: avg
Date: Mon Mar 12 11:28:10 UTC 2018
New revision: 330793
URL: https://svnweb.freebsd.org/changeset/base/330793
Log:
fix r297857, do not modify CPU extension bits under virtual machines
r297857 was meant for real hardware only.
PR: 213155
Submitted by: mainland@apeiron.net
MFC after: 1 week
Changes:
head/sys/x86/x86/identcpu.c
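The actual change lives in sys/x86/x86/identcpu.c and is not reproduced here. The following Python model only illustrates the difference between the two guards discussed in the comments above: testing the CPUID hypervisor bit (CPUID2_HV) versus testing the kernel's detected guest type against VM_GUEST_NONE. The VmGuest enum and both function names are hypothetical stand-ins. The sketch also assumes that guest detection can draw on more than the CPUID bit, which would be one reason to prefer the second guard.

```python
from enum import Enum

CPUID2_HV = 0x80000000  # CPUID leaf 1, ECX bit 31: running under a hypervisor


class VmGuest(Enum):
    """Hypothetical stand-in for the kernel's detected guest type."""
    NONE = 0   # bare metal
    KVM = 1    # hypervisor identified via CPUID
    OTHER = 2  # assumption: hypervisor identified by some other means


def may_modify_by_hv_bit(cpu_feature2):
    """Guard 1: touch the CPU extension bits only if the HV bit is clear."""
    return (cpu_feature2 & CPUID2_HV) == 0


def may_modify_by_guest_type(vm_guest):
    """Guard 2: touch the CPU extension bits only on bare metal."""
    return vm_guest is VmGuest.NONE


# KVM guest from the dmesg output earlier in this report: both guards refuse.
print(may_modify_by_hv_bit(0xBEB83203), may_modify_by_guest_type(VmGuest.KVM))

# Hypothetical hypervisor that hides the HV bit: only guard 2 refuses.
hidden = 0xBEB83203 & ~CPUID2_HV
print(may_modify_by_hv_bit(hidden), may_modify_by_guest_type(VmGuest.OTHER))
```

Under these assumptions the two guards agree for the KVM guest in this report and differ only when a hypervisor does not advertise itself through the CPUID bit.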
*** Bug 213333 has been marked as a duplicate of this bug. ***
*** Bug 214242 has been marked as a duplicate of this bug. ***
Thank you all for identifying and fixing this!
Yes, this is fantastic! Thank you for tracking down the regression and drafting a patch. CPU consistency between host and VM, and correct identification, are likely to become increasingly important as the Meltdown/Spectre fixes make their way into the kernel. Under Linux, some of the mitigation strategies are dependent upon correct CPU identification, and microcode patching and capabilities flags may be passed up to the virtual guest.
A commit references this bug:
Author: avg
Date: Wed Mar 21 15:09:42 UTC 2018
New revision: 331303
URL: https://svnweb.freebsd.org/changeset/base/331303
Log:
MFC r330793: fix r297857, do not modify CPU extension bits under virtual machines
PR: 213155
Changes:
_U stable/11/
stable/11/sys/x86/x86/identcpu.c
A commit references this bug:
Author: avg
Date: Wed Mar 21 15:13:48 UTC 2018
New revision: 331305
URL: https://svnweb.freebsd.org/changeset/base/331305
Log:
MFC r330793: fix r297857, do not modify CPU extension bits under virtual machines
PR: 213155
Changes:
_U stable/10/
stable/10/sys/x86/x86/identcpu.c