If I attempt to start a Linux guest on a FreeBSD 12.0-CURRENT host I get a kernel panic similar to:

panic: Unregistered use of FPU in kernel
cpuid = 3
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe085a31c030
vpanic() at vpanic+0x182/frame 0xfffffe085a31c0b0
kassert_panic() at kassert_panic+0x126/frame 0xfffffe085a31c120
trap() at trap+0x7ae/frame 0xfffffe085a31c330
calltrap() at calltrap+0x8/frame 0xfffffe085a31c330
--- trap 0x16, rip = 0xffffffff827273a9, rsp = 0xfffffe085a31c408, rbp = 0xfffffe085a31c430 ---
null_bug_bypass() at 0xffffffff827273a9/frame 0xfffffe085a31c430
null_bug_bypass() at 0xffffffff826985c7/frame 0x3
KDB: enter: panic

if the VM is configured with more than one processor. I've seen this with both CentOS 7 and Ubuntu 12 guests.

The panic appears to occur near the start of the guest kernel boot, after grub has run and shortly after the kernel message about TSC calibration is printed.

The symbols printed by DDB leading up to the trap appear to be somewhat arbitrary. The location of the trap seems to be above the topmost BSS section symbol in one of the (last?) loaded .kmod.
The code at the location that triggers the trap is:

   0xffffffff8272739d: nop
   0xffffffff8272739e: nop
   0xffffffff8272739f: nop
   0xffffffff827273a0: mov %rsi,%rdx
   0xffffffff827273a3: shr $0x20,%rdx
   0xffffffff827273a7: mov %esi,%eax
=> 0xffffffff827273a9: xrstor (%rdi)
   0xffffffff827273ac: retq
   0xffffffff827273ad: int3
   0xffffffff827273ae: int3
   0xffffffff827273af: int3
   0xffffffff827273b0: int3

It is called from here:

   0xffffffff82667489: test %eax,%eax
   0xffffffff8266748b: jne 0xffffffff826674a1
   0xffffffff8266748d: movq $0x3,0x5238(%r15)
   0xffffffff82667498: mov %rbx,%rsi
   0xffffffff8266749b: and $0xfffffffffffffffc,%rsi
   0xffffffff8266749f: je 0xffffffff826674ad
   0xffffffff826674a1: mov 0x5240(%r15),%rdi
   0xffffffff826674a8: callq 0xffffffff827273a0
=> 0xffffffff826674ad: or %rbx,0x5238(%r15)
   0xffffffff826674b4: mov %r14d,%eax
   0xffffffff826674b7: add $0x8,%rsp

kgdb (from ports) doesn't believe that either of these belongs to any function.

The VMs where I first saw the problem were initially created with Virtualbox 4 and the paravirtualization setting is "Legacy", but I can reproduce this panic after creating a new VM which uses the "Default" setting, increasing the number of processors to 4, and booting the CentOS 7 install .iso.
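For readers following the disassembly: the three instructions before xrstor look like the usual way of passing a 64-bit component mask in EDX:EAX, and the caller's `and $0xfffffffffffffffc` clears the two lowest mask bits (x87 and SSE) before deciding whether to make the call. A minimal C sketch of that arithmetic follows. The struct and function names are hypothetical and are not taken from the VirtualBox source.

```c
#include <stdint.h>

/* Hypothetical illustration only; these names do not exist in VirtualBox. */
struct xsave_mask_halves {
    uint32_t eax;   /* low 32 bits of the component mask  */
    uint32_t edx;   /* high 32 bits of the component mask */
};

/*
 * Mirrors:  mov %rsi,%rdx ; shr $0x20,%rdx ; mov %esi,%eax
 * The 64-bit mask arrives in %rsi and is split into EDX:EAX for xrstor.
 */
static struct xsave_mask_halves
split_xsave_mask(uint64_t mask)
{
    struct xsave_mask_halves h;

    h.edx = (uint32_t)(mask >> 32);
    h.eax = (uint32_t)mask;
    return h;
}

/*
 * Mirrors the caller's:  and $0xfffffffffffffffc,%rsi
 * Bits 0 and 1 (x87 and SSE state) are cleared; a zero result means
 * only those two components were requested.
 */
static uint64_t
drop_x87_sse_bits(uint64_t mask)
{
    return mask & ~UINT64_C(3);
}
```

With a mask of 0x7 (x87, SSE, AVX), drop_x87_sse_bits() leaves 0x4, so under this reading the call into the xrstor routine is only made when something beyond x87/SSE, such as AVX, is enabled.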
The CPU info is:

CPU: AMD FX-8320E Eight-Core Processor (3210.84-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x600f20  Family=0x15  Model=0x2  Stepping=0
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x3e98320b<SSE3,PCLMULQDQ,MON,SSSE3,FMA,CX16,SSE4.1,SSE4.2,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C>
  AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
  AMD Features2=0x1ebbfff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS,XOP,SKINIT,WDT,LWP,FMA4,TCE,NodeId,TBM,Topology,PCXC,PNXC>
  Structured Extended Features=0x8<BMI1>
  SVM: NP,NRIP,VClean,AFlush,DAssist,NAsids=65536
  TSC: P-state invariant, performance statistics

Whether or not this problem occurs with Intel CPUs is unknown.

This problem did not occur before the upgrade from Virtualbox 4 to Virtualbox 5.
I was unable to reproduce this with the CentOS 7 .iso on:

FreeBSD 10.3-STABLE #11 r303852: Mon Aug 8 13:59:38 PDT 2016
    dl@hoover:/usr/obj/usr/src/sys/GENERIC amd64
FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
CPU: AMD FX(tm)-4100 Quad-Core Processor (3624.26-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x600f12  Family=0x15  Model=0x1  Stepping=2
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x1e98220b<SSE3,PCLMULQDQ,MON,SSSE3,CX16,SSE4.1,SSE4.2,POPCNT,AESNI,XSAVE,OSXSAVE,AVX>
  AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
  AMD Features2=0x1c9bfff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS,XOP,SKINIT,WDT,LWP,FMA4,NodeId,Topology,PCXC,PNXC>
  SVM: NP,NRIP,VClean,AFlush,DAssist,NAsids=65536
  TSC: P-state invariant, performance statistics

I'll try to set up a test on FreeBSD 11.0-BETA 4, but it will take a while.
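To see how the two hosts differ, the two Features2 words quoted in this report can be compared bit by bit. The sketch below does that in C using the architectural CPUID.1:ECX bit positions. The macro and function names are made up for this example and are not taken from any FreeBSD header.

```c
#include <stdint.h>

/* CPUID.1:ECX feature bits; the macro names are local to this example. */
#define ECX_FMA      (UINT32_C(1) << 12)
#define ECX_XSAVE    (UINT32_C(1) << 26)
#define ECX_OSXSAVE  (UINT32_C(1) << 27)
#define ECX_AVX      (UINT32_C(1) << 28)
#define ECX_F16C     (UINT32_C(1) << 29)

/* Feature bits that are set in `first` but not in `second`. */
static uint32_t
features_only_in_first(uint32_t first, uint32_t second)
{
    return first & ~second;
}

/* Nonzero if every bit in `wanted` is set in `features`. */
static int
has_all(uint32_t features, uint32_t wanted)
{
    return (features & wanted) == wanted;
}
```

Fed with 0x3e98320b (FX-8320E) and 0x1e98220b (FX-4100), the only bits unique to the FX-8320E are FMA and F16C, and both hosts advertise XSAVE, OSXSAVE, and AVX. That agrees with the feature names printed in the two listings.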
Correction. I can't reproduce this problem with the 10.3-STABLE GENERIC kernel, but I can if I enable INVARIANTS.
Setting paravirtualization to "Minimal" does not solve the problem.
FYI, this function is ASMXRstor(): https://www.virtualbox.org/browser/vbox/trunk/src/VBox/Runtime/common/asm/ASMXRstor.asm
It's been used since r55312: https://www.virtualbox.org/changeset/55312
Enabled for AMD-V in r55316: https://www.virtualbox.org/changeset/55316
Can you please confirm whether disabling AMD-V works around the issue?
If I try to disable the AMD-V setting, Virtualbox complains "Invalid Settings Detected". If I disable virtualization in the BIOS, Virtualbox only seems to understand 32-bit guests.

I hadn't gone spelunking in the .kmod source because the stack traces fooled me into thinking that the code was not part of the .kmod.

ASMXRstor() is called from CPUMSetGuestXcr0(PVMCPU pVCpu, uint64_t uNewValue) here: <https://www.virtualbox.org/browser/vbox/trunk/src/VBox/VMM/VMMAll/CPUMAllRegs.cpp>. Perhaps it just needs added calls to fpu_kern_enter() and fpu_kern_leave(). Interestingly, I don't see any calls to ASMXSave().
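As a rough illustration of the bracketing idea suggested above, here is a toy userland model in C. It is not kernel code and not a patch: every name is invented for the example, and the real fpu_kern_enter() and fpu_kern_leave() take more arguments than shown here. It only models the invariant the panic message points at, namely that FPU state should be touched only between a matching enter and leave.

```c
#include <stdbool.h>

/* Toy model only; none of these are real kernel or VirtualBox symbols. */
struct model_thread {
    bool kernel_fpu_registered;   /* true between enter and leave */
};

/* Stand-in for registering kernel FPU use for this thread. */
static void
model_fpu_kern_enter(struct model_thread *td)
{
    td->kernel_fpu_registered = true;
}

/* Stand-in for ending the registered section. */
static void
model_fpu_kern_leave(struct model_thread *td)
{
    td->kernel_fpu_registered = false;
}

/*
 * Stand-in for an FPU-touching instruction such as xrstor.
 * Returns 0 when the use was registered, -1 for the unregistered case
 * that this report sees as a panic.
 */
static int
model_xrstor(const struct model_thread *td)
{
    return td->kernel_fpu_registered ? 0 : -1;
}

/* What the current call path looks like in this model. */
static int
unbracketed_restore(struct model_thread *td)
{
    return model_xrstor(td);
}

/* The suggested shape: enter, touch FPU state, leave. */
static int
bracketed_restore(struct model_thread *td)
{
    int rc;

    model_fpu_kern_enter(td);
    rc = model_xrstor(td);
    model_fpu_kern_leave(td);
    return rc;
}
```

In this model unbracketed_restore() reports the unregistered case and bracketed_restore() succeeds. Whether the same bracketing is sufficient or even appropriate inside the VirtualBox module is an open question in this thread.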
That code has been present in Virtualbox for a while, but it is not present in version 4.3.38, which was the latest version of our port until the recent upgrade to 5.0.26. The system panics started after that upgrade.
A commit references this bug:

Author: jkim
Date: Sat Aug 13 04:05:35 UTC 2016
New revision: 420152
URL: https://svnweb.freebsd.org/changeset/ports/420152

Log:
  Temporarily disable AVX support for guest. It is unstable for FreeBSD.

  PR: 211651

Changes:
  head/emulators/virtualbox-ose/Makefile
  head/emulators/virtualbox-ose/files/patch-src_VBox_VMM_VMMR3_CPUMR3CpuId.cpp
My existing CentOS 7 VM won't boot with this change. It gets most of the way through boot, but then (when X is starting?) I get a white screen with a frowny face that says something went wrong and I should contact my administrator ;-(

I rebuilt virtualbox without the patch and CentOS 7 boots, though I have to restrict it to one processor, otherwise the host will panic.

I'll try creating a new CentOS 7 guest with the patched Virtualbox.

My existing CentOS 5 VM booted normally with the patch.
It looks like something related to Xorg is the culprit. I was able to reproduce this with a new CentOS 7 VM if I install GNOME. Possibly some component is assuming that it can use AVX and dies when it can't. I've seen that screen before on FreeBSD if gdm isn't able to start a session.
The workaround in r420152 has the undesirable side-effect of also disabling AVX and AVX2 extensions, which clang will use by default even on integer code. Since that workaround was committed, VBox has been updated from 5.0.26 to 5.2.2. Has anyone done any investigation to determine whether this bug is still present?
We have version 5.2.4. Does the problem still exist with the recent version?
I just built VirtualBox 5.2.6 after removing files/patch-src_VBox_VMM_VMMR3_CPUMR3CpuId.cpp. I was able to start CentOS 7 and Ubuntu 12 guests. It looks like the problem is fixed upstream and we no longer require the patch.
A commit references this bug:

Author: jkim
Date: Tue Jan 23 17:30:50 UTC 2018
New revision: 459789
URL: https://svnweb.freebsd.org/changeset/ports/459789

Log:
  Re-enable AVX/AVX2 support for guest. This patch is no longer necessary
  according to the original reporter.

  PR: 211651

Changes:
  head/emulators/virtualbox-ose/Makefile
  head/emulators/virtualbox-ose/files/patch-src_VBox_VMM_VMMR3_CPUMR3CpuId.cpp