Bug 270460

Summary: i386 CURRENT guest boot time panics in emulators/virtualbox-ose on amd64 host (FreeBSD 13.1, Windows 10, FreeBSD 13.2-RELEASE-p1)
Product: Base System Reporter: Paul Floyd <pjfloyd>
Component: kern Assignee: freebsd-bugs (Nobody) <bugs>
Status: Open ---    
Severity: Affects Only Me CC: grahamperrin, kib, linimon
Priority: --- Keywords: crash, needs-qa
Version: CURRENT   
Hardware: i386   
OS: Any   
Attachments:
Description Flags
Update to latest sourceware package none

Description Paul Floyd 2023-03-26 07:52:01 UTC
I have no problem booting other i386 versions in a VBox VM.
Host is amd64 running FreeBSD 13.1.

I don't remember when this first started, something like 6 months ago.

Because of this problem I am unable to do any testing of Valgrind on 14.0 i386. 

Here is the start of the panics from the VBox VM serial output.

smist: found supported isa bridge Intel PIIX4 ISA bridge
panic: td 0x1d94840 stack 0x2424ee8 not in kstack VA 0x2420000 4
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper(0,1d94840,4,2424000,9,...) at db_trace_self_wrapper+0x28/frame 0x2424e9c
vpanic(1491c57,2424ed8,2424ed8,2424f9c,1415b35,...) at vpanic+0xf4/frame 0x2424eb8
panic(1491c57,1d94840,2424ee8,2420000,4,...) at panic+0x14/frame 0x2424ecc
trap(2424fa8,0,0,0,0,...) at trap+0x975/frame 0x2424f9c
calltrap() at 0xffc0321f/frame 0x2424f9c
--- trap 0x9, eip = 0xa02, esp = 0xffe, ebp = 0 ---
KDB: enter: panic
panic: td 0x1d94840 stack 0x2424d90 not in kstack VA 0x2420000 4
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper(0,1d94840,4,2424000,3,...) at db_trace_self_wrapper+0x28/frame 0x2424d44
vpanic(1491c57,2424d80,2424d80,2424e48,1415b35,...) at vpanic+0xf4/frame 0x2424d60
panic(1491c57,1d94840,2424d90,2420000,4,...) at panic+0x14/frame 0x2424d74
trap(2424e54,8,28,28,100,...) at trap+0x975/frame 0x2424e48
calltrap() at 0xffc0321f/frame 0x2424e48
--- trap 0x3, eip = 0x1042444, esp = 0x2424e94, ebp = 0x2424e94 ---
kdb_enter(1527748,1527748) at kdb_enter+0x34/frame 0x2424e94
vpanic(1491c57,2424ed8,2424ed8,2424f9c,1415b35,...) at vpanic+0x11f/frame 0x2424eb8
panic(1491c57,1d94840,2424ee8,2420000,4,...) at panic+0x14/frame 0x2424ecc
trap(2424fa8,0,0,0,0,...) at trap+0x975/frame 0x2424f9c
calltrap() at 0xffc0321f/frame 0x2424f9c
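
One way to read the panic message is as a failed bounds check on a stack address. The following is a minimal Python sketch of that arithmetic, not the kernel's code. It assumes the trailing `4` is the number of kernel stack pages and that the i386 page size is 4096 bytes; the name `in_kstack` is hypothetical.

```python
PAGE_SIZE = 4096  # assumed i386 page size


def in_kstack(addr, kstack_base, kstack_pages, page_size=PAGE_SIZE):
    """Return True if addr lies in [kstack_base, kstack_base + pages * page_size)."""
    return kstack_base <= addr < kstack_base + kstack_pages * page_size


# Values copied from the first panic line above.
stack = 0x2424EE8
base = 0x2420000
pages = 4  # assumption: the last number in the message is a page count

top = base + pages * PAGE_SIZE
print(hex(top))                       # upper bound of the assumed stack range
print(in_kstack(stack, base, pages))  # whether the reported address is inside it
print(hex(stack - top))               # distance past the upper bound
```

Under these assumptions the range ends at 0x2424000, so the reported address 0x2424ee8 lies 0xee8 bytes past it, which would be consistent with the check failing.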
Comment 1 Graham Perrin freebsd_committer freebsd_triage 2023-03-26 12:02:45 UTC
(In reply to Paul Floyd from comment #0)

> I have no problem booting other i386 versions in a VBox VM. …

Which version of FreeBSD 14.0-CURRENT is problematic? 

uname -aKU

(Can you run that if you start the guest in safe mode?)

Are guest additions installed?

Also, please, for the host: 

pkg iinfo virtualbox-ose
Comment 2 Paul Floyd 2023-03-26 12:47:54 UTC
Roughly all versions for the last 6 months.

uname - difficult to get from a boot time kernel panic.

Booting a VBox guest from

FreeBSD-14.0-CURRENT-i386-20230323-b5d43972e394-261711-disc1.iso


causes the panic within 10 seconds or so of the beastie boot screen.

Booting from the pre-built VM image also panics.

paulf> pkg iinfo virtualbox-ose
virtualbox-ose-6.1.36
virtualbox-ose-kmod-6.1.36

Safe mode makes no difference, and I've tried changing VBox options (memory, video, USB, etc.). No difference.
Comment 3 Paul Floyd 2023-05-19 07:59:22 UTC
Nobody trying to reproduce this issue?

Since I rely on VBox for Valgrind testing, I'll remove i386 as a primary target if this isn't fixed before the release of FreeBSD 14.
Comment 4 Paul Floyd 2023-05-19 08:16:16 UTC
I get the same problem with a Windows 10 VBox host (same hardware).
Comment 5 Paul Floyd 2023-07-09 16:53:36 UTC
I've tried numerous disk images, the latest being


FreeBSD-14.0-CURRENT-i386-20230706-884eaacd24bd-263985-disc1.iso
Comment 6 Graham Perrin freebsd_committer freebsd_triage 2023-07-09 16:58:12 UTC
paulf today at <https://matrix.to/#/!viERHxvEzJvUugNxtg:libera.chat/$WgT9HA66UTrXgELGvYplR2e_5tCjuZ3thStuRzreZb8?via=libera.chat&via=matrix.org>: 

> … upgraded to 13.2 p1 today
Comment 7 Graham Perrin freebsd_committer freebsd_triage 2023-07-09 18:57:02 UTC
i386 guest, amd64 host: 

FreeBSD freebsd 14.0-CURRENT FreeBSD 14.0-CURRENT #0 main-n263985-884eaacd24bd: Thu Jul  6 07:55:21 UTC 2023     root@releng1.nyi.freebsd.org:/usr/obj/usr/src/i386.i386/sys/GENERIC i386 1400093 1400093

- panic-free.

Origin: 

- FreeBSD-14.0-CURRENT-i386.vhd

- from FreeBSD-14.0-CURRENT-i386.vhd.xz at 
  <https://download.freebsd.org/snapshots/VM-IMAGES/14.0-CURRENT/i386/Latest/>.

Host, FreeBSD 14.0-CURRENT: 

% pkg iinfo virtualbox
virtualbox-ose-6.1.44_3
virtualbox-ose-kmod-6.1.44
% uname -aKU
FreeBSD mowa219-gjp4-8570p-freebsd 14.0-CURRENT FreeBSD 14.0-CURRENT #0 main-n263900-cd9da8d072e4-dirty: Sat Jul  1 23:55:17 BST 2023     grahamperrin@mowa219-gjp4-8570p-freebsd:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64 1400092 1400092
%
Comment 8 Paul Floyd 2023-07-09 20:05:20 UTC
host lscpu

Model name:              Intel(R) Xeon(R) CPU           W3520  @ 2.67GHz

Flags:                   fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 cflsh ds acpi mmx fxsr sse sse2 ss htt tm pbe sse3 dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt syscall nx rdtscp lm lahf_lm

lscpu on 13.1 i386 running on vbox


Model name:              Intel(R) Xeon(R) CPU           W3520  @ 2.67GHz

Flags:                   fpu vme de pse tsc msr mce cx8 apic sep mtrr pge mca cmov pat pse36 cflsh mmx fxsr sse sse2 htt sse3 ssse3 cx16 sse4_1 sse4_2 popcnt rdtscp lahf_lm
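
To make the two flag lists easier to compare, here is a small Python sketch that computes the set difference. The flag strings are copied from the two lscpu outputs above; the helper name `flag_diff` is hypothetical, and the result only shows which flags VirtualBox hides from the guest, not which one (if any) is related to the panic.

```python
# Flag lists copied from the host and guest lscpu outputs above.
HOST_FLAGS = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "cflsh ds acpi mmx fxsr sse sse2 ss htt tm pbe sse3 dtes64 monitor ds_cpl "
    "vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt syscall nx "
    "rdtscp lm lahf_lm"
)
GUEST_FLAGS = (
    "fpu vme de pse tsc msr mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "cflsh mmx fxsr sse sse2 htt sse3 ssse3 cx16 sse4_1 sse4_2 popcnt rdtscp "
    "lahf_lm"
)


def flag_diff(host, guest):
    """Return (flags only on host, flags only on guest), each sorted."""
    host_set, guest_set = set(host.split()), set(guest.split())
    return sorted(host_set - guest_set), sorted(guest_set - host_set)


host_only, guest_only = flag_diff(HOST_FLAGS, GUEST_FLAGS)
print("host only: ", " ".join(host_only))
print("guest only:", " ".join(guest_only))
```

With these lists, every guest flag also appears on the host; the host-only flags include pae, vmx, nx and lm.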
Comment 9 Graham Perrin freebsd_committer freebsd_triage 2023-07-09 20:34:10 UTC
(In reply to Paul Floyd from comment #4)

> … (same hardware)

For test purposes: can you boot a fresh amd64 14.0-CURRENT environment (entirely separate from your installation of 13.2-RELEASE-p1) on the same host, then install VirtualBox and try FreeBSD-14.0-CURRENT-i386.vhd as the virtual hard disk for a 32-bit guest?
Comment 10 Graham Perrin freebsd_committer freebsd_triage 2023-07-09 20:48:56 UTC
kib@ as a tentative cc recipient, only because I stumbled across a discussion that preceded this bug report. In particular: 

<https://lists.freebsd.org/archives/freebsd-current/2022-December/003011.html> | <https://marc.info/?l=freebsd-current&m=167236126709464>
Comment 11 Paul Floyd 2023-07-10 11:30:35 UTC
(In reply to Graham Perrin from comment #9)

That would be a major effort.

I'm pretty sure this is a FreeBSD i386 bug, and not a VirtualBox host issue. Early builds of 14.0 i386 had no problem. My suspicion is that this is in some way related to hwcap.

What can I do from the kernel debugger prompt?

If I run 'bt' I just get into an endless loop of stuff like

vpanic(21359179,33705000,33705000,33705200,20521846,...) at vpanic+286/frame 0x33704968
panic(21359179,27288864,33705016,33685504,4,...) at panic+20/frame 0x33704988
--More--
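
The numbers in this backtrace look like they are printed in decimal (an assumption on my part). A small Python sketch to convert the `panic()` arguments to hex, so that they can be compared with the serial console output in comment #0:

```python
# Arguments of the panic() line above, read as decimal (assumption).
panic_args = [21359179, 27288864, 33705016, 33685504, 4]

# Format each value as hex with a 0x prefix.
hex_args = [f"{value:#x}" for value in panic_args]

for value, as_hex in zip(panic_args, hex_args):
    print(f"{value:>10} = {as_hex}")
```

Read this way, the third and fourth arguments are 0x2024c38 and 0x2020000, which follows the same pattern as the "stack ... not in kstack VA ... 4" message in comment #0.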
Comment 12 Yuri Pankov freebsd_committer freebsd_triage 2023-07-10 11:46:51 UTC
FWIW, I just tried booting FreeBSD-14.0-CURRENT-i386-20230706-884eaacd24bd-263985-disc1.iso (latest at the moment) in VirtualBox 7.0.8 running on Windows 11 host (host CPU is i9-11900K), no issues seen.  The VM has 2048MB of RAM and USB controllers removed, everything else is default.
Comment 13 Paul Floyd 2023-07-10 12:01:39 UTC
I can install on my old Solaris 11 machine (an Opteron CPU and stuck on VirtualBox 6)

So I do think that this is a bug related to the CPU hwcaps.
Comment 14 Graham Perrin freebsd_committer freebsd_triage 2023-07-11 07:03:42 UTC
Guest, installed (Paul, ignore what I wrote in IRC about long mode): 

Script started on Tue Jul 11 09:00:37 2023
root@i386-current:~ # uname -aKU
FreeBSD i386-current 14.0-CURRENT FreeBSD 14.0-CURRENT #0 main-n263985-884eaacd24bd: Thu Jul  6 07:55:21 UTC 2023     root@releng1.nyi.freebsd.org:/usr/obj/usr/src/i386.i386/sys/GENERIC i386 1400093 1400093
root@i386-current:~ # lscpu
Architecture:            i386
Byte Order:              Little Endian
Total CPU(s):            1
Thread(s) per core:      1
Core(s) per socket:      1
Socket(s):               1
Vendor:                  GenuineIntel
CPU family:              6
Model:                   58
Model name:              Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
Stepping:                9
L1d cache:               32K
L1i cache:               32K
L2 cache:                256K
L3 cache:                4M
Flags:                   fpu vme de pse tsc msr mce cx8 apic sep mtrr pge mca cmov pat pse36 cflsh mmx fxsr sse sse2 htt sse3 pclmulqdq monitor ssse3 cx16 pcid sse4_1 sse4_2 popcnt aes xsave osxsave avx rdrnd fsgsbase rdtscp lahf_lm
root@i386-current:~ # exit

Script done on Tue Jul 11 09:00:51 2023


----


Host: 

% uname -aKU
FreeBSD mowa219-gjp4-8570p-freebsd 14.0-CURRENT FreeBSD 14.0-CURRENT #0 main-n263900-cd9da8d072e4-dirty: Sat Jul  1 23:55:17 BST 2023     grahamperrin@mowa219-gjp4-8570p-freebsd:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64 1400092 1400092
% lscpu
Architecture:            amd64
Byte Order:              Little Endian
Total CPU(s):            4
Thread(s) per core:      2
Core(s) per socket:      2
Socket(s):               1
Vendor:                  GenuineIntel
CPU family:              6
Model:                   58
Model name:              Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
Stepping:                9
L1d cache:               32K
L1i cache:               32K
L2 cache:                256K
L3 cache:                4M
Flags:                   fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 cflsh ds acpi mmx fxsr sse sse2 ss htt tm pbe sse3 pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline aes xsave osxsave avx f16c rdrnd fsgsbase smep erms syscall rdtscp lm lahf_lm
%
Comment 15 Paul Floyd 2023-08-27 09:18:15 UTC
Not sure exactly when this changed, but I can now successfully install and boot 14.0 i386 alpha 3

FreeBSD freebsd 14.0-ALPHA3 FreeBSD 14.0-ALPHA3 i386 1400097 #0 stable/14-n265022-2af9390e54ed: Fri Aug 25 05:38:58 UTC 2023     root@releng1.nyi.freebsd.org:/usr/obj/usr/src/i386.i386/sys/GENERIC i386

Marking this closed / overcome by events.
Comment 16 Paul Floyd 2023-09-01 19:51:53 UTC
This came back with 15.0

FreeBSD-15.0-CURRENT-i386-20230831-e04c4b4a369d-265091-disc1.iso
Comment 17 Paul Floyd 2024-09-13 19:19:05 UTC
What's new: at least the following bugfixes (numbers are kde.org Bugzilla IDs).

202770  open fd at exit --log-socket=127.0.0.1:1500 with --track-fds=yes
276780  An instruction in fftw (Fast Fourier Transform) is unhandled by
        valgrind: vex x86->IR: unhandled instruction bytes:
        0x66 0xF 0x3A 0x2
311655  --log-file=FILE leads to apparent fd leak
317127  Fedora18/x86_64 --sanity-level=3 : aspacem segment mismatch
337388  fcntl works on Valgrind's own file descriptors
377966  arm64 unhandled instruction dc zva
391148  Unhandled AVX instruction vmovq %xmm9,%xmm1
392146  aarch64: unhandled instruction 0xD5380001 (MRS rT, midr_el1)
412377  SIGILL on cache flushes on arm64
417572  vex amd64->IR: unhandled instruction bytes: 0xC5 0x79 0xD6 0xED 0xC5
447989  Support Armv8.2 SHA-512 instructions
453044  gdbserver_tests failures in aarch64
479661  Valgrind leaks file descriptors
486293  memccpy false positives
487439  SIGILL in JDK11, JDK17
487993  Alignment error when using Eigen with Valgrind and -m32
488026  Use of `sizeof` instead of `strlen`
488379  --track-fds=yes errors that cannot be suppressed with --xml-file=
488441  Add tests for --track-fds=yes --xml=yes and fd suppression tests
489040  massif trace change to show the location increasing the stack
489088  Valgrind throws unhandled instruction bytes: 0xC5 0x79 0xD6 0xE0 0xC5
489338  arm64: Instruction fcvtas should round 322.5 to 323, but result is 322.
489676  vgdb handle EINTR and EAGAIN more consistently
490651  Stop using -flto-partition=one
491394  (vgModuleLocal_addDiCfSI): Assertion 'di->fsm.have_rx_map &&
        di->fsm.rw_map_count' failed
492663  Valgrind ignores debug info for some binaries
Comment 18 Paul Floyd 2024-09-13 19:21:18 UTC
Created attachment 253546 [details]
Update to latest sourceware package

Updates for FreeBSD 13.4, adds aarch64 support. See list of bugfixes from NEWS.
Comment 19 Paul Floyd 2024-09-13 19:23:29 UTC
Ignore all the previous changes - I was working with the wrong browser tab.
Comment 20 Mark Linimon freebsd_committer freebsd_triage 2024-09-14 00:19:29 UTC
(In reply to Paul Floyd from comment #19)
So is this patch about valgrind-devel functionality or ... ?
Comment 21 Paul Floyd 2024-09-14 03:37:20 UTC
(In reply to Mark Linimon from comment #20)
Yes it is for https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281475

I wasn’t paying attention to which browser tab I was adding stuff to.
Comment 22 Mark Linimon freebsd_committer freebsd_triage 2024-09-29 09:26:53 UTC
(In reply to Paul Floyd from comment #16)
Is this still a problem in 15-CURRENT?
Comment 23 Paul Floyd 2024-09-29 14:17:06 UTC
Yes.

The problem seems to go away with BETA and RELEASE versions. Isn't 15 going to remove 32-bit support? If so, the issue will go away at the end of the 14 cycle.