Bug 277200 - emulators/xen-kernel Dom0 built with FreeBSD 14 fails to boot
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: Roger Pau Monné
Duplicates: 277199
Reported: 2024-02-20 17:46 UTC by mgrooms
Modified: 2024-11-26 00:00 UTC

See Also:
bugzilla: maintainer-feedback? (royger)


Attachments
Booting FreeBSD 14 with xen-kernel built with 14 (149.47 KB, image/png)
2024-02-20 17:47 UTC, mgrooms
Fix clang codegen (3.83 KB, patch)
2024-02-21 17:01 UTC, Roger Pau Monné
Fix candidate for 14 (5.56 KB, patch)
2024-07-18 09:47 UTC, Roger Pau Monné

Description mgrooms 2024-02-20 17:46:11 UTC
When I attempt to boot FreeBSD using the Xen kernel compiled on FreeBSD 14 (LLVM 16.0.6), the FreeBSD kernel fails to boot. It gets stuck during the "Launching APs" step.

I force-installed the xen-kernel package from a FreeBSD 13.2 system and that boots fine, so I assume this is an issue with code generated by the newer LLVM. Boot video attached.
Comment 1 mgrooms 2024-02-20 17:47:44 UTC
Created attachment 248646
Booting FreeBSD 14 with xen-kernel built with 14
Comment 2 mgrooms 2024-02-20 17:48:05 UTC
Sorry. Video was too large. Included a pic.
Comment 3 Li-Wen Hsu freebsd_committer freebsd_triage 2024-02-20 20:56:58 UTC
*** Bug 277199 has been marked as a duplicate of this bug. ***
Comment 4 Roger Pau Monné freebsd_committer freebsd_triage 2024-02-21 08:16:52 UTC
Just picked up the 14.0 build of xen-kernel and can indeed reproduce this.  Looking into it.
Comment 5 Roger Pau Monné freebsd_committer freebsd_triage 2024-02-21 17:01:26 UTC
Created attachment 248661
Fix clang codegen

The following should fix it, will submit to xen-devel for review.
Comment 6 Roger Pau Monné freebsd_committer freebsd_triage 2024-02-22 08:19:42 UTC
For the record, here is the bug report against llvm:

https://github.com/llvm/llvm-project/issues/82598
Comment 7 mgrooms 2024-02-22 16:00:14 UTC
Thanks for the help with this Roger. It's very much appreciated.

I'll see if I can get the port to build with a patch based on your diff and give it a spin. On a related note, when I booted the FreeBSD 14 dom0 using the xen-kernel pkg built with 13.2, I noticed that interrupts were showing > 50% in top while idle. Did you notice anything like that? It 'seemed' responsive enough though, so it may just be an accounting bug of some sort.
Comment 8 Roger Pau Monné freebsd_committer freebsd_triage 2024-02-22 16:26:14 UTC
Hm, it's possible the build in 13.2 is also affected by this code generation issue, albeit in a different way.  I will update the port and add the fix while we wait for it to be reviewed upstream.
Comment 9 Roger Pau Monné freebsd_committer freebsd_triage 2024-02-22 17:32:38 UTC
I'm also seeing the weird interrupt usage in top, however `vmstat -i` doesn't show any interrupt source as having a high rate.  It will need some investigation, could you raise a separate ticket for it and assign it to me?

For what it's worth, I think it's a cosmetic issue, as performance seems to be OK (at least on my end).
Comment 10 Roger Pau Monné freebsd_committer freebsd_triage 2024-02-22 17:49:11 UTC
I've now updated both the xen-kernel and xen-tools package to 4.18.0.20240201 and included the code generation fix in xen-kernel.  I think we will also need it for the pvshim (which is part of xen-tools), but I will backport that one once it's accepted upstream, as it's not so critical.
Comment 11 mgrooms 2024-03-09 02:27:23 UTC
Hey Roger. I can confirm that FreeBSD 14 boots after upgrading to the following ...

xen-kernel: 4.18.0.20231212 -> 4.18.0.20240201
xen-tools: 4.18.0.20231212 -> 4.18.0.20240201_1

Thanks again for your help with this!
Comment 12 Marian Arlt 2024-07-01 16:04:15 UTC
I stumbled upon this very behavior on a Xen Dom0 server which I wanted to upgrade from 13.3-p3 to 14.x.
I get a kernel panic every single time, even with every other service disabled. I noticed that clang versions between these FreeBSD version numbers are all over the place. My current and working setup (had to roll back) is FreeBSD 13.3-RELEASE-p3 with Xen kernel 4.18.0.20240201 and clang version 17.0.6.

When I upgrade FreeBSD to 14.0-RELEASE-p6/7, it looks like clang is rolled back to 16.0.6. I can disable Xen at that point and boot the system fine, and then, for example, upgrade further to 14.1-RELEASE-p1, where clang is suddenly 18.1.5 but booting with Xen produces the same panic.
From this thread it looks to me like it might be hardware specific in relation to clang. But I really have no clue where to go with this.

This is the kernel panic console output including CPU specs:

FreeBSD 14.1-RELEASE releng/14.1-n267679-10e31f0946d8 GENERIC amd64
FreeBSD clang version 18.1.5 (https://github.com/llvm/llvm-project.git llvmorg-18.1.5-0-g617a15a9eac9)
VT(vga): text 80x25
XEN: Hypervisor version 4.18 detected.
CPU: Intel(R) Xeon(R) CPU E3-1230 V2 @ 3.30GHz (3292.52-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x306a9  Family=0x6  Model=0x3a  Stepping=9
  Features=0x1fc3fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT>
  Features2=0xbfba2203<SSE3,PCLMULQDQ,SSSE3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,HV>
  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  Structured Extended Features=0x281<FSGSBASE,SMEP,ERMS>
  Structured Extended Features3=0x20000000<ARCH_CAP>
  XSAVE Features=0x1<XSAVEOPT>
  IA32_ARCH_CAPS=0x4000000
  AMD Extended Feature Extensions ID EBX=0x100000
  TSC: P-state invariant
ACPI APIC Table: <ALASKA A M I>
Package ID shift: 4
L3 cache ID shift: 4
L2 cache ID shift: 1
L1 cache ID shift: 1
Core ID shift: 1
AP boot address 0x1000
panic: AP #1 (PHY# 1) failed!
cpuid = 0
time = 1
KDB: stack backtrace:
#0 0xffffffff80b7fbfd at kdb_backtrace+0x5d
#1 0xffffffff80b32961 at vpanic+0x131
#2 0xffffffff80b32823 at panic+0x43
#3 0xffffffff80fe22c2 at start_all_aps+0x592
#4 0xffffffff80fe1d23 at cpu_mp_start+0x1a3
#5 0xffffffff80b936de at topo_analyze+0x42e
#6 0xffffffff80abb425 at mi_startup+0xb5
#7 0xffffffff80fde0bc at xen_start32+0xbc
Uptime: 1s
Rebooting...
Comment 13 Roger Pau Monné freebsd_committer freebsd_triage 2024-07-01 16:44:09 UTC
(In reply to Marian Arlt from comment #12)
Do you have serial output configured for this box so that you get the Xen output on the serial?  Otherwise it's going to be complicated to diagnose.

If so, can you try to boot with xen_kernel="/boot/xen-debug" in loader.conf and paste the full output that you get on the serial?
Comment 14 Marian Arlt 2024-07-03 11:34:38 UTC
I'm currently trying to set this up. I do have a COM card and adapter cable for this box but I have had immense trouble setting this up in the past. I'm now trying again, but I'm suffering a lot. I doubt I'll get anywhere with this to be honest. If I ever succeed in getting more output I'll report it as soon as possible...
Comment 15 Roger Pau Monné freebsd_committer freebsd_triage 2024-07-03 13:51:08 UTC
(In reply to Marian Arlt from comment #14)
Given that you have a server grade CPU (Xeon(R) CPU E3-1230), doesn't your box have support for Serial over LAN?  It's usually possible to get the serial output from the box BMC.

If that's not possible, there are other ways to debug, it's just that using serial is the most reliable one.

Can you paste the contents of your /boot/loader.conf?  The xen-kernel package that you are using, is it from the pkg builders, or did you build it from ports?

Can you try to add 'vga=keep' to your xen_cmdline option and set xen_kernel="/boot/xen-debug" in loader.conf, and see if you get more output that way?
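
The two settings requested above can be checked mechanically before rebooting. This is only a minimal Python sketch, assuming loader.conf contains nothing but simple name="value" lines; the function names and the shortened sample command line are illustrative and not part of any FreeBSD or Xen tool.

```python
import shlex


def parse_loader_conf(text):
    """Return a dict of name -> value for simple name="value" lines."""
    settings = {}
    for line in text.splitlines():
        # shlex strips the quotes and drops '#' comments for us.
        for token in shlex.split(line, comments=True):
            name, sep, value = token.partition("=")
            if sep:
                settings[name] = value
    return settings


def check_xen_debug_setup(settings):
    """List what is missing for the debug boot requested in this comment."""
    problems = []
    if settings.get("xen_kernel") != "/boot/xen-debug":
        problems.append("xen_kernel is not /boot/xen-debug")
    if "vga=keep" not in settings.get("xen_cmdline", "").split():
        problems.append("xen_cmdline lacks vga=keep")
    return problems


# Hypothetical, shortened loader.conf content used only for illustration.
SAMPLE = '''
xen_kernel="/boot/xen-debug"
xen_cmdline="dom0=pvh vga=keep"  # shortened command line
'''

print(check_xen_debug_setup(parse_loader_conf(SAMPLE)))  # prints []
```

On a real system the text would come from reading /boot/loader.conf; files that rely on shell-style features beyond plain assignments would need a more careful parser.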
Comment 16 Marian Arlt 2024-07-05 19:36:55 UTC
Working(!) config in 13.3-RELEASE-p3:

/boot/loader.conf
autoboot_delay="3"
loader_logo="none"
loader_color="NO"
beastie_disable="YES"
vbe_max_resolution="1080p"
kern.geom.label.disk_ident.enable="0"
kern.geom.label.gptid.enable="0"
kern.geom.label.ufsid.enable="0"
cpu_microcode_load="YES"
cpu_microcode_name="/boot/firmware/intel-ucode.bin"
cryptodev_load="YES"
fusefs_load="YES"
zfs_load="YES"
if_tap_load="YES"
if_vlan_load="YES"
xen_kernel="/boot/xen-debug"
xen_cmdline="dom0_mem=16G,max:16G dom0=pvh,verbose console=com1,vga com1=115200,8n1 vga=keep iommu=debug guest_loglvl=all loglvl=all"

Can be reduced to only xen parameters without anything else and still produce the same behavior (works in 13, panics in 14).
xen-kernel-4.18.0.20240201 from FreeBSD pkg repositories.

Xen Serial output:

(XEN) Xen version 4.18.1-pre (root@) (FreeBSD clang version 16.0.6 (https://github.com/llvm/llvm-project.git llvmorg-16.0.6-0-g7cbf1a259152)) debug=n Fri Jun 21 03:09:30 UTC 2024
(XEN) Latest ChangeSet:
(XEN) build-id: 55a2d6e0449824df8e256a5bf4aa303975934cd6
(XEN) Bootloader: FreeBSD Loader
(XEN) Command line: dom0_mem=16G,max:16G dom0=pvh,verbose console=com1,vga com1=115200,8n1 vga=keep iommu=debug guest_loglvl=all loglvl=all
(XEN) Xen image load base address: 0
(XEN) Video information:
(XEN)  VGA is graphics mode 1280x1024, 16 bpp
(XEN)  VBE/DDC methods: V2; EDID transfer time: 1 seconds
(XEN) Disc information:
(XEN)  Found 6 MBR signatures
(XEN)  Found 6 EDD information structures
(XEN) CPU Vendor: Intel, Family 6 (0x6), Model 58 (0x3a), Stepping 9 (raw 000306a9)
(XEN) Xen-e820 RAM map:
(XEN)  [0000000000000000, 000000000009d7ff] (usable)
(XEN)  [000000000009d800, 000000000009ffff] (reserved)
(XEN)  [00000000000e0000, 00000000000fffff] (reserved)
(XEN)  [0000000000100000, 00000000ddc24fff] (usable)
(XEN)  [00000000ddc25000, 00000000de0f4fff] (reserved)
(XEN)  [00000000de0f5000, 00000000de0f5fff] (ACPI data)
(XEN)  [00000000de0f6000, 00000000de21ffff] (ACPI NVS)
(XEN)  [00000000de220000, 00000000dea44fff] (reserved)
(XEN)  [00000000dea45000, 00000000dea45fff] (usable)
(XEN)  [00000000dea46000, 00000000dea88fff] (ACPI NVS)
(XEN)  [00000000dea89000, 00000000df46bfff] (usable)
(XEN)  [00000000df46c000, 00000000df7d7fff] (reserved)
(XEN)  [00000000df7d8000, 00000000df7fffff] (usable)
(XEN)  [00000000f8000000, 00000000fbffffff] (reserved)
(XEN)  [00000000fec00000, 00000000fec00fff] (reserved)
(XEN)  [00000000fed00000, 00000000fed03fff] (reserved)
(XEN)  [00000000fed1c000, 00000000fed1ffff] (reserved)
(XEN)  [00000000fee00000, 00000000fee00fff] (reserved)
(XEN)  [00000000ff000000, 00000000ffffffff] (reserved)
(XEN)  [0000000100000000, 000000081effffff] (usable)
(XEN) New Xen image base address: 0xdee00000
(XEN) ACPI: RSDP 000F0490, 0024 (r2 ALASKA)
(XEN) ACPI: XSDT DE201070, 0064 (r1 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: FACP DE20AC78, 00F4 (r4 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: DSDT DE201170, 9B05 (r2 ALASKA    A M I       12 INTL 20051117)
(XEN) ACPI: FACS DE21EF80, 0040
(XEN) ACPI: APIC DE20AD70, 0092 (r3 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: MCFG DE20AE08, 003C (r1 ALASKA    A M I  1072009 MSFT       97)
(XEN) ACPI: HPET DE20AE48, 0038 (r1 ALASKA    A M I  1072009 AMI.        5)
(XEN) ACPI: SSDT DE20AE80, 036D (r1 SataRe SataTabl     1000 INTL 20091112)
(XEN) ACPI: SSDT DE20B1F0, 09AA (r1  PmRef  Cpu0Ist     3000 INTL 20051117)
(XEN) ACPI: SSDT DE20BBA0, 0A92 (r1  PmRef    CpuPm     3000 INTL 20051117)
(XEN) ACPI: DMAR DE20C638, 0080 (r1 INTEL      SNB         1 INTL        1)
(XEN) System RAM: 32725MB (33511224kB)
(XEN) No NUMA configuration found
(XEN) Faking a node at 0000000000000000-000000081f000000
(XEN) Domain heap initialised
(XEN) vesafb: framebuffer at 0x00000000e0000000, mapped to 0xffff82c000201000, using 4096k, total 16384k
(XEN) vesafb: mode is 1280x1024x16, linelength=2560, font 8x16
(XEN) vesafb: Truecolor: size=0:5:5:5, shift=0:10:5:0
(XEN) found SMP MP-table at 000fd7b0
(XEN) DMI 2.7 present.
(XEN) [VT-D]Host address width 36
(XEN) [VT-D]found ACPI_DMAR_DRHD:
(XEN) [VT-D]  dmaru->address = fed90000
(XEN) [VT-D]drhd->address = fed90000 iommu->reg = ffff82c000606000
(XEN) [VT-D]cap = c9008020660262 ecap = f0105a
(XEN) [VT-D] IOAPIC: 0000:f0:1f.0
(XEN) [VT-D] MSI HPET: 0000:f0:0f.0
(XEN) [VT-D]  flags: INCLUDE_ALL
(XEN) [VT-D]found ACPI_DMAR_RMRR:
(XEN) [VT-D] endpoint: 0000:00:1d.0
(XEN) [VT-D] endpoint: 0000:00:1a.0
(XEN) [VT-D] endpoint: 0000:00:14.0
(XEN) Using APIC driver default
(XEN) ACPI: PM-Timer IO Port: 0x408 (24 bits)
(XEN) ACPI: SLEEP INFO: pm1x_cnt[1:404,1:0], pm1x_evt[1:400,1:0]
(XEN) ACPI: 32/64X FACS address mismatch in FADT - de21ef80/0000000000000000, using 32
(XEN) ACPI:             wakeup_vec[de21ef8c], vec_size[20]
(XEN) ACPI: Local APIC address 0xfee00000
(XEN) ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
(XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
(XEN) ACPI: IRQ0 used by override.
(XEN) ACPI: IRQ2 used by override.
(XEN) ACPI: IRQ9 used by override.
(XEN) ACPI: HPET id: 0x8086a701 base: 0xfed00000
(XEN) PCI: MCFG configuration 0: base f8000000 segment 0000 buses 00 - 3f
(XEN) PCI: MCFG area at f8000000 reserved in E820
(XEN) PCI: Using MCFG for segment 0000 bus 00-3f
(XEN) Using ACPI (MADT) for SMP configuration information
(XEN) SMP: Allowing 8 CPUs (0 hotplug CPUs)
(XEN) IRQ limits: 24 GSI, 1640 MSI/MSI-X
(XEN) Switched to APIC driver x2apic_mixed
(XEN) re-enabled NX (Execute Disable) protection
(XEN) CPU0: 1600 ... 3300 MHz
(XEN) xstate: size: 0x340 and states: 0x7
(XEN) CPU0: Intel machine check reporting enabled
(XEN) Speculative mitigation facilities:
(XEN)   Hardware hints:
(XEN)   Hardware features:
(XEN)   Compiled-in support: INDIRECT_THUNK SHADOW_PAGING
(XEN)   Xen settings: BTI-Thunk RETPOLINE, SPEC_CTRL: No, Other: BRANCH_HARDEN
(XEN)   L1TF: believed vulnerable, maxphysaddr L1D 46, CPUID 36, Safe address 1000000000
(XEN)   Support for HVM VMs: RSB EAGER_FPU
(XEN)   Support for PV VMs: EAGER_FPU
(XEN)   XPTI (64-bit PV only): Dom0 enabled, DomU enabled (without PCID)
(XEN)   PV L1TF shadowing: Dom0 disabled, DomU enabled
(XEN) Using scheduler: SMP Credit Scheduler rev2 (credit2)
(XEN) Initializing Credit2 scheduler
(XEN)  load_precision_shift: 18
(XEN)  load_window_shift: 30
(XEN)  underload_balance_tolerance: 0
(XEN)  overload_balance_tolerance: -3
(XEN)  runqueues arrangement: socket
(XEN)  cap enforcement granularity: 10ms
(XEN) load tracking window length 1073741824 ns
(XEN) Platform timer is 14.318MHz HPET
(XEN) Detected 3292.522 MHz processor.
(XEN) Freed 1024kB unused BSS memory
(XEN) alt table ffff82d04044bc48 -> ffff82d04045a292
(XEN) Intel VT-d iommu 0 supported page sizes: 4kB
(XEN) Intel VT-d Snoop Control not enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation enabled.
(XEN) Intel VT-d Interrupt Remapping enabled.
(XEN) Intel VT-d Posted Interrupt not enabled.
(XEN) Intel VT-d Shared EPT tables not enabled.
(XEN) I/O virtualisation enabled
(XEN)  - Dom0 mode: Relaxed
(XEN) Interrupt remapping enabled
(XEN) nr_sockets: 1
(XEN) Enabled directed EOI with ioapic_ack_old on!
(XEN) Enabling APIC mode.  Using 1 I/O APICs
(XEN) ENABLING IO-APIC IRQs
(XEN)  -> Using old ACK method
(XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
(XEN) TSC deadline timer enabled
(XEN) Allocated console ring of 64 KiB.
(XEN) mwait-idle: MWAIT substates: 0x1120
(XEN) mwait-idle: v0.4.1 model 0x3a
(XEN) mwait-idle: lapic_timer_reliable_states 0xffffffff
(XEN) VMX: Supported advanced features:
(XEN)  - APIC MMIO access virtualisation
(XEN)  - APIC TPR shadow
(XEN)  - Extended Page Tables (EPT)
(XEN)  - Virtual-Processor Identifiers (VPID)
(XEN)  - Virtual NMI
(XEN)  - MSR direct-access bitmap
(XEN)  - Unrestricted Guest
(XEN) HVM: ASIDs enabled.
(XEN) VMX: Disabling executable EPT superpages due to CVE-2018-12207
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging (HAP) detected
(XEN) HVM: HAP page sizes: 4kB, 2MB
(XEN) alt table ffff82d04044bc48 -> ffff82d04045a292
(XEN) Brought up 8 CPUs
(XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
(XEN) Initializing Credit2 scheduler
(XEN)  load_precision_shift: 18
(XEN)  load_window_shift: 30
(XEN)  underload_balance_tolerance: 0
(XEN)  overload_balance_tolerance: -3
(XEN)  runqueues arrangement: socket
(XEN)  cap enforcement granularity: 10ms
(XEN) load tracking window length 1073741824 ns
(XEN) Adding cpu 0 to runqueue 0
(XEN)  First cpu on runqueue, activating
(XEN) Adding cpu 1 to runqueue 0
(XEN) Adding cpu 2 to runqueue 0
(XEN) Adding cpu 3 to runqueue 0
(XEN) Adding cpu 4 to runqueue 0
(XEN) Adding cpu 5 to runqueue 0
(XEN) Adding cpu 6 to runqueue 0
(XEN) Adding cpu 7 to runqueue 0
(XEN) mcheck_poll: Machine check polling timer started.
(XEN) NX (Execute Disable) protection active
(XEN) Dom0 has maximum 856 PIRQs
(XEN) *** Building a PVH Dom0 ***
(XEN) [VT-D]d0:Hostbridge: skip 0000:00:00.0 map
(XEN) Bogus DMIBAR 0xfed18001 on 0000:00:00.0
(XEN) [VT-D]d0:PCI: map 0000:00:14.0
(XEN) [VT-D]d0:PCI: map 0000:00:16.0
(XEN) [VT-D]d0:PCI: map 0000:00:1a.0
(XEN) [VT-D]d0:PCIe: map 0000:00:1b.0
(XEN) [VT-D]d0:PCI: map 0000:00:1d.0
(XEN) [VT-D]d0:PCI: map 0000:00:1f.0
(XEN) [VT-D]d0:PCI: map 0000:00:1f.2
(XEN) [VT-D]d0:PCI: map 0000:00:1f.3
(XEN) [VT-D]d0:PCIe: map 0000:01:00.0
(XEN) [VT-D]d0:PCIe: map 0000:01:00.1
(XEN) [VT-D]d0:PCIe: map 0000:05:00.0
(XEN) [VT-D]d0:PCIe: map 0000:06:00.0
(XEN) [VT-D]iommu_enable_translation: iommu->reg = ffff82c000606000
(XEN) Dom0 memory allocation stats:
(XEN) order  0 allocations: 4
(XEN) order  1 allocations: 2
(XEN) order  2 allocations: 4
(XEN) order  3 allocations: 3
(XEN) order  4 allocations: 3
(XEN) order  5 allocations: 5
(XEN) order  6 allocations: 2
(XEN) order  7 allocations: 1
(XEN) order  8 allocations: 2
(XEN) order  9 allocations: 2
(XEN) order 10 allocations: 4
(XEN) order 11 allocations: 3
(XEN) order 12 allocations: 3
(XEN) order 13 allocations: 1
(XEN) order 14 allocations: 2
(XEN) order 15 allocations: 2
(XEN) order 16 allocations: 2
(XEN) order 17 allocations: 2
(XEN) order 18 allocations: 14
(XEN) Dom0 memory map:
(XEN)  [0000000000000000, 000000000009cfff] (usable)
(XEN)  [000000000009d800, 000000000009ffff] (reserved)
(XEN)  [00000000000e0000, 00000000000fffff] (reserved)
(XEN)  [0000000000100000, 00000000ddc24fff] (usable)
(XEN)  [00000000ddc25000, 00000000de0f4fff] (reserved)
(XEN)  [00000000de0f5000, 00000000de0f5fff] (ACPI data)
(XEN)  [00000000de0f6000, 00000000de21ffff] (ACPI NVS)
(XEN)  [00000000de220000, 00000000dea44fff] (reserved)
(XEN)  [00000000dea45000, 00000000dea45fff] (usable)
(XEN)  [00000000dea46000, 00000000dea88fff] (ACPI NVS)
(XEN)  [00000000dea89000, 00000000df46bfff] (usable)
(XEN)  [00000000df46c000, 00000000df7d7fff] (reserved)
(XEN)  [00000000df7d8000, 00000000df7ffebb] (usable)
(XEN)  [00000000df7ffebc, 00000000df7fff87] (ACPI data)
(XEN)  [00000000f8000000, 00000000fbffffff] (reserved)
(XEN)  [00000000fec00000, 00000000fec00fff] (reserved)
(XEN)  [00000000fed00000, 00000000fed03fff] (reserved)
(XEN)  [00000000fed1c000, 00000000fed1ffff] (reserved)
(XEN)  [00000000fee00000, 00000000fee00fff] (reserved)
(XEN)  [00000000ff000000, 00000000ffffffff] (reserved)
(XEN)  [0000000100000000, 0000000421a31fff] (usable)
(XEN)  [0000000421a32000, 000000081effffff] (unusable)
(XEN) WARNING: PVH is an experimental mode with limited functionality
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Scrubbing Free RAM in background
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) ***************************************************
(XEN) Booted on L1TF-vulnerable hardware with SMT/Hyperthreading
(XEN) enabled.  Please assess your configuration and choose an
(XEN) explicit 'smt=<bool>' setting.  See XSA-273.
(XEN) ***************************************************
(XEN) Booted on MLPDS/MFBDS-vulnerable hardware with SMT/Hyperthreading
(XEN) enabled.  Mitigations will not be fully effective.  Please
(XEN) choose an explicit smt=<bool> setting.  See XSA-297.
(XEN) ***************************************************
(XEN) 3... 2... 1...
(XEN) Xen is keeping VGA console.
(XEN) Boot video device 01:00.0
(XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
(XEN) Freed 656kB init memory
unable to add kenv smbios.chassis.version=To Be Filled By O.E.M.
unable to add kenv smbios.planar.location=To be filled by O.E.M.
unable to add kenv smbios.planar.maker=Gigabyte Technology Co., Ltd.
unable to add kenv smbios.planar.serial=To be filled by O.E.M.
unable to add kenv smbios.planar.tag=To be filled by O.E.M.
unable to add kenv smbios.planar.version=x.x
unable to add kenv smbios.socket.enabled=1
unable to add kenv smbios.socket.populated=1
unable to add kenv smbios.system.family=To be filled by O.E.M.
unable to add kenv smbios.system.maker=Gigabyte Technology Co., Ltd.
unable to add kenv smbios.system.product=To be filled by O.E.M.
unable to add kenv smbios.system.serial=To be filled by O.E.M.
unable to add kenv smbios.system.sku=To be filled by O.E.M.
unable to add kenv smbios.system.uuid=032b0290-0434-0518-d206-b80700080009
unable to add kenv smbios.system.version=To be filled by O.E.M.
unable to add kenv smbios.version=2.7
unable to add kenv splash_bmp_load=NO
unable to add kenv splash_pcx_load=NO
unable to add kenv splash_txt_load=NO
unable to add kenv teken.bg_color=0
unable to add kenv teken.fg_color=7
unable to add kenv twiddle_divisor=16
unable to add kenv vbe_max_resolution=1080p
unable to add kenv verbose_loading=NO
unable to add kenv vesa_load=NO
unable to add kenv vfs.root.mountfrom=zfs:freebsd/ROOT/default
unable to add kenv xen_cmdline=dom0_mem=16G,max:16G dom0=pvh,verbose console=com1,vga com1=115200,8n1 vga=keep iommu=debug guest_loglvl=all loglvl=all
unable to add kenv xen_kernel=/boot/xen-debug
unable to add kenv zfs_be_active=zfs:freebsd/ROOT/default
unable to add kenv zfs_be_currpage=1
unable to add kenv zfs_be_root=freebsd/ROOT
unable to add kenv zfs_load=YES
No VBE FB in kernel metadata
(XEN) d0v0: upcall vector 93
(XEN) Error: INIT received - ignoring
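
As a sanity check on the log above, the "System RAM: 32725MB (33511224kB)" line can be reproduced from the Xen-e820 RAM map by counting the whole 4 KiB pages inside each usable range. The Python sketch below does that; whether Xen computes the figure exactly this way is an assumption, and the function name is illustrative.

```python
import re

# Ranges copied from the "Xen-e820 RAM map" in the serial log above.
XEN_E820_MAP = """\
 [0000000000000000, 000000000009d7ff] (usable)
 [000000000009d800, 000000000009ffff] (reserved)
 [00000000000e0000, 00000000000fffff] (reserved)
 [0000000000100000, 00000000ddc24fff] (usable)
 [00000000ddc25000, 00000000de0f4fff] (reserved)
 [00000000de0f5000, 00000000de0f5fff] (ACPI data)
 [00000000de0f6000, 00000000de21ffff] (ACPI NVS)
 [00000000de220000, 00000000dea44fff] (reserved)
 [00000000dea45000, 00000000dea45fff] (usable)
 [00000000dea46000, 00000000dea88fff] (ACPI NVS)
 [00000000dea89000, 00000000df46bfff] (usable)
 [00000000df46c000, 00000000df7d7fff] (reserved)
 [00000000df7d8000, 00000000df7fffff] (usable)
 [00000000f8000000, 00000000fbffffff] (reserved)
 [00000000fec00000, 00000000fec00fff] (reserved)
 [00000000fed00000, 00000000fed03fff] (reserved)
 [00000000fed1c000, 00000000fed1ffff] (reserved)
 [00000000fee00000, 00000000fee00fff] (reserved)
 [00000000ff000000, 00000000ffffffff] (reserved)
 [0000000100000000, 000000081effffff] (usable)
"""

PAGE_SIZE = 4096
RANGE_RE = re.compile(r"\[([0-9a-f]{16}), ([0-9a-f]{16})\] \(([^)]+)\)")


def usable_kib(e820_text):
    """Sum the whole 4 KiB pages of all 'usable' ranges, in KiB."""
    pages = 0
    for start, end, kind in RANGE_RE.findall(e820_text):
        if kind != "usable":
            continue
        first = -(-int(start, 16) // PAGE_SIZE)   # round the start up
        last = (int(end, 16) + 1) // PAGE_SIZE    # end is inclusive, round down
        pages += max(0, last - first)
    return pages * PAGE_SIZE // 1024


total = usable_kib(XEN_E820_MAP)
print(total, total // 1024)  # prints: 33511224 32725
```

Only the first range is not page aligned (it ends at 0x9d7ff), which is why a plain byte sum would come out 2 KiB higher than the figure Xen prints.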
Comment 17 Marian Arlt 2024-07-05 19:48:39 UTC
I notice that this minicom capture says it is booting with clang 16.0.6, which was the version in 14.0-RELEASE. But for this capture I explicitly upgraded from 13.3 directly to 14.1, which uses 18.1.5, and earlier panics also happened booting with 18.1.5.
Strange. There's very little consistency here. Kind of upsetting to be honest. I will try to get another capture with clang 18.1.5 and see if it produces the same output with the same config.
Comment 18 Marian Arlt 2024-07-05 20:07:35 UTC
Wow! On a monitor (connected HDMI to GPU) it prints that it's booting with clang 18.1.5 but in the serial output over COM it says it's booting with clang 16.0.6 as seen in the capture above. This is with FreeBSD 14.1-RELEASE-p2
So which one is it? I doubt this is normal?
Comment 19 Marian Arlt 2024-07-05 20:35:08 UTC
Ok last comment. I observe this also on my working 13.3-RELEASE-p3 setup.
clang -v and the monitor show 17.0.6 but the serial Xen output mentions 14.0.5 instead. But it still works (same config as above).
Comment 20 Roger Pau Monné freebsd_committer freebsd_triage 2024-07-08 08:34:57 UTC
Hello,

Thanks for being able to gather this output, this is helpful.

A couple of things I've noticed:

In /boot/loader.conf you have:

xen_kernel="/boot/xen-debug"

Yet in the Xen output you posted:

(XEN) Xen version 4.18.1-pre (root@) (FreeBSD clang version 16.0.6 (https://github.com/llvm/llvm-project.git llvmorg-16.0.6-0-g7cbf1a259152)) debug=n Fri Jun 21 03:09:30 UTC 2024

Notice the 'debug=n', which is what you get from a non-debug build. Is there a mismatch between the contents of your loader.conf and what you used to boot Xen?  Otherwise I cannot understand why your xen-debug image is built with 'debug=n'.
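
This comparison between the configured image and the banner can be scripted when going through several captures. A rough Python sketch, assuming the banner always has the shape quoted above and that Xen was built with clang; the regular expression and function names are illustrative, not something Xen provides.

```python
import re

# Banner line quoted above from the serial capture.
BANNER = (
    "(XEN) Xen version 4.18.1-pre (root@) (FreeBSD clang version 16.0.6 "
    "(https://github.com/llvm/llvm-project.git llvmorg-16.0.6-0-g7cbf1a259152)) "
    "debug=n Fri Jun 21 03:09:30 UTC 2024"
)

BANNER_RE = re.compile(
    r"\(XEN\) Xen version (?P<version>\S+) .*?"
    r"clang version (?P<clang>[\d.]+) .*"
    r"\bdebug=(?P<debug>[yn])\b"
)


def parse_xen_banner(line):
    """Extract Xen version, clang version and debug flag, or None."""
    match = BANNER_RE.search(line)
    if match is None:
        return None
    return {
        "version": match["version"],
        "clang": match["clang"],
        "debug": match["debug"] == "y",
    }


def debug_mismatch(xen_kernel_path, banner_line):
    """True if a xen-debug image was configured but the banner says debug=n."""
    info = parse_xen_banner(banner_line)
    return (
        info is not None
        and xen_kernel_path.endswith("xen-debug")
        and not info["debug"]
    )


print(parse_xen_banner(BANNER))
print(debug_mismatch("/boot/xen-debug", BANNER))  # prints True
```

The same parse also surfaces the compiler version that actually built the running hypervisor, which is the version that matters for the code generation issue, rather than whatever clang is installed on the system.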

You say you are using 'xen-kernel-4.18.0.20240201', can you try with the newer 'xen-kernel-4.18.2.20240411'?

The 'unable to add kenv ...' messages also seem concerning, as I've never seen them on my setup.

When you mention upgrading from 13.3 to 14.1, do you make sure your packages are also updated?

Finally, could you attempt to set the following in loader.conf in order to get more FreeBSD output on the serial console:

boot_multicons="YES"
boot_serial="YES"

It will be helpful if you can obtain another serial trace with the above set and using 'xen-kernel-4.18.2.20240411'.  Also remember to set xen_kernel="/boot/xen-debug" in loader.conf.

Thanks, Roger.
Comment 21 Marian Arlt 2024-07-09 16:23:54 UTC
My working setup does indeed say "debug=n" even though it starts with:
Loading Xen kernel...
/boot/xen-debug data=0x2717a4+0x13285c /
There is no configuration mismatch from my end here,
I use vanilla pkg xen-kernel and the xen_cmdline I showed here.
No funny business or customizations or anything.

As per your request I did repeat this again:
- booted my working xen-kernel from pkg repositories with 13.3-RELEASE and config shown earlier
- commented Xen out in /boot/loader.conf
- upgraded good working 13.3-RELEASE-p4 to 14.1-RELEASE-p2 and finished upgrade process
- rebuilt installed packages with pkg-static upgrade -f
- with disabled Xen made sure the machine boots fine, which it did several times
- deinstalled pkg xen-kernel and compiled ports xen-kernel-4.18.2.20240411
  (I usually do not use ports at all. This was the first port ever compiled on this machine)
- enabled Xen and rebooted with combined boot log:
  console="comconsole"
  boot_serial="YES"
  xen_kernel="/boot/xen-debug"
  xen_cmdline="dom0_mem=16G,max:16G dom0=pvh,verbose console=com1 com1=115200,8n1 iommu=debug guest_loglvl=all loglvl=all"

This combination of console settings is the only one I can get to work that gives me the verbose FreeBSD boot and the Xen boot log on the same serial line. This works fine with my 13.3 setup.
This time around debug=y and the clang version are both correct, but since I only use the serial console at this point, I am not able to observe whether or not there are mismatches between vidconsole and comconsole, like above.

Thank you for showing interest.

config: -h -S115200

Consoles: serial port  
BIOS drive C: is disk0
BIOS drive D: is disk1
BIOS drive E: is disk2
BIOS drive F: is disk3
BIOS drive G: is disk4
BIOS drive H: is disk5
BIOS 630kB/3629864kB available memory

FreeBSD/x86 bootstrap loader, Revision 1.1
Loading /boot/defaults/loader.conf
Loading /boot/defaults/loader.conf
Loading /boot/device.hints
Loading /boot/loader.conf
Loading /boot/loader.conf.local
Loading Xen kernel...
/boot/xen-debug data=0x2c77d4+0x13282c 
elf32_lookup_symbol: corrupt symbol table
Loading kernel...
/boot/kernel/kernel size=0x1bc46b0
Loading configured modules...
/boot/kernel/zfs.ko | size 0x5cd608 at 0x21c0000
/boot/kernel/cryptodev.ko size 0x77d8 at 0x278e000
/boot/entropy size=0x1000
/boot/firmware/intel-ucode.bin size=0xc55800
/etc/hostid size=0x25
/boot/kernel/fusefs.ko size 0x27478 at 0x33ec000

Hit [Enter] to boot immediately, or any other key for command prompt.
Booting [/boot/kernel/kernel] in 3 seconds... 
Booting [/boot/kernel/kernel] in 2 seconds... 
Booting [/boot/kernel/kernel] in 1 second... 
Booting [/boot/kernel/kernel]...               
 Xen 4.18.2
(XEN) Xen version 4.18.2 (root@) (FreeBSD clang version 18.1.5 (https://github.com/llvm/llvm-project.git llvmorg-18.1.5-0-g617a15a9eac9)) debug=y Tue Jul  9 16:31:43 CEST 2024
(XEN) Latest ChangeSet: 
(XEN) build-id: ae783a8ac576133c9f0c69f7ca3c90ae9635aaf6
(XEN) Bootloader: FreeBSD Loader
(XEN) Command line: dom0_mem=16G,max:16G dom0=pvh,verbose console=com1 com1=115200,8n1 iommu=debug guest_loglvl=all loglvl=all
(XEN) Xen image load base address: 0
(XEN) Video information:
(XEN)  No VGA detected
(XEN) Disc information:
(XEN)  Found 6 MBR signatures
(XEN)  Found 6 EDD information structures
(XEN) CPU Vendor: Intel, Family 6 (0x6), Model 58 (0x3a), Stepping 9 (raw 000306a9)
(XEN) Xen-e820 RAM map:
(XEN)  [0000000000000000, 000000000009d7ff] (usable)
(XEN)  [000000000009d800, 000000000009ffff] (reserved)
(XEN)  [00000000000e0000, 00000000000fffff] (reserved)
(XEN)  [0000000000100000, 00000000dd9c9fff] (usable)
(XEN)  [00000000dd9ca000, 00000000dde99fff] (reserved)
(XEN)  [00000000dde9a000, 00000000dde9afff] (ACPI data)
(XEN)  [00000000dde9b000, 00000000ddfc4fff] (ACPI NVS)
(XEN)  [00000000ddfc5000, 00000000dea44fff] (reserved)
(XEN)  [00000000dea45000, 00000000dea45fff] (usable)
(XEN)  [00000000dea46000, 00000000dea88fff] (ACPI NVS)
(XEN)  [00000000dea89000, 00000000df46bfff] (usable)
(XEN)  [00000000df46c000, 00000000df7d7fff] (reserved)
(XEN)  [00000000df7d8000, 00000000df7fffff] (usable)
(XEN)  [00000000f8000000, 00000000fbffffff] (reserved)
(XEN)  [00000000fec00000, 00000000fec00fff] (reserved)
(XEN)  [00000000fed00000, 00000000fed03fff] (reserved)
(XEN)  [00000000fed1c000, 00000000fed1ffff] (reserved)
(XEN)  [00000000fee00000, 00000000fee00fff] (reserved)
(XEN)  [00000000ff000000, 00000000ffffffff] (reserved)
(XEN)  [0000000100000000, 000000081effffff] (usable)
(XEN) New Xen image base address: 0xdee00000
(XEN) ACPI: RSDP 000F0490, 0024 (r2 ALASKA)
(XEN) ACPI: XSDT DDFA6070, 0064 (r1 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: FACP DDFAFC78, 00F4 (r4 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: DSDT DDFA6170, 9B05 (r2 ALASKA    A M I       12 INTL 20051117)
(XEN) ACPI: FACS DDFC3F80, 0040
(XEN) ACPI: APIC DDFAFD70, 0092 (r3 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: MCFG DDFAFE08, 003C (r1 ALASKA    A M I  1072009 MSFT       97)
(XEN) ACPI: HPET DDFAFE48, 0038 (r1 ALASKA    A M I  1072009 AMI.        5)
(XEN) ACPI: SSDT DDFAFE80, 036D (r1 SataRe SataTabl     1000 INTL 20091112)
(XEN) ACPI: SSDT DDFB01F0, 09AA (r1  PmRef  Cpu0Ist     3000 INTL 20051117)
(XEN) ACPI: SSDT DDFB0BA0, 0A92 (r1  PmRef    CpuPm     3000 INTL 20051117)
(XEN) ACPI: DMAR DDFB1638, 0080 (r1 INTEL      SNB         1 INTL        1)
(XEN) System RAM: 32723MB (33508812kB)
(XEN) No NUMA configuration found
(XEN) Faking a node at 0000000000000000-000000081f000000
(XEN) Domain heap initialised
(XEN) found SMP MP-table at 000fd7b0
(XEN) DMI 2.7 present.
(XEN) [VT-D]Host address width 36
(XEN) [VT-D]found ACPI_DMAR_DRHD:
(XEN) [VT-D]  dmaru->address = fed90000
(XEN) [VT-D]drhd->address = fed90000 iommu->reg = ffff82c000205000
(XEN) [VT-D]cap = c9008020660262 ecap = f0105a
(XEN) [VT-D] IOAPIC: 0000:f0:1f.0
(XEN) [VT-D] MSI HPET: 0000:f0:0f.0
(XEN) [VT-D]  flags: INCLUDE_ALL
(XEN) [VT-D]found ACPI_DMAR_RMRR:
(XEN) [VT-D] endpoint: 0000:00:1d.0
(XEN) [VT-D] endpoint: 0000:00:1a.0
(XEN) [VT-D] endpoint: 0000:00:14.0
(XEN) [VT-D]drivers/passthrough/vtd/dmar.c:615:  RMRR: [dde31000,dde3dfff]
(XEN) Using APIC driver default
(XEN) ACPI: PM-Timer IO Port: 0x408 (24 bits)
(XEN) ACPI: SLEEP INFO: pm1x_cnt[1:404,1:0], pm1x_evt[1:400,1:0]
(XEN) ACPI: 32/64X FACS address mismatch in FADT - ddfc3f80/0000000000000000, using 32
(XEN) ACPI:             wakeup_vec[ddfc3f8c], vec_size[20]
(XEN) ACPI: Local APIC address 0xfee00000
(XEN) ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
(XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
(XEN) ACPI: IRQ0 used by override.
(XEN) ACPI: IRQ2 used by override.
(XEN) ACPI: IRQ9 used by override.
(XEN) ACPI: HPET id: 0x8086a701 base: 0xfed00000
(XEN) PCI: MCFG configuration 0: base f8000000 segment 0000 buses 00 - 3f
(XEN) PCI: MCFG area at f8000000 reserved in E820
(XEN) PCI: Using MCFG for segment 0000 bus 00-3f
(XEN) Using ACPI (MADT) for SMP configuration information
(XEN) SMP: Allowing 8 CPUs (0 hotplug CPUs)
(XEN) IRQ limits: 24 GSI, 1640 MSI/MSI-X
(XEN) [VT-D]drivers/passthrough/vtd/qinval.c:423: QI: using 256-entry ring(s)
(XEN) Switched to APIC driver x2apic_mixed
(XEN) re-enabled NX (Execute Disable) protection
(XEN) CPU0: 1600 ... 3300 MHz
(XEN) xstate: size: 0x340 and states: 0x7
(XEN) arch/x86/cpu/mcheck/mce_intel.c:782: MCA Capability: firstbank 0, extended MCE MSR 0, BCAST, CMCI
(XEN) CPU0: Intel machine check reporting enabled
(XEN) Speculative mitigation facilities:
(XEN)   Hardware hints:
(XEN)   Hardware features:
(XEN)   Compiled-in support: INDIRECT_THUNK SHADOW_PAGING HARDEN_ARRAY HARDEN_BRANCH HARDEN_GUEST_ACCESS HARDEN_LOCK
(XEN)   Xen settings: BTI-Thunk: RETPOLINE, SPEC_CTRL: No, Other: BRANCH_HARDEN
(XEN)   L1TF: believed vulnerable, maxphysaddr L1D 46, CPUID 36, Safe address 1000000000
(XEN)   Support for HVM VMs: RSB EAGER_FPU
(XEN)   Support for PV VMs: EAGER_FPU
(XEN)   XPTI (64-bit PV only): Dom0 enabled, DomU enabled (without PCID)
(XEN)   PV L1TF shadowing: Dom0 disabled, DomU enabled
(XEN) Using scheduler: SMP Credit Scheduler rev2 (credit2)
(XEN) Initializing Credit2 scheduler
(XEN)  load_precision_shift: 18
(XEN)  load_window_shift: 30
(XEN)  underload_balance_tolerance: 0
(XEN)  overload_balance_tolerance: -3
(XEN)  runqueues arrangement: socket
(XEN)  cap enforcement granularity: 10ms
(XEN) load tracking window length 1073741824 ns
(XEN) Platform timer is 14.318MHz HPET
(XEN) Detected 3292.523 MHz processor.
(XEN) Freed 1024kB unused BSS memory
(XEN) alt table ffff82d04049b878 -> ffff82d0404b0568
(XEN) Intel VT-d iommu 0 supported page sizes: 4kB
(XEN) Intel VT-d Snoop Control not enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation enabled.
(XEN) Intel VT-d Interrupt Remapping enabled.
(XEN) Intel VT-d Posted Interrupt not enabled.
(XEN) Intel VT-d Shared EPT tables not enabled.
(XEN) I/O virtualisation enabled
(XEN)  - Dom0 mode: Relaxed
(XEN) Interrupt remapping enabled
(XEN) nr_sockets: 1
(XEN) Enabled directed EOI with ioapic_ack_old on!
(XEN) Enabling APIC mode.  Using 1 I/O APICs
(XEN) ENABLING IO-APIC IRQs
(XEN)  -> Using old ACK method
(XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
(XEN) TSC deadline timer enabled
(XEN) Allocated console ring of 64 KiB.
(XEN) mwait-idle: MWAIT substates: 0x1120
(XEN) mwait-idle: v0.4.1 model 0x3a
(XEN) mwait-idle: lapic_timer_reliable_states 0xffffffff
(XEN) VMX: Supported advanced features:
(XEN)  - APIC MMIO access virtualisation
(XEN)  - APIC TPR shadow
(XEN)  - Extended Page Tables (EPT)
(XEN)  - Virtual-Processor Identifiers (VPID)
(XEN)  - Virtual NMI
(XEN)  - MSR direct-access bitmap
(XEN)  - Unrestricted Guest
(XEN) HVM: ASIDs enabled.
(XEN) VMX: Disabling executable EPT superpages due to CVE-2018-12207
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging (HAP) detected
(XEN) HVM: HAP page sizes: 4kB, 2MB
(XEN) alt table ffff82d04049b878 -> ffff82d0404b0568
(XEN) Brought up 8 CPUs
(XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
(XEN) Initializing Credit2 scheduler
(XEN)  load_precision_shift: 18
(XEN)  load_window_shift: 30
(XEN)  underload_balance_tolerance: 0
(XEN)  overload_balance_tolerance: -3
(XEN)  runqueues arrangement: socket
(XEN)  cap enforcement granularity: 10ms
(XEN) load tracking window length 1073741824 ns
(XEN) Adding cpu 0 to runqueue 0
(XEN)  First cpu on runqueue, activating
(XEN) Adding cpu 1 to runqueue 0
(XEN) Adding cpu 2 to runqueue 0
(XEN) Adding cpu 3 to runqueue 0
(XEN) Adding cpu 4 to runqueue 0
(XEN) Adding cpu 5 to runqueue 0
(XEN) Adding cpu 6 to runqueue 0
(XEN) Adding cpu 7 to runqueue 0
(XEN) mcheck_poll: Machine check polling timer started.
(XEN) Running stub recovery selftests...
(XEN) Fixup #UD[0000]: ffff82d07fffe044 [ffff82d07fffe044] -> ffff82d04038ad66
(XEN) Fixup #GP[0000]: ffff82d07fffe045 [ffff82d07fffe045] -> ffff82d04038ad66
(XEN) Fixup #SS[0000]: ffff82d07fffe044 [ffff82d07fffe044] -> ffff82d04038ad66
(XEN) Fixup #BP[0000]: ffff82d07fffe045 [ffff82d07fffe045] -> ffff82d04038ad66
(XEN) arch/x86/time.c:1291: CMOS aliased at 74, index r/w
(XEN) NX (Execute Disable) protection active
(XEN) Dom0 has maximum 856 PIRQs
(XEN) *** Building a PVH Dom0 ***
(XEN) [VT-D]d0:Hostbridge: skip 0000:00:00.0 map
(XEN) Bogus DMIBAR 0xfed18001 on 0000:00:00.0
(XEN) [VT-D]d0:PCI: map 0000:00:14.0
(XEN) [VT-D]d0:PCI: map 0000:00:16.0
(XEN) [VT-D]d0:PCI: map 0000:00:1a.0
(XEN) [VT-D]d0:PCIe: map 0000:00:1b.0
(XEN) [VT-D]d0:PCI: map 0000:00:1d.0
(XEN) [VT-D]d0:PCI: map 0000:00:1f.0
(XEN) [VT-D]d0:PCI: map 0000:00:1f.2
(XEN) [VT-D]d0:PCI: map 0000:00:1f.3
(XEN) [VT-D]d0:PCIe: map 0000:02:00.0
(XEN) [VT-D]d0:PCIe: map 0000:05:00.0
(XEN) [VT-D]d0:PCIe: map 0000:06:00.0
(XEN) [VT-D]iommu_enable_translation: iommu->reg = ffff82c000205000
(XEN) Dom0 memory allocation stats:
(XEN) order  0 allocations: 4
(XEN) order  1 allocations: 2
(XEN) order  2 allocations: 4
(XEN) order  3 allocations: 5
(XEN) order  4 allocations: 2
(XEN) order  5 allocations: 3
(XEN) order  6 allocations: 3
(XEN) order  7 allocations: 3
(XEN) order  8 allocations: 3
(XEN) order  9 allocations: 1
(XEN) order 10 allocations: 4
(XEN) order 11 allocations: 3
(XEN) order 12 allocations: 3
(XEN) order 13 allocations: 1
(XEN) order 14 allocations: 2
(XEN) order 15 allocations: 2
(XEN) order 16 allocations: 2
(XEN) order 17 allocations: 2
(XEN) order 18 allocations: 14
(XEN) Dom0 memory map:
(XEN)  [0000000000000000, 000000000009cfff] (usable)
(XEN)  [000000000009d800, 000000000009ffff] (reserved)
(XEN)  [00000000000e0000, 00000000000fffff] (reserved)
(XEN)  [0000000000100000, 00000000dd9c9fff] (usable)
(XEN)  [00000000dd9ca000, 00000000dde99fff] (reserved)
(XEN)  [00000000dde9a000, 00000000dde9afff] (ACPI data)
(XEN)  [00000000dde9b000, 00000000ddfc4fff] (ACPI NVS)
(XEN)  [00000000ddfc5000, 00000000dea44fff] (reserved)
(XEN)  [00000000dea45000, 00000000dea45fff] (usable)
(XEN)  [00000000dea46000, 00000000dea88fff] (ACPI NVS)
(XEN)  [00000000dea89000, 00000000df46bfff] (usable)
(XEN)  [00000000df46c000, 00000000df7d7fff] (reserved)
(XEN)  [00000000df7d8000, 00000000df7ffebb] (usable)
(XEN)  [00000000df7ffebc, 00000000df7fff87] (ACPI data)
(XEN)  [00000000f8000000, 00000000fbffffff] (reserved)
(XEN)  [00000000fec00000, 00000000fec00fff] (reserved)
(XEN)  [00000000fed00000, 00000000fed03fff] (reserved)
(XEN)  [00000000fed1c000, 00000000fed1ffff] (reserved)
(XEN)  [00000000fee00000, 00000000fee00fff] (reserved)
(XEN)  [00000000ff000000, 00000000ffffffff] (reserved)
(XEN)  [0000000100000000, 0000000421c8cfff] (usable)
(XEN)  [0000000421c8d000, 000000081effffff] (unusable)
(XEN) WARNING: PVH is an experimental mode with limited functionality
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Scrubbing Free RAM in background
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) ***************************************************
(XEN) Booted on L1TF-vulnerable hardware with SMT/Hyperthreading
(XEN) enabled.  Please assess your configuration and choose an
(XEN) explicit 'smt=<bool>' setting.  See XSA-273.
(XEN) ***************************************************
(XEN) Booted on MLPDS/MFBDS-vulnerable hardware with SMT/Hyperthreading
(XEN) enabled.  Mitigations will not be fully effective.  Please
(XEN) choose an explicit smt=<bool> setting.  See XSA-297.
(XEN) ***************************************************
(XEN) 3... 2... 1... 
(XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
(XEN) Freed 672kB init memory
unable to add kenv smbios.chassis.tag=To Be Filled By O.E.M.
unable to add kenv smbios.chassis.type=Desktop
unable to add kenv smbios.chassis.version=To Be Filled By O.E.M.
unable to add kenv smbios.planar.location=To be filled by O.E.M.
unable to add kenv smbios.planar.maker=Gigabyte Technology Co., Ltd.
unable to add kenv smbios.planar.product=H77-D3H
unable to add kenv smbios.planar.serial=To be filled by O.E.M.
unable to add kenv smbios.planar.tag=To be filled by O.E.M.
unable to add kenv smbios.planar.version=x.x
unable to add kenv smbios.socket.enabled=1
unable to add kenv smbios.socket.populated=1
unable to add kenv smbios.system.family=To be filled by O.E.M.
unable to add kenv smbios.system.maker=Gigabyte Technology Co., Ltd.
unable to add kenv smbios.system.product=To be filled by O.E.M.
unable to add kenv smbios.system.serial=To be filled by O.E.M.
unable to add kenv smbios.system.sku=To be filled by O.E.M.
unable to add kenv smbios.system.uuid=032b0290-0434-0518-d206-b80700080009
unable to add kenv smbios.system.version=To be filled by O.E.M.
unable to add kenv splash_bmp_load=NO
unable to add kenv splash_pcx_load=NO
unable to add kenv splash_txt_load=NO
unable to add kenv twiddle_divisor=16
unable to add kenv vbe_max_resolution=1080p
unable to add kenv verbose_loading=NO
unable to add kenv vesa_load=NO
unable to add kenv vfs.root.mountfrom=zfs:freebsd/ROOT/default
unable to add kenv vfs.zfs.arc_max=4G
unable to add kenv xen_cmdline=dom0_mem=16G,max:16G dom0=pvh,verbose console=com1 com1=115200,8n1 iommu=debug guest_loglvl=all loglvl=all
unable to add kenv xen_kernel=/boot/xen-debug
unable to add kenv zfs_be_active=zfs:freebsd/ROOT/default
unable to add kenv zfs_be_currpage=1
unable to add kenv zfs_be_root=freebsd/ROOT
unable to add kenv zfs_load=YES
Video console type unsupported
---<<BOOT>>---
APIC: Using the MADT enumerator.
Copyright (c) 1992-2023 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 14.1-RELEASE releng/14.1-n267679-10e31f0946d8 GENERIC amd64
FreeBSD clang version 18.1.5 (https://github.com/llvm/llvm-project.git llvmorg-18.1.5-0-g617a15a9eac9)
PPIM 0: PA=0xb8000, VA=0xffffffff83c10000, size=0x8000, mode=0
pmap: large map 8 PML4 slots (4096 GB)
VT(vga): text 80x25
XEN: Hypervisor version 4.18 detected.
(XEN) d0v0: upcall vector 93
Preloaded elf multiboot kernel "/boot/xen-debug" at 0xffffffff8358d000.
Preloaded elf kernel "/boot/kernel/kernel" at 0xffffffff8358d1b8.
Preloaded elf obj module "/boot/kernel/zfs.ko" at 0xffffffff83595360.
Preloaded elf obj module "/boot/kernel/cryptodev.ko" at 0xffffffff83595bc8.
Preloaded boot_entropy_cache "/boot/entropy" at 0xffffffff835963b8.
Preloaded cpu_microcode "/boot/firmware/intel-ucode.bin" at 0xffffffff83596410.
Preloaded hostuuid "/etc/hostid" at 0xffffffff83596470.
Preloaded elf obj module "/boot/kernel/fusefs.ko" at 0xffffffff835964c0.
Preloaded TSLOG data "TSLOG" at 0xffffffff83596d28.
CPU microcode: no matching update found
CPU: Intel(R) Xeon(R) CPU E3-1230 V2 @ 3.30GHz (3292.52-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x306a9  Family=0x6  Model=0x3a  Stepping=9
  Features=0x1fc3fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT>
  Features2=0xbfba2203<SSE3,PCLMULQDQ,SSSE3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,HV>
  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  Structured Extended Features=0x281<FSGSBASE,SMEP,ERMS>
  Structured Extended Features3=0x20000000<ARCH_CAP>
  XSAVE Features=0x1<XSAVEOPT>
  IA32_ARCH_CAPS=0xc000000
  AMD Extended Feature Extensions ID EBX=0x100000
  TSC: P-state invariant
Data TLB0: 2-MByte or 4 MByte pages, 4-way set associative, 32 entries
Data TLB: 4 KB pages, 4-way set associative, 64 entries
Instruction TLB: 2M/4M pages, fully associative, 8 entries
Instruction TLB: 4KByte pages, 4-way set associative, 64 entries
64-Byte prefetching
Shared 2nd-Level TLB: 4 KByte pages, 4-way associative, 512 entries
L2 cache: 256 kbytes, 8-way associative, 64 bytes/line
Hypervisor: Origin = "XenVMMXenVMM"
real memory  = 17746677760 (16924 MB)
Physical memory chunk(s):
0x0000000000001000 - 0x000000000009cfff, 638976 bytes (156 pages)
0x0000000000100000 - 0x00000000001fffff, 1048576 bytes (256 pages)
0x0000000003801000 - 0x00000000dd9c9fff, 3659304960 bytes (893385 pages)
0x00000000dea45000 - 0x00000000dea45fff, 4096 bytes (1 pages)
0x00000000dea89000 - 0x00000000df46bfff, 10366976 bytes (2531 pages)
0x00000000df7d8000 - 0x00000000df7fefff, 159744 bytes (39 pages)
0x0000000100001000 - 0x0000000406ab4fff, 12996788224 bytes (3173044 pages)
0x0000000421a00000 - 0x0000000421bf4fff, 2052096 bytes (501 pages)
avail memory = 16616108032 (15846 MB)
MADT: Found CPU APIC ID 0 ACPI ID 1: enabled
SMP: Added CPU 0 (AP)
MADT: Found CPU APIC ID 2 ACPI ID 2: enabled
SMP: Added CPU 2 (AP)
MADT: Found CPU APIC ID 4 ACPI ID 3: enabled
SMP: Added CPU 4 (AP)
MADT: Found CPU APIC ID 6 ACPI ID 4: enabled
SMP: Added CPU 6 (AP)
MADT: Found CPU APIC ID 1 ACPI ID 5: enabled
SMP: Added CPU 1 (AP)
MADT: Found CPU APIC ID 3 ACPI ID 6: enabled
SMP: Added CPU 3 (AP)
MADT: Found CPU APIC ID 5 ACPI ID 7: enabled
SMP: Added CPU 5 (AP)
MADT: Found CPU APIC ID 7 ACPI ID 8: enabled
SMP: Added CPU 7 (AP)
Event timer "LAPIC" quality 100
ACPI APIC Table: <ALASKA A M I>
Package ID shift: 4
L3 cache ID shift: 4
L2 cache ID shift: 1
L1 cache ID shift: 1
Core ID shift: 1
AP boot address 0x1000
panic: AP #1 (PHY# 1) failed!
cpuid = 0
time = 1
KDB: stack backtrace:
#0 0xffffffff80b7fbfd at kdb_backtrace+0x5d
#1 0xffffffff80b32961 at vpanic+0x131
#2 0xffffffff80b32823 at panic+0x43
#3 0xffffffff80fe22c2 at start_all_aps+0x592
#4 0xffffffff80fe1d23 at cpu_mp_start+0x1a3
#5 0xffffffff80b936de at topo_analyze+0x42e
#6 0xffffffff80abb425 at mi_startup+0xb5
#7 0xffffffff80fde0bc at xen_start32+0xbc
Uptime: 1s
Rebooting...
(XEN) Error: INIT received - ignoring
Comment 22 Marian Arlt 2024-07-09 16:38:50 UTC
For comparison here's a log of the same setup, but working, with FreeBSD 13.3-RELEASE-p4 and xen-kernel-4.18.0.20240201 from pkg repos with the exact same xen_cmdline:

https://www.dropbox.com/scl/fi/s9gpfble925n4fpn5dv95/20240709_FreeBSD-13.3-RELEASE-p3_Working_Xen-Debug_Serial-Log.txt?rlkey=8dx0qerv84ih3wp8ao84c42jf&st=m405zmby&dl=0
Comment 23 Roger Pau Monné freebsd_committer freebsd_triage 2024-07-11 13:09:39 UTC
Thanks for all the information.  I've attempted to reproduce on my end, but so far I haven't been able to.  I also no longer have a setup that boots from BIOS (not sure whether this is relevant for reproducing the issue).

I've realized the version of LLVM I'm using is 17.0.6, which is older than the one used to build the xen-kernel you seem to be having issues with.  I've uploaded a build of Xen using my current toolchain, which works for me.  Is there any chance you could give it a try and post the serial output?  It's just a matter of replacing your /boot/xen-debug binary with this one:

https://people.freebsd.org/~royger/xen-debug

In the meantime I will attempt to update my LLVM compiler and toolchain to 18.1.5 and see if I can reproduce.

Roger.
Comment 24 Marian Arlt 2024-07-11 19:29:45 UTC
Truth be told, I'd much prefer to boot via UEFI, but everywhere I look, guides and docs still say that Xen on FreeBSD is supposedly BIOS only. This is definitely on my bucket list, but it needs some planning ahead, which I'm currently not really excited about, especially since it was working just fine before the upgrade to 14.x.

For now I did as you proposed. The following log is with latest 14.1-RELEASE-p2 booting your linked debug-kernel. Absolutely no other changes from the above config.

https://www.dropbox.com/scl/fi/13r5oichz29p0ge6mb9b6/20240711_FreeBSD-14.1-RELEASE_Xen_Panic_RogerDebugKernel.log?rlkey=ylo3sjgps1fjkbgafrx415cqw&st=sxbnsx11&dl=0
Comment 25 Roger Pau Monné freebsd_committer freebsd_triage 2024-07-12 15:09:04 UTC
(In reply to Marian Arlt from comment #24)
Oh, I forgot to update the handbook after adding support for UEFI booting in 14.  UEFI booting should be working now (in fact that's what I use on my systems currently).

The "unable to add kenv" messages make me think there's some issue with the handling of the boot metadata that the FreeBSD Xen entry point performs when booted from BIOS instead of UEFI.  I will attempt to find and set up a system with BIOS boot early next week to see if I can reproduce your issues.

Could you also add the following to /boot/loader.conf:

boot_verbose="YES"

And "console_timestamps=boot" to your xen_cmdline.

In the meantime I've built a Xen kernel with extra debug prints related to CPU bringup, maybe that can shed some light:

https://people.freebsd.org/~royger/xen-debug
sha256 ee23e1dd3008117feb0cd872d23d5e4bc4795e33320d05398e83cffba97869ab 
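
Before swapping in a downloaded hypervisor binary, it may be worth checking it against the digest published above. A minimal Python sketch, assuming the file was saved locally as `xen-debug`; the helper names `sha256_of` and `matches_expected` are illustrative and not part of any FreeBSD or Xen tooling:

```python
import hashlib

# Digest published in this comment for the xen-debug build.
EXPECTED = "ee23e1dd3008117feb0cd872d23d5e4bc4795e33320d05398e83cffba97869ab"

def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_expected(path, expected=EXPECTED):
    """True when the file's digest equals the expected hex string."""
    return sha256_of(path) == expected.strip().lower()
```

Calling `matches_expected("xen-debug")` on the downloaded file is meant to return `True` only for an intact copy of that particular build; running `sha256 xen-debug` from the FreeBSD base system and comparing by eye serves the same purpose.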

I'm sorry you are in this situation.  I might not be able to make much progress during the weekend, but I will get back to it on Monday.  Thanks for providing all this output.
Comment 26 Marian Arlt 2024-07-14 13:00:30 UTC
On the contrary, I appreciate your continued assistance. I'm a huge fan of baremetal Xen on FreeBSD. If this turns out to be more than one individual case I'd be glad to help. Otherwise I'd be sorry to waste your time. Providing and encouraging UEFI boot going into the future is great news.

All former captures were already using "boot_verbose=YES" in /boot/loader.conf
I added timestamps to xen_cmdline as requested and used the kernel you provided. I am still trying to figure out what the best method is to share the logs here. They'd totally annoy the hell out of me if I pasted them directly to this thread.

I made a Jumpshare account for this one. FreeBSD 14.1-RELEASE-p2 booting xen-debug kernel by Roger Pau Monné with additional "debug prints related to CPU bringup", high verbosity for FreeBSD and Xen boot, and timestamps for Xen boot log, over serial:
https://jumpshare.com/v/8Pd2sRzU5EMxFzvHc6DO
Comment 27 Roger Pau Monné freebsd_committer freebsd_triage 2024-07-15 16:18:51 UTC
Thanks, that log doesn't show any of the added messages, which is concerning because the added messages are in the paths used inside of Xen to handle the IPIs related to CPU bringup.

Can you try to boot adding dom0_max_vcpus=1 to the xen_cmdline option in loader.conf and paste the log here?

I've ordered an adapter I was missing to bring back one of my old Intel test boxes, as what I currently use for testing can only be booted in UEFI mode.

In the meantime I've attempted to reproduce with QEMU but I haven't been able to, so I assume there's something hardware dependent.
Comment 28 Marian Arlt 2024-07-16 17:18:03 UTC
To summarize this run:
Legacy BIOS booting FreeBSD 14.1-RELEASE-p2 on an Intel Xeon E3-1230V2 socketed in a Gigabyte H77-D3H board. Using the last provided debug Xen kernel with additional CPU bringup output from Roger. Logging output over serial.

Relevant settings in /boot/loader.conf
console="comconsole"
boot_serial="YES"
boot_verbose="YES"
comconsole_speed="115200"
cpu_microcode_load="YES"
cpu_microcode_name="/boot/firmware/intel-ucode.bin"
if_tap_load="YES"
vfs.zfs.arc_max="4G"
xen_kernel="/boot/xen-debug"
xen_cmdline="dom0_mem=16G,max:16G dom0_max_vcpus=1 dom0=pvh,verbose console=com1 com1=115200,8n1 iommu=debug guest_loglvl=all loglvl=all console_timestamps=boot"

Relevant settings in /etc/rc.conf
rc_info="YES"
rc_debug="YES"
xencommons_enable="YES"

Relevant settings in /etc/sysctl.conf
vm.max_user_wired=-1

Relevant settings in /boot.config
-h -S115200

Resulting panic/log: https://codefile.io/f/lq97fJI7s8

I did some further experiments. I installed a fresh image of the latest official 14.1-RELEASE ISO on a spare SSD using UEFI/GPT. Made sure it booted fine. Installed xen-kernel and xen-tools from pkg. Configured the usual recommended parameters that are printed when issuing pkg info -D xen-kernel. Rebooted. This also caused a panic! And a very early one at that:

Loading /boot/loader.conf.local
Loading Xen kernel...
/boot/xen-debug data=0x2d6754+0x1328ac
Loading kernel...
/boot/kernel/kernel size=0x1bc46b0
Loading configured modules...
/boot/kernel/zfs.ko size 0x5cd608 at 0x21cf000
/etc/hostid size=0x25
/boot/entropy size=0x1000
/boot/firmware/intel-ucode.bin size=0xc60400
/boot/kernel/cryptodev.ko size 0x77d8 at 0x33fe000

Hit [Enter] to boot immediately, or any other key for command prompt.
Booting [/boot/kernel/kernel]...
EFI framebuffer information:
addr, size     0xe0000000, 0x300000
dimensions     1024 x 768
stride         1024
masks          0x00ff0000, 0x0000ff00, 0x000000ff, 0xff000000
ERROR: Type:2; Severity:90; Class:3; Subclass:5; Operation: 100D

The ERROR can be decoded further as:
Type 2 = Error Code (1 being Progress and 3 being Debug)
Severity 90 = Unrecovered
Class 3 = Software
Subclass 5 = DXE Boot Driver (based on Class 3)
Operation 100D = ?
I could not read from what little information I found what the operation is supposed to mean.
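
The manual decoding above can be sketched as a small Python helper. This is only an illustration: the lookup tables contain just the values named in this comment, the field values are assumed to be printed in hexadecimal (as `100D` suggests), and `decode_status_line` is a hypothetical name rather than part of any firmware tooling. The operation code is passed through undecoded.

```python
import re

# Lookup tables limited to the values decoded in the comment above;
# anything else is reported as "unknown" rather than guessed.
TYPES = {0x1: "Progress", 0x2: "Error Code", 0x3: "Debug"}
SEVERITIES = {0x90: "Unrecovered"}
CLASSES = {0x3: "Software"}
SUBCLASSES = {(0x3, 0x5): "DXE Boot Driver"}  # keyed by (class, subclass)

LINE_RE = re.compile(
    r"ERROR:\s*Type:\s*(?P<type>[0-9A-Fa-f]+);\s*"
    r"Severity:\s*(?P<severity>[0-9A-Fa-f]+);\s*"
    r"Class:\s*(?P<cls>[0-9A-Fa-f]+);\s*"
    r"Subclass:\s*(?P<subclass>[0-9A-Fa-f]+);\s*"
    r"Operation:\s*(?P<operation>[0-9A-Fa-f]+)"
)

def decode_status_line(line):
    """Split a firmware 'ERROR: Type:..' line into named fields."""
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError("not a firmware status code line")
    # Assumption: every field is printed as a hexadecimal number.
    f = {k: int(v, 16) for k, v in m.groupdict().items()}
    return {
        "type": TYPES.get(f["type"], "unknown"),
        "severity": SEVERITIES.get(f["severity"], "unknown"),
        "class": CLASSES.get(f["cls"], "unknown"),
        "subclass": SUBCLASSES.get((f["cls"], f["subclass"]), "unknown"),
        "operation": hex(f["operation"]),  # left undecoded on purpose
    }
```

Keying the subclass table by the (class, subclass) pair reflects the note above that the subclass meaning depends on the class.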

The funny thing is that I then swapped this disk and took it to my workstation, where I run an i5-12600K socketed into an MSI PRO-Z690A board. And the disk booted just fine, using UEFI Xen with default settings on 14.1-RELEASE. Xen tools were usable and everything looked good!

The not so funny thing is that I then tried to swap my server's SSD too (the 14.1 BIOS Xen I'm discussing here), activated CSM for legacy support in the MSI board and tried to boot this on the i5-12600K, which brought up the same panic I've been logging this whole time on the Xeon machine. I tried to play around with settings in the EFI and also the Xen settings, both to no avail.

That's a little side information. I'm not sure whether it's useful for the issue, but I thought I'd mention it.
Comment 29 Roger Pau Monné freebsd_committer freebsd_triage 2024-07-18 09:47:35 UTC
Created attachment 252140 [details]
Fix candidate for 14

Possible fix rebased on stable/14
Comment 30 Roger Pau Monné freebsd_committer freebsd_triage 2024-07-18 09:52:04 UTC
Hello,

Thanks again for all the information, this has been very helpful.  In fact, I have a patch candidate for you to try.  You will have to apply the patch to the FreeBSD kernel sources and rebuild the kernel.  The easiest way for you to test this would be to fetch the sources for the stable/14 branch, apply the patch, execute `make -jX kernel` and reboot.

Regarding the error that you get when booting from UEFI: Xen itself is a bit strict with UEFI implementations, and with this hardware being from ~2015 I assume the implementation might not be the best one.  If you want we can debug the UEFI issue further, but it would be good if we can first confirm that booting from BIOS is back to a working state.

Thanks, Roger.
Comment 31 Marian Arlt 2024-07-18 18:22:33 UTC
Roger you're a genius. I can confirm that the proposed patch resolved my issue.

I did as requested:
  - checked out STABLE and applied the proposed changes
  - compiled world and kernel according to handbook
  - rebooted without xen enabled to make sure the system would boot regularly
  - rebooted with your last xen-debug kernel and got all the way to a login prompt
    - at this point the dom0 was broken though, since it couldn't be named
    - log of this run found here: https://codefile.io/f/L3DBkAFdFi
    - see line 2168 of that log or search for "error"
    - this made xl inoperable
  - rebooted with the original xen-debug kernel from pkg repos and everything was fine
    - log of this run found here: https://codefile.io/f/T821TTCUVu
  - rebooted with the original xen kernel from pkg repos and everything was fine

I was a bit confused at first since all the (XEN) output seemed to not send new lines or something, so I ended up with what looked like a frozen console, but realizing it's serial, I just hit ENTER and got a prompt. All good.

I am now currently running 14.1-STABLE custom kernel with your patch and xen-kernel-4.18.2.20240411. DomU's all up and running swell.

Thank you so much for having the patience with me AND providing the solution. I'm positive this might benefit other folks salvaging decent hardware too.

I suppose I will have to stick with STABLE for now and keep track of RELEASE change logs?
Comment 32 Marian Arlt 2024-07-18 21:13:45 UTC
Well. I got excited too fast. The system is booting reliably, but I spoke too soon when I said that all DomU guests work well. I have very mixed experiences. Debian 12 will get stuck after GRUB, Windows Server 2022 will boot infinitely and Ubuntu 22 LTS starts fine. But I also swapped some xen-kernel files at this point. Do you have instructions on which combination of Xen kernel and tools I need to use with your specific patch? Should I re-compile xen from ports at this point to be on the safe side?
Comment 33 Marian Arlt 2024-07-18 22:16:51 UTC
Removed current xen kernel and tools, pulled latest git changes for /usr/ports, did a "pkg-static upgrade -f", rebuilt xen-kernel and xen-tools from ports (xen-tools had a build error for wrong python file names in 3 files [fixed by deleting a single "d" character from those files]).
Rebooted and for now this looks good, also after rebooting several times. The mentioned machines (DomU) boot fine and work like they did in 13.
Comment 34 Roger Pau Monné freebsd_committer freebsd_triage 2024-07-19 07:19:31 UTC
(In reply to Marian Arlt from comment #33)
Hello,

I think you just ended up with a mismatch between the Xen kernel and the Xen tools you had installed, and that likely caused the guest creation/runtime issues.

Glad to see it's solved now, using the binary packages from pkg should also work fine.  I have to update the ports to catch up with the last security fixes.

Will submit the patch I provided you for review, and backport it to the stable/14 branch.  Hopefully it should be present in 14.2.

Thanks for your help in debugging this.
Comment 35 commit-hook freebsd_committer freebsd_triage 2024-08-02 10:43:26 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=1b5e5ff68c2bafae55a5adab42f22039bbbce59c

commit 1b5e5ff68c2bafae55a5adab42f22039bbbce59c
Author:     Roger Pau Monné <royger@FreeBSD.org>
AuthorDate: 2024-07-18 08:14:28 +0000
Commit:     Roger Pau Monné <royger@FreeBSD.org>
CommitDate: 2024-08-02 10:41:52 +0000

    xen/pvh: fix initialization of environment

    Xen PVH entry point requires to modify the environment provided by the boot
    loader, so that the ACPI RSDP is re-written to use the Xen generated RSDP
    instead of the native one.

    The current logic in the PVH entry point reserves a single page (4K) in order
    to copy the contents of the environment passed from the boot loader, so that
    the bootloader provided "acpi.rsdp" is dropped and a Xen specific one is added
    afterwards.

    This however doesn't scale well, as it's possible for the environment to be
    bigger than 4K.  Bumping the buffer, or attempting to peek at the size of the
    metadata all seem to just add more complexity to a sensitive path.  Instead
    introduce a new ACPI hook that allows setting the RSDP address directly, and
    use it from the PVH entry point to set the position of the Xen generated RSDP.

    This allows to reduce the logic in the PVH metadata processing, as there's no
    need to parse and filter the bootloader provided environment.

    Note that modifying the environment blob in-place is likely to not work.  The
    RSDP address is provided as a string, it's possible the new RSDP location is
    higher than the current one, and the string with the new location would overrun
    the space used by the previous one.

    Sponsored by: Cloud Software Group
    PR: 277200
    MFC: 3 days
    Reviewed by: markj kib
    Differential revision: https://reviews.freebsd.org/D46089

 sys/x86/acpica/OsdEnvironment.c  | 12 +++++--
 sys/x86/include/acpica_machdep.h |  2 ++
 sys/x86/xen/pv.c                 | 74 +++++++---------------------------------
 3 files changed, 25 insertions(+), 63 deletions(-)
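
The commit message's last technical point, that rewriting the RSDP string in place is likely to fail when the new address needs more characters, can be illustrated with a small model. This is a Python sketch rather than the kernel's C code: it assumes the environment is a run of NUL-terminated `name=value` strings ended by an empty string, the address values are hypothetical, and `parse_env` and `naive_overwrite` are illustrative names only.

```python
def parse_env(blob):
    """Split a NUL-separated blob into a dict, stopping at the empty string."""
    env = {}
    for entry in bytes(blob).split(b"\0"):
        if not entry:
            break  # an empty string marks the end of the environment
        name, _, value = entry.partition(b"=")
        env[name.decode()] = value.decode()
    return env

def naive_overwrite(blob, name, new_value):
    """Write 'name=new_value' plus NUL over the old entry, moving nothing.

    Sketch only: it assumes `name=` first occurs at the start of its entry
    and that the write stays inside the blob.
    """
    out = bytearray(blob)
    start = out.index(name.encode() + b"=")
    patched = f"{name}={new_value}".encode() + b"\0"
    # Equal-length slice assignment: bytes after the old entry are clobbered
    # whenever the new entry is longer than the old one.
    out[start:start + len(patched)] = patched
    return out

# Hypothetical loader-provided environment (addresses are made up).
BLOB = (b"acpi.rsdp=0x000f0490\0"
        b"vfs.root.mountfrom=zfs:freebsd/ROOT/default\0"
        b"\0")

same_len = parse_env(naive_overwrite(BLOB, "acpi.rsdp", "0x000fc000"))
longer = parse_env(naive_overwrite(BLOB, "acpi.rsdp", "0x00000000dde9a000"))
```

In this model `same_len` still contains `vfs.root.mountfrom`, while in `longer` the extra eight characters have overwritten the start of the following entry, so that variable is no longer found under its name. Setting the RSDP address through a dedicated hook, as the commit does, sidesteps the string layout entirely.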