The patch "physmem: add ram0 pseudo-driver" (commit e6cf1a0826c9d7f229e41224ec7b783501636528) causes a panic when booting arm64 FreeBSD on Hyper-V. The console output leading up to the panic, including the stack backtrace, looks like this:

Booting [/boot/kernel/kernel]...
|No valid device tree blob found!
WARNING! Trying to fire up the kernel, but no device tree blob found!
EFI framebuffer information:
addr, size 0x40000000, 0x800000
dimensions 1024 x 768
stride 1024
masks 0x00ff0000, 0x0000ff00, 0x000000ff, 0xff000000
---<<BOOT>>---
GDB: no debug ports present
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2023 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 14.0-CURRENT #13 main-n260755-e6cf1a0826c9-dirty: Mon Mar 6 10:21:42 UTC 2023
root@fbsd13-nvme-test:/data/ws/obj/data/ws/main/arm64.aarch64/sys/GENERIC arm64
FreeBSD clang version 15.0.7 (https://github.com/llvm/llvm-project.git llvmorg-15.0.7-0-g8dfdcc7b7bf6)
WARNING: WITNESS option enabled, expect reduced performance.
SRAT: Ignoring memory at addr 0x1c0000000
SRAT: Ignoring memory at addr 0x1000000000
SRAT: Ignoring memory at addr 0x10000000000
SRAT: Ignoring memory at addr 0x20000000000
SRAT: Ignoring memory at addr 0x40000000000
SRAT: Ignoring memory at addr 0x80000000000
SRAT: Ignoring memory at addr 0x100000000000
SRAT: Ignoring memory at addr 0x200000000000
SRAT: Ignoring memory at addr 0x400000000000
SRAT: Ignoring memory at addr 0x800000000000
VT(efifb): resolution 1024x768
module scmi already present!
module firmware already present!
real memory = 4294799360 (4095 MB)
avail memory = 4155047936 (3962 MB)
Starting CPU 1 (1)
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
arc4random: WARNING: initial seeding bypassed the cryptographic random device because it was not yet seeded and the knob 'bypass_before_seeding' was enabled.
random: entropy device external interface
MAP 3ec84000 mode 2 pages 41
MAP 3fd2d000 mode 2 pages 48
MAP 3fd5d000 mode 2 pages 36
MAP effed000 mode 0 pages 1
kbd0 at kbdmux0
acpi0: <VRTUAL MICROSFT>
acpi0: Could not update all GPEs: AE_NOT_CONFIGURED
psci0: <ARM Power State Co-ordination Interface Driver> on acpi0
gic0: <ARM Generic Interrupt Controller v3.0> iomem 0xffff0000-0x10000ffff,0xeffee000-0xf000dfff,0xf000e000-0xf002dfff on acpi0
generic_timer0: <ARM Generic Timer> irq 4,5,6 on acpi0
Timecounter "ARM MPCore Timecounter" frequency 25000000 Hz quality 1000
Event timer "ARM MPCore Eventtimer" frequency 25000000 Hz quality 1000
efirtc0: <EFI Realtime Clock>
efirtc0: registered as a time-of-day clock, resolution 1.000000s
panic: ram_attach: resource 7 failed to attach
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x13c
panic() at panic+0x44
ram_attach() at ram_attach+0x1ac
device_attach() at device_attach+0x3f8
device_probe_and_attach() at device_probe_and_attach+0x7c
bus_generic_new_pass() at bus_generic_new_pass+0xfc
bus_generic_new_pass() at bus_generic_new_pass+0xac
bus_set_pass() at bus_set_pass+0x4c
mi_startup() at mi_startup+0x1fc
virtdone() at virtdone+0x6c
KDB: enter: panic
[ thread pid 0 tid 100000 ]
Stopped at kdb_enter+0x44: undefined f906c27f
db>
Can you share a boot -v (boot_verbose=YES in loader.conf/loader) dmesg, please?
(In reply to Kyle Evans from comment #1) This, as well as the output of "show all rman" at the ddb prompt, if possible.
There was a later fix that likely should be tested to see if it avoids the problem:
Wed, 15 Mar 2023
. . .
git: 8937bd37d07c - main - arm64: limit EFI excluded regions to physical memory types Mitchell Horne
(In reply to Mark Millard from comment #3) Yes, I was debugging with someone else who already had the follow-up
Created attachment 241081
Console output, 3/23 CURRENT, Azure Standard D48ps v5, no boot verbose
(In reply to Mark Millard from comment #3) The problem (panic) still exists on the 3/23 CURRENT build. See the attached serial console log with 'show all rman' output. I will upload one with boot verbose enabled shortly.
Created attachment 241087
Console output, 3/23 CURRENT, Azure Standard D48ps v5, boot verbose with show all rman

This is the one with verbose boot.
(In reply to Wei Hu from comment #8) Looks to me like the following might be classified as conflicting because of the overlapping ranges:
gic0: <ARM Generic Interrupt Controller v3.0> iomem 0xffff0000-0x10000ffff
ram0: reserving memory region: 0x100000000-0xfc0000000
(Not that I'm an expert on these issues.)
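The suspected conflict can be checked with a little arithmetic. The following is only a sketch in Python, not kernel code: the two ranges are copied from the console output quoted above, the helper name `overlap` is made up for illustration, and treating the end of the ram0 region as inclusive is an assumption (it does not change the result, since only the start of that region is involved).

```python
# Ranges copied from the console output quoted in this thread.
GIC0_DIST = (0xFFFF0000, 0x10000FFFF)     # first gic0 iomem range, inclusive end
RAM0_REGION = (0x100000000, 0xFC0000000)  # ram0 region; inclusive end is an assumption


def overlap(a, b):
    """Return the inclusive range shared by a and b, or None if they are disjoint."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start <= end else None


shared = overlap(GIC0_DIST, RAM0_REGION)
if shared is None:
    print("no overlap")
else:
    size = shared[1] - shared[0] + 1
    print(f"overlap {shared[0]:#x}-{shared[1]:#x} ({size:#x} bytes)")
```

With the values above, the shared span is 0x100000000-0x10000ffff, that is, the top 0x10000 bytes of the gic0 range fall on the start of the ram0 region.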
(In reply to Wei Hu from comment #8) Thank you very much. I believe I was able to identify the problem based on the overlapping ranges of gic0 and physmem. I posted a review to address this. Please test the change if you are able, and let me know if it solves the problem. https://reviews.freebsd.org/D39260 Mitchell
(In reply to Mitchell Horne from comment #10) I tested this patch on the same VM in Azure. It fixes the problem. Thanks for fixing it.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=3462c371c2562a8144f4245f9967df99874e505f

commit 3462c371c2562a8144f4245f9967df99874e505f
Author: Mitchell Horne <mhorne@FreeBSD.org>
AuthorDate: 2023-03-31 15:32:39 +0000
Commit: Mitchell Horne <mhorne@FreeBSD.org>
CommitDate: 2023-03-31 16:26:22 +0000

arm64/gicv3: correct the size of the distributor resource

Use the GICD_SIZE macro (0x10000), which is half the size of the current fixed-sized mapping (128 * 1024 == 0x20000).

In ARM64 Hyper-V instances, it seems the Distributor's registers are located immediately preceding a range of physical memory in the bus address space. Thus, when ram0 is attaching and attempts to reserve SYS_RES_MEMORY resources corresponding to its physmem ranges, it fails, because the first 0x10000 bytes of this range are already owned by gic0.

PR: 270415
Reported by: whu
Tested by: whu
Differential Revision: https://reviews.freebsd.org/D39260

sys/arm64/arm64/gic_v3_acpi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
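To illustrate why halving the distributor size lets ram0 attach, here is a toy model of exclusive range reservation. It is a sketch under stated assumptions, not FreeBSD's rman(9) code: `ToyRangeManager`, `RangeConflict`, `boot`, and the `RAM_SIZE` value are all hypothetical names and values chosen for the example. Only the distributor base, the start of the RAM region, and the two sizes come from this thread.

```python
class RangeConflict(Exception):
    """Raised when a requested range collides with an existing reservation."""


class ToyRangeManager:
    """Hypothetical stand-in for an exclusive address-range allocator."""

    def __init__(self):
        self.reserved = []  # entries are (start, inclusive_end, owner)

    def reserve(self, owner, start, size):
        end = start + size - 1
        for r_start, r_end, r_owner in self.reserved:
            # Two inclusive ranges intersect when each starts before the other ends.
            if start <= r_end and r_start <= end:
                raise RangeConflict(
                    f"{owner} {start:#x}-{end:#x} conflicts with {r_owner}")
        self.reserved.append((start, end, owner))
        return (start, end)


GICD_BASE = 0xFFFF0000   # distributor base, from the gic0 line in the log
RAM_START = 0x100000000  # start of the ram0 region named in the thread
RAM_SIZE = 0x1000        # arbitrary illustrative size, not taken from the log


def boot(gicd_size):
    """Reserve gic0 first, then ram0, mirroring the attach order in the log."""
    rm = ToyRangeManager()
    rm.reserve("gic0", GICD_BASE, gicd_size)
    try:
        rm.reserve("ram0", RAM_START, RAM_SIZE)
    except RangeConflict:
        return "ram0 failed to attach"
    return "ok"


print(boot(128 * 1024))  # old fixed-size mapping (0x20000)
print(boot(0x10000))     # GICD_SIZE, as described in the commit message
```

With the 0x20000 mapping the gic0 reservation ends at 0x10000ffff and the ram0 request is rejected; with 0x10000 it ends at 0xffffffff, immediately before the RAM region, and both reservations succeed.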
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=98ee3bb87a7a46b2ecae96159258632b8f6f3520

commit 98ee3bb87a7a46b2ecae96159258632b8f6f3520
Author: Mitchell Horne <mhorne@FreeBSD.org>
AuthorDate: 2023-03-31 15:32:39 +0000
Commit: Mitchell Horne <mhorne@FreeBSD.org>
CommitDate: 2023-04-12 17:08:38 +0000

arm64/gicv3: correct the size of the distributor resource

Use the GICD_SIZE macro (0x10000), which is half the size of the current fixed-sized mapping (128 * 1024 == 0x20000).

In ARM64 Hyper-V instances, it seems the Distributor's registers are located immediately preceding a range of physical memory in the bus address space. Thus, when ram0 is attaching and attempts to reserve SYS_RES_MEMORY resources corresponding to its physmem ranges, it fails, because the first 0x10000 bytes of this range are already owned by gic0.

PR: 270415
Reported by: whu
Tested by: whu
Differential Revision: https://reviews.freebsd.org/D39260

(cherry picked from commit 3462c371c2562a8144f4245f9967df99874e505f)

sys/arm64/arm64/gic_v3_acpi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)