Bug 270415 - ram0 pseudo-driver breaks ARM64 on Hyper-V on CURRENT
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: arm
Version: CURRENT
Hardware: arm64 Any
Importance: --- Affects Some People
Assignee: freebsd-arm (Nobody)
URL: https://reviews.freebsd.org/D39260
Keywords:
Depends on:
Blocks:
 
Reported: 2023-03-23 07:20 UTC by Wei Hu
Modified: 2023-04-12 17:11 UTC

Attachments
Console output, 3/23 CURRENT, Azure Standard D48ps v5, no boot verbose (9.03 KB, text/plain)
2023-03-24 08:44 UTC, Wei Hu
Console output, 3/23 CURRENT, Azure Standard D48ps v5, boot verbose with show all rman (20.68 KB, text/plain)
2023-03-24 14:13 UTC, Wei Hu

Description Wei Hu 2023-03-23 07:20:26 UTC
The patch "physmem: add ram0 pseudo-driver" (commit e6cf1a0826c9d7f229e41224ec7b783501636528) causes a panic when booting arm64 FreeBSD on Hyper-V. The console output and panic stack look like this:

Booting [/boot/kernel/kernel]...               
|No valid device tree blob found!
WARNING! Trying to fire up the kernel, but no device tree blob found!
EFI framebuffer information:
addr, size     0x40000000, 0x800000
dimensions     1024 x 768
stride         1024
masks          0x00ff0000, 0x0000ff00, 0x000000ff, 0xff000000
---<<BOOT>>---
GDB: no debug ports present
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2023 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 14.0-CURRENT #13 main-n260755-e6cf1a0826c9-dirty: Mon Mar  6 10:21:42 UTC 2023
    root@fbsd13-nvme-test:/data/ws/obj/data/ws/main/arm64.aarch64/sys/GENERIC arm64
FreeBSD clang version 15.0.7 (https://github.com/llvm/llvm-project.git llvmorg-15.0.7-0-g8dfdcc7b7bf6)
WARNING: WITNESS option enabled, expect reduced performance.
SRAT: Ignoring memory at addr 0x1c0000000
SRAT: Ignoring memory at addr 0x1000000000
SRAT: Ignoring memory at addr 0x10000000000
SRAT: Ignoring memory at addr 0x20000000000
SRAT: Ignoring memory at addr 0x40000000000
SRAT: Ignoring memory at addr 0x80000000000
SRAT: Ignoring memory at addr 0x100000000000
SRAT: Ignoring memory at addr 0x200000000000
SRAT: Ignoring memory at addr 0x400000000000
SRAT: Ignoring memory at addr 0x800000000000
VT(efifb): resolution 1024x768
module scmi already present!
module firmware already present!
real memory  = 4294799360 (4095 MB)
avail memory = 4155047936 (3962 MB)
Starting CPU 1 (1)
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
arc4random: WARNING: initial seeding bypassed the cryptographic random device because it was not yet seeded and the knob 'bypass_before_seeding' was enabled.
random: entropy device external interface
MAP 3ec84000 mode 2 pages 41
MAP 3fd2d000 mode 2 pages 48
MAP 3fd5d000 mode 2 pages 36
MAP effed000 mode 0 pages 1
kbd0 at kbdmux0
acpi0: <VRTUAL MICROSFT>
acpi0: Could not update all GPEs: AE_NOT_CONFIGURED
psci0: <ARM Power State Co-ordination Interface Driver> on acpi0
gic0: <ARM Generic Interrupt Controller v3.0> iomem 0xffff0000-0x10000ffff,0xeffee000-0xf000dfff,0xf000e000-0xf002dfff on acpi0
generic_timer0: <ARM Generic Timer> irq 4,5,6 on acpi0
Timecounter "ARM MPCore Timecounter" frequency 25000000 Hz quality 1000
Event timer "ARM MPCore Eventtimer" frequency 25000000 Hz quality 1000
efirtc0: <EFI Realtime Clock>
efirtc0: registered as a time-of-day clock, resolution 1.000000s
panic: ram_attach: resource 7 failed to attach
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x13c
panic() at panic+0x44
ram_attach() at ram_attach+0x1ac
device_attach() at device_attach+0x3f8
device_probe_and_attach() at device_probe_and_attach+0x7c
bus_generic_new_pass() at bus_generic_new_pass+0xfc
bus_generic_new_pass() at bus_generic_new_pass+0xac
bus_set_pass() at bus_set_pass+0x4c
mi_startup() at mi_startup+0x1fc
virtdone() at virtdone+0x6c
KDB: enter: panic
[ thread pid 0 tid 100000 ]
Stopped at      kdb_enter+0x44: undefined       f906c27f
db>
Comment 1 Kyle Evans freebsd_committer freebsd_triage 2023-03-23 14:04:36 UTC
Can you share a boot -v (boot_verbose=YES in loader.conf/loader) dmesg, please?
Comment 2 Mitchell Horne freebsd_committer freebsd_triage 2023-03-23 14:10:46 UTC
(In reply to Kyle Evans from comment #1)

This, as well as the output of "show all rman" at the ddb prompt, if possible.
Comment 3 Mark Millard 2023-03-23 14:32:09 UTC
There was a later fix that ;ile;y should be tested to see if it avoids the problem:

Wed, 15 Mar 2023
. . .
git: 8937bd37d07c - main - arm64: limit EFI excluded regions to physical memory types Mitchell Horne
Comment 4 Mark Millard 2023-03-23 14:33:42 UTC
(In reply to Mark Millard from comment #3)

Sorry; ";ile;y" should have been "likely".
Comment 5 Kyle Evans freebsd_committer freebsd_triage 2023-03-23 14:37:10 UTC
(In reply to Mark Millard from comment #3)

Yes, I was debugging with someone else who already had the follow-up.
Comment 6 Wei Hu 2023-03-24 08:44:07 UTC
Created attachment 241081
Console output, 3/23 CURRENT, Azure Standard D48ps v5, no boot verbose
Comment 7 Wei Hu 2023-03-24 08:46:17 UTC
(In reply to Mark Millard from comment #3)
The problem (panic) still exists on the 3/23 CURRENT build. See the attached serial console log with 'show all rman' output. I will upload one with boot verbose enabled shortly.
Comment 8 Wei Hu 2023-03-24 14:13:51 UTC
Created attachment 241087
Console output, 3/23 CURRENT, Azure Standard D48ps v5, boot verbose with show all rman

This is the one with verbose boot.
Comment 9 Mark Millard 2023-03-24 17:52:03 UTC
(In reply to Wei Hu from comment #8)

Looks to me like the following might be being classified as
conflicting via the overlapping ranges:

(gic0: <ARM Generic Interrupt Controller v3.0> iomem)
0xffff0000           -0x10000ffff
          0x100000000-           0xfc0000000
(ram0: reserving memory region:)

(Not that I'm expert at these issues.)
Comment 10 Mitchell Horne freebsd_committer freebsd_triage 2023-03-24 18:41:42 UTC
(In reply to Wei Hu from comment #8)

Thank you very much. I believe I was able to identify the problem based on the overlapping ranges of gic0 and physmem.

I posted a review to address this. Please test the change if you are able, and let me know if it solves the problem.

https://reviews.freebsd.org/D39260

Mitchell
Comment 11 Wei Hu 2023-03-28 07:21:21 UTC
(In reply to Mitchell Horne from comment #10)
I tested this patch on the same VM in Azure. It fixes the problem. Thanks for fixing it.
Comment 12 commit-hook freebsd_committer freebsd_triage 2023-03-31 16:28:31 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=3462c371c2562a8144f4245f9967df99874e505f

commit 3462c371c2562a8144f4245f9967df99874e505f
Author:     Mitchell Horne <mhorne@FreeBSD.org>
AuthorDate: 2023-03-31 15:32:39 +0000
Commit:     Mitchell Horne <mhorne@FreeBSD.org>
CommitDate: 2023-03-31 16:26:22 +0000

    arm64/gicv3: correct the size of the distributor resource

    Use the GICD_SIZE macro (0x10000), which is half the size of the current
    fixed-sized mapping (128 * 1024 == 0x20000).

    In ARM64 Hyper-V instances, it seems the Distributor's registers are
    located immediately preceding a range of physical memory in the bus
    address space. Thus, when ram0 is attaching and attempts to reserve
    SYS_RES_MEMORY resources corresponding to its physmem ranges, it fails,
    because the first 0x10000 bytes of this range are already owned by gic0.

    PR:             270415
    Reported by:    whu
    Tested by:      whu
    Differential Revision:  https://reviews.freebsd.org/D39260

 sys/arm64/arm64/gic_v3_acpi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Comment 13 commit-hook freebsd_committer freebsd_triage 2023-04-12 17:10:53 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=98ee3bb87a7a46b2ecae96159258632b8f6f3520

commit 98ee3bb87a7a46b2ecae96159258632b8f6f3520
Author:     Mitchell Horne <mhorne@FreeBSD.org>
AuthorDate: 2023-03-31 15:32:39 +0000
Commit:     Mitchell Horne <mhorne@FreeBSD.org>
CommitDate: 2023-04-12 17:08:38 +0000

    arm64/gicv3: correct the size of the distributor resource

    Use the GICD_SIZE macro (0x10000), which is half the size of the current
    fixed-sized mapping (128 * 1024 == 0x20000).

    In ARM64 Hyper-V instances, it seems the Distributor's registers are
    located immediately preceding a range of physical memory in the bus
    address space. Thus, when ram0 is attaching and attempts to reserve
    SYS_RES_MEMORY resources corresponding to its physmem ranges, it fails,
    because the first 0x10000 bytes of this range are already owned by gic0.

    PR:             270415
    Reported by:    whu
    Tested by:      whu
    Differential Revision:  https://reviews.freebsd.org/D39260

    (cherry picked from commit 3462c371c2562a8144f4245f9967df99874e505f)

 sys/arm64/arm64/gic_v3_acpi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)