Created attachment 249802 [details] bug fix: support non-contiguous memory in PHYS_IN_DMAP and VIRT_IN_DMAP The issue was observed on r8g.metal-48xl which is a coherent Graviton4-based system with 2 SoCs. ENA driver crashes when trying to remap BAR2 memory with WC attributes: ena0: Elastic Network Adapter (ENA)ena v2.7.0 panic: Invalid DMAP table level: 0 cpuid = 0 time = 1 KDB: stack backtrace: db_trace_self() at db_trace_self db_trace_self_wrapper() at db_trace_self_wrapper+0x38 vpanic() at vpanic+0x1a4 panic() at panic+0x48 pmap_change_props_locked() at pmap_change_props_locked+0x6dc pmap_change_props_locked() at pmap_change_props_locked+0x550 pmap_change_attr() at pmap_change_attr+0x5c ena_attach() at ena_attach+0x264 device_attach() at device_attach+0x3fc device_probe_and_attach() at device_probe_and_attach+0x80 bus_generic_attach() at bus_generic_attach+0x1c pci_attach() at pci_attach+0xec ... The memory map on those systems has 2 large DRAM regions, one for each SoC. The regions are non contiguous and IO memory that includes the PCI BARs for the 2nd SoC is between those 2 regions. So the memory map looks like following DRAM region 0: 0x10000000000-0x1c000000000 IO memory(SoC1): 0x41000000000-0x050000000000 DRAM region 1: 0x50000000000-0x5c000000000 PHYS_IN_DMAP and VIRT_IN_DMAP macros in arm64 vmparam.h that inherently assume that the DMAP region is contiguous with current implementation: /* True if pa is in the dmap range */ #define PHYS_IN_DMAP(pa) ((pa) >= DMAP_MIN_PHYSADDR && \ (pa) < DMAP_MAX_PHYSADDR) /* True if va is in the dmap range */ #define VIRT_IN_DMAP(va) ((va) >= DMAP_MIN_ADDRESS && \ (va) < (dmap_max_addr)) This causes the following check in sys/arm64/arm64/pmap.c, pmap_change_props_locked() to wrongly conclude that PCI memory is part of DMAP: if (!VIRT_IN_DMAP(tmpva) && PHYS_IN_DMAP(pa)) { /* * Keep the DMAP memory in sync. */ rv = pmap_change_props_locked( PHYS_TO_DMAP(pa), pte_size, prot, mode, true); if (rv != 0) return (rv); } And consequent lookup of this memory in DMAP page tables fails, as PCI memory is not mapped in DMAP. Attached a proposed patch that stores a list of DMAP regions and changes the macros above to functions that check the address vs the list. (Based on https://reviews.freebsd.org/D3538) The instance boots properly after applying this patch.
Created attachment 249803 [details] bug fix: support non-contiguous memory in PHYS_IN_DMAP and VIRT_IN_DMAP
I have a possible fix in review D44677
Thank Andrew! Zeev, can you test Andrew's patch? I can build an AMI with it if it would help, but I don't have access to r8g.metal-48xl so I can't test it myself.
The system boots properly with Andrew's patch, thanks! Should VIRT_IN_DMAP macro be fixed too? The patch addresses only PHYS_IN_DMAP.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=9d40492efa467095340cf3dca5860880aa441472 commit 9d40492efa467095340cf3dca5860880aa441472 Author: Andrew Turner <andrew@FreeBSD.org> AuthorDate: 2024-04-08 10:44:33 +0000 Commit: Andrew Turner <andrew@FreeBSD.org> CommitDate: 2024-04-24 18:17:19 +0000 arm64: Check DMAP address is valid in PHYS_IN_DMAP When checking if a physical address is in the DMAP region we assume all physical addresses between DMAP_MIN_PHYSADDR and DMAP_MAX_PHYSADDR are able to be accesses through the DMAP. It may be the case that there is device memory in this range that shouldn't be accessed through the DMAP mappings. Add a check to PHYS_IN_DMAP that the translated virtual address is a valid kernel address. To support code that already checks the address is valid add PHYS_IN_DMAP_RANGE. PR: 278233 Reviewed by: alc, markj Sponsored by: Arm Ltd Differential Revision: https://reviews.freebsd.org/D44677 sys/arm64/arm64/efirt_machdep.c | 9 ++------- sys/arm64/arm64/machdep.c | 2 +- sys/arm64/arm64/minidump_machdep.c | 7 ++++--- sys/arm64/include/vmparam.h | 18 +++++++++++++++--- 4 files changed, 22 insertions(+), 14 deletions(-)
A commit in branch stable/14 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=c3a3b231da00e93fcd7baced74fa933b112de473 commit c3a3b231da00e93fcd7baced74fa933b112de473 Author: Andrew Turner <andrew@FreeBSD.org> AuthorDate: 2024-04-08 10:44:33 +0000 Commit: Andrew Turner <andrew@FreeBSD.org> CommitDate: 2024-05-02 07:59:31 +0000 arm64: Check DMAP address is valid in PHYS_IN_DMAP When checking if a physical address is in the DMAP region we assume all physical addresses between DMAP_MIN_PHYSADDR and DMAP_MAX_PHYSADDR are able to be accesses through the DMAP. It may be the case that there is device memory in this range that shouldn't be accessed through the DMAP mappings. Add a check to PHYS_IN_DMAP that the translated virtual address is a valid kernel address. To support code that already checks the address is valid add PHYS_IN_DMAP_RANGE. PR: 278233 Reviewed by: alc, markj Sponsored by: Arm Ltd Differential Revision: https://reviews.freebsd.org/D44677 (cherry picked from commit 9d40492efa467095340cf3dca5860880aa441472) sys/arm64/arm64/efirt_machdep.c | 9 ++------- sys/arm64/arm64/machdep.c | 2 +- sys/arm64/arm64/minidump_machdep.c | 7 ++++--- sys/arm64/include/vmparam.h | 18 +++++++++++++++--- 4 files changed, 22 insertions(+), 14 deletions(-)
(In reply to Zeev Zilberman from comment #4) Can you confirm that 14.1-BETA1 boots on these instances? (If it doesn't, we need to fix whatever other problems exist ASAP before the release.)
Confirming that FreeBSD 14.1-BETA2 boots properly on r8g.metal-48xl system.
Thanks!