Summary: | Raspberry Pi 4B does not reboot after addition of ram0 driver | ||
---|---|---|---|
Product: | Base System | Reporter: | Mike Karels <karels> |
Component: | arm | Assignee: | Mitchell Horne <mhorne> |
Status: | Closed FIXED | ||
Severity: | Affects Some People | CC: | kevans, marklmi26-fbsd, mhorne |
Priority: | --- | ||
Version: | CURRENT | ||
Hardware: | arm64 | ||
OS: | Any | ||
URL: | https://reviews.freebsd.org/D39003 |
Description
Mike Karels
2023-03-08 16:12:44 UTC
This is wrong but I don't think will break anything: > 0x3ee5c000 - 0x3ee5cfff, 0 MB ( 1 pages) NoAlloc NoDump > 0x3ee5c000 - 0x3ee5cfff, 0 MB ( 1 pages) NoAlloc I guess this means we grabbed exclusions both from the memory map and FDT, but we didn't coalesce them because the flags didn't match exactly. This patch should fix that rather than avoiding processing one or the other, in case we end up with slightly more information in one than the other: https://people.freebsd.org/~kevans/physmem.diff -- I don't think this would break resetting off-hand, though. One of the things I'm unclear about for what this driver does is the distinction between, for example, PCIe addresses that are local to the PCIe device or segment (right word?) that do not share an address space that contains the system RAM. By contrast there are the inbound PCIe addresses that do share the address space that also contains the system RAM. The devinfo -u output (for example) does not make the distinction up front but shows examples of both. For example, using a RPi4B for illustration: PCIe Memory: 0xc0000000-0xc00fffff (pcib1) 0xc0100000-0xffffffff (root0) . . . pcib1 memory window: 0x600000000-0x600000fff (bcm_xhci0) 0xc0001000-0xc00fffff (root0) The "0x600000000-0x600000fff (bcm_xhci0)" is inbound addressing having the address space that also includes system RAM. The other 3 do not. For these 3 I do not see how the lack of a conflicts with ram0 is supposed to be handled. Another issue is that there is no "verbose shutdown" to be able to get evidence about where the hangup during shutdown happens or what low level activity FreeBSD is trying to do at the time. (In reply to Kyle Evans from comment #1) I tried your patch; it doesn't change reboot, but changes the Excluded map: *** Excluded.standard Wed Mar 8 13:33:58 2023 --- Excluded.patch Wed Mar 8 13:34:28 2023 *************** *** 1,13 **** Excluded memory regions: 0x00000000 - 0x00001fff, 0 MB ( 2 pages) NoAlloc 0x07ef0000 - 0x07f0ffff, 0 MB ( 32 pages) NoAlloc ! 0x32000000 - 0x335b3fff, 21 MB ( 5556 pages) NoAlloc 0x39f2a000 - 0x39f2afff, 0 MB ( 1 pages) NoAlloc 0x39f2e000 - 0x39f2efff, 0 MB ( 1 pages) NoAlloc 0x39f30000 - 0x39f31fff, 0 MB ( 2 pages) NoAlloc 0x39f33000 - 0x39f36fff, 0 MB ( 4 pages) NoAlloc 0x3b350000 - 0x3b35ffff, 0 MB ( 16 pages) NoAlloc 0x3eaf5000 - 0x3ebf7fff, 1 MB ( 259 pages) NoAlloc 0x3ef5b000 - 0x3ef5bfff, 0 MB ( 1 pages) NoAlloc NoDump - 0x3ef5b000 - 0x3ef5bfff, 0 MB ( 1 pages) NoAlloc 0xfe100000 - 0xfe100fff, 0 MB ( 1 pages) NoAlloc --- 1,12 ---- Excluded memory regions: 0x00000000 - 0x00001fff, 0 MB ( 2 pages) NoAlloc 0x07ef0000 - 0x07f0ffff, 0 MB ( 32 pages) NoAlloc ! 0x32000000 - 0x335b2fff, 21 MB ( 5555 pages) NoAlloc 0x39f2a000 - 0x39f2afff, 0 MB ( 1 pages) NoAlloc 0x39f2e000 - 0x39f2efff, 0 MB ( 1 pages) NoAlloc 0x39f30000 - 0x39f31fff, 0 MB ( 2 pages) NoAlloc 0x39f33000 - 0x39f36fff, 0 MB ( 4 pages) NoAlloc 0x3b350000 - 0x3b35ffff, 0 MB ( 16 pages) NoAlloc 0x3eaf5000 - 0x3ebf7fff, 1 MB ( 259 pages) NoAlloc 0x3ef5b000 - 0x3ef5bfff, 0 MB ( 1 pages) NoAlloc NoDump 0xfe100000 - 0xfe100fff, 0 MB ( 1 pages) NoAlloc This may be a different vintage RPi4; it's the original version with 4 GB. Hi, thanks for the report. I did not see the ML post, despite being CC'd.
On the RPI4B, reboot is handled by the BCM watchdog device. Looks like ram0 is preventing the bcmwd0 device from attaching
(dmesg output from my own RPI4):
> bcmwd0: <BCM2708/2835 Watchdog> mem 0x7e100000-0x7e100113,0x7e00a000-0x7e00a023,0x7ec11000-0x7ec1101f on simplebus0
> bcmwd0: could not allocate memory resource
> device_attach: bcmwd0 attach returned 6
Clearly, we need to be slightly more selective in identifying excluded ranges that correspond to physical memory rather than just bus space. This may be a shortcoming in the discovery mechanism (device tree). Ultimately this is somewhat expected fallout from e6cf1a0826c9, and I am looking into it.
(In reply to Kyle Evans from comment #1) Evidently that wasn't the problem at hand, but this fix is still good, IMO. Consider it Reviewed by: mhorne. (In reply to Kyle Evans from comment #1) When looking at the upstream status of the U-Boot 2023.01 break to some contexts, including 8 GiByte RPi4B/CM4 contexts, I found: lmb: Treat a region which is a subset as equal : https://source.denx.de/u-boot/u-boot/-/commit/0d91c88230fe8bd9f8d39ca2ab69cd6282e9f949 Bump LMB_MAX_REGIONS default to 16 : https://source.denx.de/u-boot/u-boot/-/commit/2dc16a2c1f924985216b3f1d6710f96d6c4fb1ab are the commits for the issue. That first seems to have fixed U-Boot 2023.01 causing extra, overlapping ranges. From what I can tell, the fix activity has settled on these updates. It may be important that replication efforts for testing sometimes use the original, broken U-Boot 2023.01 . (In reply to Mitchell Horne from comment #4) Looks to me like the: bcmwd0: <BCM2708/2835 Watchdog> mem 0x7e100000-0x7e100113,0x7e00a000-0x7e00a023,0x7ec11000-0x7ec1101f on simplebus0 overlaps with the UEFI page described with: MemoryMappedIO 0000fe100000 0000fe100000 00000001 RUNTIME and it is in the address space that includes the system RAM. My guess is that the MemoryMappedIO status is an indication that using those addresses is okay (for the purpose of the MemoryMappedIO region exists for): not an actual conflict for the purpose but the page starting at 0000fe100000 is not okay for general-use memory allocations. So, not the same as what I was curious about in Comment #2 . (In reply to Mark Millard from comment #7) For (4 GiByte RPi4B): Physical memory chunk(s): 0x00002000 - 0x3b2fffff, 946 MB ( 242430 pages) 0x40000000 - 0xfbffffff, 3008 MB ( 770048 pages) should (sub)ranges outside of that never be assigned to ram0?: already not in available RAM subrange? For example, the page: MemoryMappedIO 0000fe100000 0000fe100000 00000001 RUNTIME is outside the Physical-chunks-of-RAM ranges. In other words, no part of: ram0: reserving excluded region: fe100000-fe100fff should have happened --because of the complete lack of overlap with the Physical-chunks-of-RAM ranges? The same would go for "C0T" 8GiByte RPi4B. There is no overlap for: Physical memory chunk(s): 0x00002000 - 0x3b2fffff, 946 MB ( 242430 pages) 0x40000000 - 0xfbffffff, 3008 MB ( 770048 pages) 0x100000000 - 0x1ffffffff, 4096 MB (1048576 pages) vs. MemoryMappedIO 0000fe100000 0000fe100000 00000001 RUNTIME The same would go for "B0T" 8GiByte RPi4B. There is no overlap for (same Physical-chunks-of-RAM as "C0T"): Physical memory chunk(s): 0x00002000 - 0x3b2fffff, 946 MB ( 242430 pages) 0x40000000 - 0xfbffffff, 3008 MB ( 770048 pages) 0x100000000 - 0x1ffffffff, 4096 MB (1048576 pages) vs. MemoryMappedIO 0000fe100000 0000fe100000 00000001 RUNTIME Looks like physmem_excluded is never used: # grep -r 'physmem_excluded' /usr/main-src/sys/ | less /usr/main-src/sys/sys/physmem.h:bool physmem_excluded(vm_paddr_t pa, vm_size_t sz); /usr/main-src/sys/kern/subr_physmem.c:physmem_excluded(vm_paddr_t pa, vm_size_t sz) (And, looking at the code, the testing used does not deal with anything but fully-contained-in as "excluded", not even reporting that such an oddity as a partial overlap was present if it was). A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=8937bd37d07c5c75995e01457aec00fb0a05c462 commit 8937bd37d07c5c75995e01457aec00fb0a05c462 Author: Mitchell Horne <mhorne@FreeBSD.org> AuthorDate: 2023-03-15 15:26:57 +0000 Commit: Mitchell Horne <mhorne@FreeBSD.org> CommitDate: 2023-03-15 15:28:35 +0000 arm64: limit EFI excluded regions to physical memory types Consolidate add_efi_map_entry() and exclude_efi_map_entry() into a single function, handle_efi_map_entry(), so that the exact set of entry types handled is the same in the addition or exclusion cases. Before, exclude_efi_map_entry() had a 'default' case that would exclude all entry types that were not listed explicitly in the switch statement. Logically, we do not need to exclude a range that could not possibly be added to physmem, and we do not need to exclude bus ranges that are not physical memory, for example EFI_MD_TYPE_IOMEM. Since physmem's ram0 device will reserve bus memory resources for its owned ranges, this was preventing attachment of the watchdog device on the RPI4B. For some reason its region of memory-mapped I/O appeared in the EFI memory map (with the aforementioned EFI_MD_TYPE_IOMEM type). This change fixes the attachment issue, as we prevent the physmem API from messing with this range of bus space. PR: 270044 Reported by: karels, Mark Millard Reviewed by: andrew, karels, imp MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D39003 sys/arm64/arm64/machdep.c | 71 ++++++++++++++++++++++++++--------------------- 1 file changed, 39 insertions(+), 32 deletions(-) A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=865b5640f8ef3c3a369f13b2c531737b7fb8fec4 commit 865b5640f8ef3c3a369f13b2c531737b7fb8fec4 Author: Mitchell Horne <mhorne@FreeBSD.org> AuthorDate: 2023-03-15 15:26:57 +0000 Commit: Mitchell Horne <mhorne@FreeBSD.org> CommitDate: 2023-03-24 13:12:40 +0000 arm64: limit EFI excluded regions to physical memory types Consolidate add_efi_map_entry() and exclude_efi_map_entry() into a single function, handle_efi_map_entry(), so that the exact set of entry types handled is the same in the addition or exclusion cases. Before, exclude_efi_map_entry() had a 'default' case that would exclude all entry types that were not listed explicitly in the switch statement. Logically, we do not need to exclude a range that could not possibly be added to physmem, and we do not need to exclude bus ranges that are not physical memory, for example EFI_MD_TYPE_IOMEM. Since physmem's ram0 device will reserve bus memory resources for its owned ranges, this was preventing attachment of the watchdog device on the RPI4B. For some reason its region of memory-mapped I/O appeared in the EFI memory map (with the aforementioned EFI_MD_TYPE_IOMEM type). This change fixes the attachment issue, as we prevent the physmem API from messing with this range of bus space. PR: 270044 Reported by: karels, Mark Millard Reviewed by: andrew, karels, imp MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D39003 (cherry picked from commit 8937bd37d07c5c75995e01457aec00fb0a05c462) sys/arm64/arm64/machdep.c | 71 ++++++++++++++++++++++++++--------------------- 1 file changed, 39 insertions(+), 32 deletions(-) |