Splitting off from PR 274237, because I haven't actually created a PR for this previously, but it'd be nice to track.

With at least some ARM machines, it's possible to get stuck in a nice loop in xhci attach because the VM bits don't handle some class of requests that cannot be satisfied very well. In particular, consider this system:

Physical memory chunk(s):
0x000008010a8000 - 0x00000802313fff, 19316736 bytes (1179 pages)
0x000008023d8000 - 0x0000080389bfff, 21774336 bytes (1329 pages)
0x000008038b8000 - 0x00000808f97fff, 91095040 bytes (5560 pages)
0x00000808fb8000 - 0x0000080ba03fff, 44351488 bytes (2707 pages)
0x0000080c12c000 - 0x000009d036ffff, 7585677312 bytes (462993 pages)
0x000009d4f68000 - 0x000009db93bfff, 110968832 bytes (6773 pages)
0x000009db944000 - 0x000009e096ffff, 84066304 bytes (5131 pages)
0x000009e0980000 - 0x000009e0a37fff, 753664 bytes (46 pages)
avail memory = 7955300352 (7586 MB)

Note that there's absolutely no RAM in the lower 4G of the address space. There's an XHCI controller that can only do 32-bit DMA (allegedly) and it has an associated IOMMU that isn't hooked up just yet.

Right now, busdma will request some pages below 4G (IIRC, it's with kmem_alloc_contig here[0]), but that request cannot be satisfied -- there's absolutely no memory there. Instead, it ends up hanging in the VM layer trying to fulfill an allocation that isn't physically possible. I think it'd be better to fail the request and let busdma kick back an ENOMEM. The XHCI controller will not be functional, but that's both expected and not a deal-breaker for getting the machine into a usable state.

[0] https://cgit.freebsd.org/src/tree/sys/arm64/arm64/busdma_bounce.c#n572
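To make the "no RAM below 4G" observation concrete, here is a small userspace C sketch that checks the chunk list from the boot log against an address window. It is only an illustration: struct phys_chunk and range_has_memory() are hypothetical names invented for this example and are not kernel interfaces.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type for this sketch; not a kernel structure. */
struct phys_chunk {
	uint64_t start;	/* first byte of the chunk */
	uint64_t end;	/* last byte of the chunk (inclusive) */
};

/* The physical memory chunks copied from the boot log in this report. */
static const struct phys_chunk chunks[] = {
	{ 0x000008010a8000, 0x00000802313fff },
	{ 0x000008023d8000, 0x0000080389bfff },
	{ 0x000008038b8000, 0x00000808f97fff },
	{ 0x00000808fb8000, 0x0000080ba03fff },
	{ 0x0000080c12c000, 0x000009d036ffff },
	{ 0x000009d4f68000, 0x000009db93bfff },
	{ 0x000009db944000, 0x000009e096ffff },
	{ 0x000009e0980000, 0x000009e0a37fff },
};

/*
 * Hypothetical helper: returns true if any chunk overlaps the
 * inclusive window [low, high].
 */
static bool
range_has_memory(const struct phys_chunk *c, size_t n, uint64_t low,
    uint64_t high)
{
	for (size_t i = 0; i < n; i++) {
		if (c[i].start <= high && c[i].end >= low)
			return (true);
	}
	return (false);
}
```

With the window [0, 0xffffffff] no chunk overlaps, which is why a 32-bit DMA allocation can never succeed on this machine without an IOMMU.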
Where exactly is the contig alloc attempt hanging? Based on code inspection I might guess the vm_wait_domain() call from kmem_alloc_contig_pages(), but it'd be good to get a backtrace.
(In reply to Jason A. Harmening from comment #1)

IIRC from the last time I debugged this, we actually get stuck just inside kmem_alloc_contig_domainset(). kmem_alloc_contig_domain() does fail, but it's an M_WAITOK allocation so vm_domainset_iter_policy() just keeps restarting the search and we never break out.

There's currently no way for, e.g., kmem_alloc_contig_domain() -> kmem_alloc_contig_pages() -> vm_page_alloc_contig_domain() to differentiate between a transient failure condition and an impossible request.
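A rough userspace model of the behaviour described in this comment, for illustration only. It is not the kernel code: model_alloc_contig_domain(), model_alloc_waitok(), MODEL_NDOMAINS, and the max_rounds cap are all hypothetical, and the cap exists only so the model terminates where the real loop would not.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NDOMAINS	1
/* Lowest RAM address on the system from the original report. */
#define MODEL_LOWEST_RAM	0x8010a8000ULL

/*
 * Hypothetical stand-in for a per-domain contiguous allocation.
 * It succeeds only if RAM exists at or below the requested upper bound.
 * The caller only sees success or failure, not the reason.
 */
static bool
model_alloc_contig_domain(uint64_t high)
{
	return (high >= MODEL_LOWEST_RAM);
}

/*
 * Model of a "wait OK" allocation: when every domain fails, restart the
 * search. Returns the round that succeeded, or -1 if max_rounds was hit.
 */
static int
model_alloc_waitok(uint64_t high, int max_rounds)
{
	for (int round = 1; round <= max_rounds; round++) {
		for (int d = 0; d < MODEL_NDOMAINS; d++) {
			if (model_alloc_contig_domain(high))
				return (round);
		}
		/* Every domain failed; a waiting caller simply tries again. */
	}
	return (-1);
}
```

Because the failure carries no reason, a request bounded at 0xffffffff exhausts any number of rounds in this model, while a request with a bound above the lowest RAM address succeeds on the first round.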
(In reply to Kyle Evans from comment #2)

That makes more sense actually. The reclaim wait in kmem_alloc_contig_pages() might block unnecessarily for some time, but probably not indefinitely.

Do you mind if I take this one? I've been wanting to get more familiar with the various bits of the VM subsystem, and this seems like as good a place to start as any.
(In reply to Jason A. Harmening from comment #3)

> Do you mind if I take this one? I've been wanting to get more familiar with the various bits of the VM subsystem, and this seems like as good a place to start as any.

Feel free... I'm too far in the weeds on many other projects to stop and take a look, though I'm more than happy to try patches or probe around a bit more on this other affected system that I have if it'd help.
Created attachment 245664
Avoid page waits or rescans of domains that can't satisfy an allocation request

Here's a somewhat naive first take on the problem; in local testing it eliminates the hang for a kmod rigged to attempt an impossible contigmalloc. Can you test it out on -current?
ping - Kyle, will you be able to test this sometime soon-ish?
(In reply to Jason A. Harmening from comment #6)

Sorry, missed the first e-mail... I'll get my m1 branch updated and take it for a test spin sometime this week. Thanks!
(In reply to Jason A. Harmening from comment #6)

Yup, that's a major quality of life improvement:

snps_dwc3_fdt0: <Synopsys Designware DWC3> mem 0x382280000-0x38237ffff irq 51 on simplebus0
snps_dwc3_fdt0: SNPS Version: DWC3.1 (3331 3139302a 736f3035)
snps_dwc3_fdt0: enabling power domain
snps_dwc3_fdt0: 64 bytes context size, 32-bit DMA
snps_dwc3_fdt0: Failed to init XHCI, with error 12
device_attach: snps_dwc3_fdt0 attach returned 6
simplebus0: <iommu@382f00000> mem 0x382f00000-0x382f03fff irq 52 compat apple,t8103-dart (no driver attached)
simplebus0: <iommu@382f80000> mem 0x382f80000-0x382f83fff irq 53 compat apple,t8103-dart (no driver attached)
snps_dwc3_fdt0: <Synopsys Designware DWC3> mem 0x502280000-0x50237ffff irq 54 on simplebus0
snps_dwc3_fdt0: SNPS Version: DWC3.1 (3331 3139302a 736f3035)
snps_dwc3_fdt0: enabling power domain
snps_dwc3_fdt0: 64 bytes context size, 32-bit DMA
snps_dwc3_fdt0: Failed to init XHCI, with error 12
device_attach: snps_dwc3_fdt0 attach returned 6
<boot continues>

Thanks! The patch looks generally sane, though I only gave it a cursory read-through.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=2619c5ccfe1f7889f0241916bd17d06340142b05

commit 2619c5ccfe1f7889f0241916bd17d06340142b05
Author:     Jason A. Harmening <jah@FreeBSD.org>
AuthorDate: 2023-11-20 23:23:58 +0000
Commit:     Jason A. Harmening <jah@FreeBSD.org>
CommitDate: 2023-12-24 05:01:40 +0000

    Avoid waiting on physical allocations that can't possibly be satisfied

    - Change vm_page_reclaim_contig[_domain] to return an errno instead
      of a boolean. 0 indicates a successful reclaim, ENOMEM indicates
      lack of available memory to reclaim, with any other error (currently
      only ERANGE) indicating that reclamation is impossible for the
      specified address range. Change all callers to only follow
      up with vm_page_wait* in the ENOMEM case.

    - Introduce vm_domainset_iter_ignore(), which marks the specified
      domain as unavailable for further use by the iterator. Use this
      function to ignore domains that can't possibly satisfy a physical
      allocation request. Since WAITOK allocations run the iterators
      repeatedly, this avoids the possibility of infinitely spinning
      in domain iteration if no available domain can satisfy the
      allocation request.

    PR: 274252
    Reported by: kevans
    Tested by: kevans
    Reviewed by: markj
    Differential Revision: https://reviews.freebsd.org/D42706

 sys/arm/nvidia/drm2/tegra_bo.c              |  9 +++--
 sys/compat/linuxkpi/common/src/linux_page.c |  8 ++--
 sys/dev/drm2/ttm/ttm_bo.c                   |  4 +-
 sys/dev/drm2/ttm/ttm_page_alloc.c           |  9 +++--
 sys/kern/uipc_ktls.c                        |  5 ++-
 sys/kern/uipc_shm.c                         |  5 ++-
 sys/vm/vm_domainset.c                       | 32 +++++++++++++---
 sys/vm/vm_domainset.h                       |  2 +
 sys/vm/vm_kern.c                            | 24 +++++++++++-
 sys/vm/vm_page.c                            | 58 ++++++++++++++++++++++-------
 sys/vm/vm_page.h                            |  6 +--
 11 files changed, 123 insertions(+), 39 deletions(-)
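The two changes in the commit message can be sketched with a small userspace model. This is only an approximation under stated assumptions, not the committed code: struct model_domain, struct model_iter, model_reclaim(), and model_alloc() are hypothetical names, the max_rounds cap and the EAGAIN return stand in for "still waiting", and only the errno convention (0, ENOMEM, ERANGE) and the idea of ignoring a domain are taken from the commit message.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NDOMAINS	2

/* Hypothetical per-domain state for this sketch. */
struct model_domain {
	uint64_t lowest_ram;	/* lowest physical address backed by RAM */
	bool	 has_free;	/* whether memory is available right now */
};

/* Hypothetical iterator state: which domains have been ruled out. */
struct model_iter {
	bool ignored[MODEL_NDOMAINS];
};

/*
 * Errno convention from the commit message: 0 on success, ENOMEM when
 * nothing is available to reclaim (transient), ERANGE when the address
 * range can never be satisfied.
 */
static int
model_reclaim(const struct model_domain *d, uint64_t high)
{
	if (high < d->lowest_ram)
		return (ERANGE);
	return (d->has_free ? 0 : ENOMEM);
}

/*
 * Returns 0 on success, ENOMEM once every domain has been ruled out,
 * or EAGAIN if max_rounds is hit (the model's stand-in for waiting).
 */
static int
model_alloc(const struct model_domain *doms, uint64_t high, int max_rounds)
{
	struct model_iter it = { .ignored = { false } };

	for (int round = 0; round < max_rounds; round++) {
		int usable = 0;

		for (int d = 0; d < MODEL_NDOMAINS; d++) {
			if (it.ignored[d])
				continue;
			int error = model_reclaim(&doms[d], high);
			if (error == 0)
				return (0);
			if (error == ERANGE)
				it.ignored[d] = true;	/* role of vm_domainset_iter_ignore() */
			else
				usable++;	/* ENOMEM: worth waiting for */
		}
		if (usable == 0)
			return (ENOMEM);	/* nothing left to wait for */
	}
	return (EAGAIN);
}
```

In this model, a request bounded at 0xffffffff against domains whose RAM starts at 0x8010a8000 fails with ENOMEM after a single pass, which matches the "Failed to init XHCI, with error 12" outcome reported in the test above, instead of retrying forever.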