Summary:     sys/vm: less-than-ideal handling of memory requests that cannot be fulfilled

Product:     Base System
Component:   kern
Status:      Closed FIXED
Severity:    Affects Only Me
Priority:    ---
Version:     Unspecified
Hardware:    Any
OS:          Any
Reporter:    Kyle Evans <kevans>
Assignee:    Jason A. Harmening <jah>
CC:          emaste, jah
Attachments: Avoid page waits or rescans of domains that can't satisfy an allocation request (attachment 245664)
Description
Kyle Evans 2023-10-03 20:28:34 UTC

Comment 1 (Jason A. Harmening):
Where exactly is the contig alloc attempt hanging? Based on code inspection I might guess the vm_wait_domain() call from kmem_alloc_contig_pages(), but it'd be good to get a backtrace.

Comment 2 (Kyle Evans, in reply to Jason A. Harmening from comment #1):
IIRC from the last time I debugged this, we actually get stuck just inside kmem_alloc_contig_domainset(). kmem_alloc_contig_domain() does fail, but it's an M_WAITOK allocation, so vm_domainset_iter_policy() just keeps restarting the search and we never break out. There's currently no way for, e.g., kmem_alloc_contig_domain() -> kmem_alloc_contig_pages() -> vm_page_alloc_contig_domain() to differentiate between a transient failure condition and an impossible request.

Comment 3 (Jason A. Harmening, in reply to Kyle Evans from comment #2):
That makes more sense, actually. The reclaim wait in kmem_alloc_contig_pages() might block unnecessarily for some time, but probably not indefinitely. Do you mind if I take this one? I've been wanting to get more familiar with the various bits of the VM subsystem, and this seems like as good a place to start as any.

Comment 4 (Kyle Evans, in reply to Jason A. Harmening from comment #3):
> Do you mind if I take this one? I've been wanting to get more familiar with the various bits of the VM subsystem, and this seems like as good a place to start as any.

Feel free... I'm too far in the weeds on many other projects to stop and take a look, though I'm more than happy to try patches or probe around a bit more on this other affected system that I have if it'd help.

Comment 5 (Jason A. Harmening):
Created attachment 245664 [details]
Avoid page waits or rescans of domains that can't satisfy an allocation request
Here's a somewhat naive first take on the problem; in local testing it eliminates the hang for a kmod rigged to attempt an impossible contigmalloc. Can you test it out on -current?
Comment 6 (Jason A. Harmening):
ping - Kyle, will you be able to test this sometime soon-ish?

Comment 7 (Kyle Evans, in reply to Jason A. Harmening from comment #6):
Sorry, missed the first e-mail... I'll get my m1 branch updated and take it for a test spin sometime this week. Thanks!

Comment 8 (Kyle Evans, in reply to Jason A. Harmening from comment #6):
Yup, that's a major quality-of-life improvement:

    snps_dwc3_fdt0: <Synopsys Designware DWC3> mem 0x382280000-0x38237ffff irq 51 on simplebus0
    snps_dwc3_fdt0: SNPS Version: DWC3.1 (3331 3139302a 736f3035)
    snps_dwc3_fdt0: enabling power domain
    snps_dwc3_fdt0: 64 bytes context size, 32-bit DMA
    snps_dwc3_fdt0: Failed to init XHCI, with error 12
    device_attach: snps_dwc3_fdt0 attach returned 6
    simplebus0: <iommu@382f00000> mem 0x382f00000-0x382f03fff irq 52 compat apple,t8103-dart (no driver attached)
    simplebus0: <iommu@382f80000> mem 0x382f80000-0x382f83fff irq 53 compat apple,t8103-dart (no driver attached)
    snps_dwc3_fdt0: <Synopsys Designware DWC3> mem 0x502280000-0x50237ffff irq 54 on simplebus0
    snps_dwc3_fdt0: SNPS Version: DWC3.1 (3331 3139302a 736f3035)
    snps_dwc3_fdt0: enabling power domain
    snps_dwc3_fdt0: 64 bytes context size, 32-bit DMA
    snps_dwc3_fdt0: Failed to init XHCI, with error 12
    device_attach: snps_dwc3_fdt0 attach returned 6
    <boot continues>

Thanks! The patch looks generally sane, though I only gave it a cursory read-through.

Comment 9:
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=2619c5ccfe1f7889f0241916bd17d06340142b05

    commit 2619c5ccfe1f7889f0241916bd17d06340142b05
    Author:     Jason A. Harmening <jah@FreeBSD.org>
    AuthorDate: 2023-11-20 23:23:58 +0000
    Commit:     Jason A. Harmening <jah@FreeBSD.org>
    CommitDate: 2023-12-24 05:01:40 +0000

        Avoid waiting on physical allocations that can't possibly be satisfied

        - Change vm_page_reclaim_contig[_domain] to return an errno instead
          of a boolean.  0 indicates a successful reclaim, ENOMEM indicates
          lack of available memory to reclaim, with any other error
          (currently only ERANGE) indicating that reclamation is impossible
          for the specified address range.  Change all callers to only
          follow up with vm_page_wait* in the ENOMEM case.

        - Introduce vm_domainset_iter_ignore(), which marks the specified
          domain as unavailable for further use by the iterator.  Use this
          function to ignore domains that can't possibly satisfy a physical
          allocation request.  Since WAITOK allocations run the iterators
          repeatedly, this avoids the possibility of infinitely spinning in
          domain iteration if no available domain can satisfy the
          allocation request.

        PR:             274252
        Reported by:    kevans
        Tested by:      kevans
        Reviewed by:    markj
        Differential Revision:  https://reviews.freebsd.org/D42706

     sys/arm/nvidia/drm2/tegra_bo.c              |  9 +++--
     sys/compat/linuxkpi/common/src/linux_page.c |  8 ++--
     sys/dev/drm2/ttm/ttm_bo.c                   |  4 +-
     sys/dev/drm2/ttm/ttm_page_alloc.c           |  9 +++--
     sys/kern/uipc_ktls.c                        |  5 ++-
     sys/kern/uipc_shm.c                         |  5 ++-
     sys/vm/vm_domainset.c                       | 32 +++++++++++++---
     sys/vm/vm_domainset.h                       |  2 +
     sys/vm/vm_kern.c                            | 24 +++++++++++-
     sys/vm/vm_page.c                            | 58 ++++++++++++++++++++++-------
     sys/vm/vm_page.h                            |  6 +--
     11 files changed, 123 insertions(+), 39 deletions(-)