Loading a large kernel module (in this case nvidia.ko) causes a massive amount of memory to be wired. On the affected system this wires 4.2 GB out of 8 GB, which significantly hurts usability.

There was a lot of discussion on Discord about this; I'll try to include it here:
https://discord.com/channels/727023752348434432/757305573866733680/992940224063746058

"Maybe it is the addr param itself after all. I'm seeing some weirdness in the min address used for kernel_vm_end that I haven't explained yet. KERNBASE is 0xffffffff80000000, but I see vm_map_find using 0xfffffe0000000000 as the min address to start searching for open space at. So kernel_vm_end appears to start at fffffe00b1600000 when loading the kernel (the kernel log drops some lines there so this is the first value I see for it), and it's raised to find more space. That seems to be the general "neighborhood" of addresses passed to pmap_growkernel. Then at some point when I call kldload it passes some offset from KERNBASE instead of from that "neighborhood", and since KERNBASE and 0xfffffe0000000000 are so far away it allocates a million pages. Sorry the logic there is a little fuzzy, but tldr is it feels like there are two different ideas about where the kernel VM address starts and kldload triggers us using the wrong one."

What we think is happening is that vm_map_find is normally called starting from VM_MIN_KERNEL_ADDRESS, but link_elf_load_file starts searching at KERNBASE. pmap_growkernel then wires a massive amount of memory to cover the gap between the two. Normally this is fine, since we reserve a bunch of pages after KERNBASE for kernel modules, but if the kernel module is too large and overflows this region, this problem can occur.

This PR tracks improving the logic of pmap_growkernel so that it can grow the kernel map in more than one region. We might want to have two kernel_vm_ends, and decide the starting/ending points of growing the kernel based on whether the address given to pmap_growkernel is above KERNBASE.
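As a rough sanity check on the "million pages" figure, here is a back-of-the-envelope sketch in userspace C. It is not kernel code. It assumes each page-table page is 4 KB and maps 2 MB of KVA (the usual amd64 layout), and it uses the first kernel_vm_end value quoted above, so the real value at kldload time may have been somewhat higher. The helper names are made up for illustration, and page-directory pages are ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Values quoted in the report above. */
#define KERNBASE               0xffffffff80000000ULL
#define OBSERVED_KERNEL_VM_END 0xfffffe00b1600000ULL

/* Assumptions: one page-table page is 4 KB and maps 2 MB of KVA. */
#define PTP_SIZE 4096ULL
#define NBPDR    (1ULL << 21)

/*
 * Hypothetical helper: number of page-table pages needed to back every
 * 2 MB slot in [start, end), as if the kernel map were grown across the
 * whole gap.
 */
static uint64_t
ptps_to_cover(uint64_t start, uint64_t end)
{
	assert(end >= start);
	return ((end - start + NBPDR - 1) / NBPDR);
}

/* Hypothetical helper: memory wired for those page-table pages, in bytes. */
static uint64_t
ptp_bytes(uint64_t start, uint64_t end)
{
	return (ptps_to_cover(start, end) * PTP_SIZE);
}
```

Calling ptps_to_cover(OBSERVED_KERNEL_VM_END, KERNBASE) gives 1,046,133 page-table pages, and ptp_bytes() for the same range gives about 4.28 GB, which is in the same ballpark as the 4.2 GB we saw wired.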
I think this patch will fix it: https://reviews.freebsd.org/D36673
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=0b29f5efcc7ee8271ad2f6b6447898b489d618ec

commit 0b29f5efcc7ee8271ad2f6b6447898b489d618ec
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2022-09-24 13:19:21 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2022-09-24 13:27:50 +0000

    amd64: Make it possible to grow the KERNBASE region of KVA

    pmap_growkernel() may be called when mapping a region above KERNBASE,
    typically for a kernel module.  If we have enough PTPs left over from
    bootstrap, pmap_growkernel() does nothing.  However, it's possible to
    run out, and in this case pmap_growkernel() will try to grow the kernel
    map all the way from kernel_vm_end to somewhere past KERNBASE, which can
    easily run the system out of memory.  This happens with large kernel
    modules such as the nvidia GPU driver.  There is also a WIP dtrace
    provider which needs to map KVA in the region above KERNBASE (to provide
    trampolines which allow a copy of traced kernel instruction to be
    executed), and its allocations could potentially trigger this scenario.

    This change modifies pmap_growkernel() to manage the two regions
    separately, allowing them to grow independently.  The end of the
    KERNBASE region is tracked by modifying "nkpt".

    PR:             265019
    Reviewed by:    alc, imp, kib
    MFC after:      2 weeks
    Differential Revision:  https://reviews.freebsd.org/D36673

 sys/amd64/amd64/pmap.c | 68 +++++++++++++++++++++++++++++++++-----------------
 1 file changed, 45 insertions(+), 23 deletions(-)
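For readers who want the idea without opening pmap.c, the two-region bookkeeping described in the commit message can be modelled roughly as below. This is a simplified userspace sketch, not the code from the commit: struct kva_model, round_up_2m() and model_grow() are hypothetical names, one page-table page is assumed to map 2 MB, and page-directory allocation, locking and the kernel map limits are all left out. The only parts taken from the commit message are that the region above KERNBASE grows independently and that its end is tracked through nkpt.

```c
#include <assert.h>
#include <stdint.h>

#define KERNBASE 0xffffffff80000000ULL
#define NBPDR    (1ULL << 21)	/* assumed KVA mapped by one page-table page */

/* Hypothetical model of the two independently growing regions. */
struct kva_model {
	uint64_t kernel_vm_end;	/* end of the main kernel map region */
	uint64_t nkpt;		/* page-table pages backing the KERNBASE region */
};

/* Round an address up to the next 2 MB boundary. */
static uint64_t
round_up_2m(uint64_t addr)
{
	/* Assumption: addr is not within 2 MB of the top of the address space. */
	assert(addr <= UINT64_MAX - (NBPDR - 1));
	return ((addr + NBPDR - 1) & ~(NBPDR - 1));
}

/*
 * Grow whichever region contains addr and return how many new page-table
 * pages the model would allocate.  Requests at or above KERNBASE only
 * extend the KERNBASE region (by bumping nkpt); all other requests only
 * extend kernel_vm_end.
 */
static uint64_t
model_grow(struct kva_model *m, uint64_t addr)
{
	uint64_t end, region_end, added;

	end = round_up_2m(addr);
	added = 0;
	if (addr >= KERNBASE) {
		region_end = KERNBASE + m->nkpt * NBPDR;
		if (end > region_end) {
			added = (end - region_end) / NBPDR;
			m->nkpt += added;
		}
	} else if (end > m->kernel_vm_end) {
		added = (end - m->kernel_vm_end) / NBPDR;
		m->kernel_vm_end = end;
	}
	return (added);
}
```

In this model, a request 100 MB above KERNBASE with 32 page-table pages already present adds only 18 pages and leaves kernel_vm_end untouched, instead of filling the whole gap between kernel_vm_end and KERNBASE.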
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=8bebdbe494f6909221e324ec5c13700dfd30cb5e

commit 8bebdbe494f6909221e324ec5c13700dfd30cb5e
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2022-09-24 13:19:21 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2022-10-09 15:21:10 +0000

    amd64: Make it possible to grow the KERNBASE region of KVA

    pmap_growkernel() may be called when mapping a region above KERNBASE,
    typically for a kernel module.  If we have enough PTPs left over from
    bootstrap, pmap_growkernel() does nothing.  However, it's possible to
    run out, and in this case pmap_growkernel() will try to grow the kernel
    map all the way from kernel_vm_end to somewhere past KERNBASE, which can
    easily run the system out of memory.  This happens with large kernel
    modules such as the nvidia GPU driver.  There is also a WIP dtrace
    provider which needs to map KVA in the region above KERNBASE (to provide
    trampolines which allow a copy of traced kernel instruction to be
    executed), and its allocations could potentially trigger this scenario.

    This change modifies pmap_growkernel() to manage the two regions
    separately, allowing them to grow independently.  The end of the
    KERNBASE region is tracked by modifying "nkpt".

    PR:             265019
    Reviewed by:    alc, imp, kib

    (cherry picked from commit 0b29f5efcc7ee8271ad2f6b6447898b489d618ec)

 sys/amd64/amd64/pmap.c | 65 ++++++++++++++++++++++++++++++++++----------------
 1 file changed, 44 insertions(+), 21 deletions(-)
Sorry, it took a bit of time to circle back and test this. I can confirm it fixed the issue. Thanks!
(In reply to Austin Shafer from comment #4) No problem, thanks for testing.