Not all memory can be allocated in a two-domain configuration; the attempt results in a "vm_page_alloc: missing page" panic. Details: https://people.freebsd.org/~pho/stress/log/maxmemdom.txt
Is this related to https://lists.freebsd.org/pipermail/freebsd-arch/2015-April/017138.html ?
This is a long-standing VM issue that earlier first-touch page allocation (in freebsd-8?) would also hit. I had a local modification in my NUMA branch that handled the case of the page allocation failing.

diff --git a/sys/vm/vm_page.c b/sys/vm/vm_page.c
index 3b58fb7..9bd4adc 100644
--- a/sys/vm/vm_page.c
+++ b/sys/vm/vm_page.c
@@ -1625,6 +1625,7 @@ vm_page_alloc(vm_object_t object, vm_pindex_t pindex, int req)
 	 * vm_page_cache().
 	 */
 	mtx_lock_flags(&vm_page_queue_free_mtx, MTX_RECURSE);
+	m = NULL;
 	if (vm_cnt.v_free_count + vm_cnt.v_cache_count > vm_cnt.v_free_reserved ||
 	    (req_class == VM_ALLOC_SYSTEM &&
 	    vm_cnt.v_free_count + vm_cnt.v_cache_count > vm_cnt.v_interrupt_free_min) ||
@@ -1669,7 +1670,19 @@ vm_page_alloc(vm_object_t object, vm_pindex_t pindex, int req)
 			}
 #endif
 		}
-	} else {
+	}
+
+	/*
+	 * Can't allocate or attempted to and couldn't allocate a page
+	 * given the current VM policy. Give up.
+	 *
+	 * Note - yes, this is one of the current shortcomings of the
+	 * VM domain design - there's a global set of vm_cnt counters,
+	 * and it's quite possible things will get unhappy with this.
+	 * However without it'll kernel panic below - the code didn't
+	 * check m == NULL here and would continue.
+	 */
+	if (m == NULL) {
 		/*
 		 * Not allocatable, give up.
 		 */
I have tested the patch and this indeed removes the panic. Out of curiosity I added a "failed allocation" counter:

+	if (m == NULL) {
 		/*
 		 * Not allocatable, give up.
 		 */
 		mtx_unlock(&vm_page_queue_free_mtx);
+		atomic_add_int(&pho, 1);
 		atomic_add_int(&vm_pageout_deficit,

which reached pho = 100479984 during this test scenario.
Right. So this was in here way before my NUMA stuff let you configure things. The problem is that memory allocation isn't perfectly balanced between NUMA domains, and the VM thresholds are global. So the VM thresholds say "there's pages", but when you go to allocate, there aren't any in the given domain. Now, the odd situation here is that the page allocation should be "first touch round robin", so it should be falling back to allocating from another domain and only returning NULL if it couldn't find anything. Can you try this on stable/10 with MAXMEMDOM set to something in your kernel config? I'd like to see if you hit the same issue.
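To make the mismatch described above concrete, here is a minimal userland sketch in C. It is not the kernel code: `struct vm_model`, `global_free()`, `alloc_from_domain()` and `alloc_page()` are hypothetical names invented for illustration. It only models the idea that a global free-page check can pass while the preferred domain is empty, so the allocator has to fall back to other domains and the caller still has to handle failure.

```c
#include <stddef.h>

#define NDOMAINS 2	/* assumption: a two-domain machine, as in the report */

/* Hypothetical model: one global threshold, per-domain free page counts. */
struct vm_model {
	int free_reserved;		/* global threshold, like v_free_reserved */
	int dom_free[NDOMAINS];		/* free pages available in each domain */
};

/* The global free count is simply the sum over all domains. */
static int
global_free(const struct vm_model *vm)
{
	int total = 0;

	for (int d = 0; d < NDOMAINS; d++)
		total += vm->dom_free[d];
	return (total);
}

/* Try a single domain; return 1 on success, 0 if that domain is empty. */
static int
alloc_from_domain(struct vm_model *vm, int domain)
{
	if (vm->dom_free[domain] == 0)
		return (0);
	vm->dom_free[domain]--;
	return (1);
}

/*
 * Allocate one page, preferring 'preferred' and falling back to the
 * remaining domains in order. Returns the domain used, or -1 when the
 * global check fails or every domain is empty. The caller must check
 * for -1 instead of assuming the global check guarantees success.
 */
static int
alloc_page(struct vm_model *vm, int preferred)
{
	if (global_free(vm) <= vm->free_reserved)
		return (-1);
	for (int i = 0; i < NDOMAINS; i++) {
		int d = (preferred + i) % NDOMAINS;

		if (alloc_from_domain(vm, d))
			return (d);
	}
	return (-1);
}
```

With `dom_free = { 0, 4 }` and `free_reserved = 1`, the global check passes even though domain 0 has nothing to give; only the fallback loop lets the request succeed from domain 1.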
No problems on stable/10:

$ uname -a
FreeBSD t1.osted.lan 10.2-STABLE FreeBSD 10.2-STABLE #0 r290387: Thu Nov 5 11:03:23 CET 2015 pho@t1.osted.lan:/usr/src/sys/amd64/compile/MAXMEMDOM amd64
$ /usr/bin/time -h ./maxmemdom.sh
	8m8,92s real		0,12s user		5m10,15s sys
$ /usr/bin/time -h ./maxmemdom.sh
	9m19,95s real		0,23s user		5m13,58s sys
$ sysctl vm.ndomains
vm.ndomains: 2
$
ok. Well, there have been VM changes too between 10 and head. What's the output of "sysctl vm.default_policy" ? You can set it to "rr" (for round-robin) and retry. At that point it should be mirroring the existing default behaviour in stable/10 (which with NUMA enabled is round-robin).
OK, I bet my first-touch iterator is biting me. It doesn't skip over the first-touch domain, so it's possible that you've hit a situation where domain 'n' fails allocation and the per-thread round-robin domain value is also 'n'. We'll just have to fix the round-robin iterator routine to take a domain to 'skip' over (and have it check that there isn't just a single domain (0), in which case it would get stuck skipping over that). :-) I don't have any NUMA boxes handy atm but I'll try to come up with a patch to test. Thanks! -a
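The iterator change proposed above could look roughly like the following userland sketch. This is not the patch that was eventually committed; `struct rr_iter` and `rr_next_skip()` are hypothetical names, and the single-domain guard is one possible way to avoid getting stuck when the only domain is also the one to skip.

```c
/* Hypothetical per-thread round-robin cursor over memory domains. */
struct rr_iter {
	int next;	/* next domain to hand out */
	int ndomains;	/* number of memory domains, assumed >= 1 */
};

/*
 * Return the next domain in round-robin order, stepping past 'skip'
 * (pass -1 to skip nothing). With a single domain the skip request is
 * ignored, so the iterator cannot get stuck skipping the only domain.
 */
static int
rr_next_skip(struct rr_iter *it, int skip)
{
	int d = it->next;

	it->next = (it->next + 1) % it->ndomains;
	if (d == skip && it->ndomains > 1) {
		/* The cursor landed on the failed domain: take the next one. */
		d = it->next;
		it->next = (it->next + 1) % it->ndomains;
	}
	return (d);
}
```

The idea is that when first-touch allocation in domain 'n' fails, the fallback passes 'n' as `skip`, so the round-robin cursor never hands back the domain that has just been tried.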
Can't be In Progress without an Assignee. Can we get current/proposed patches added as attachments, please?
I tested the patch committed as r293640. The panic reported is no longer seen.
Cannot reproduce the problem any more.