I experienced this crash on riscv (while doing a makeworld -j8), but I'm told it has been observed on amd64, too. This is the panic:

panic: vm_reserv_depopulate: reserv 0xffffffd3e672c560's popmap[208] is clear
cpuid = 1
time = 1629858147
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x38
kdb_backtrace() at kdb_backtrace+0x2c
vpanic() at vpanic+0x148
panic() at panic+0x2a
vm_reserv_free_page() at vm_reserv_free_page+0x37c
vm_page_free_prep() at vm_page_free_prep+0x168
vm_page_free_toq() at vm_page_free_toq+0x18
vm_page_free() at vm_page_free+0x18
vm_object_terminate() at vm_object_terminate+0xec
vm_object_deallocate() at vm_object_deallocate+0x2a6
vm_map_process_deferred() at vm_map_process_deferred+0x9c
vm_map_remove() at vm_map_remove+0xd2
vmspace_exit() at vmspace_exit+0x10e
exit1() at exit1+0x4aa
sys_sys_exit() at sys_sys_exit+0x10
do_trap_user() at do_trap_user+0x208
cpu_exception_handler_user() at cpu_exception_handler_user+0x72

The dump files can be fetched at https://nextcloud.towernet.ca/s/HiJ654HLWnbT6jD
Regarding amd64: Google returns 5 hits (I assume this bug report will shortly appear too) for `"vm_reserv_depopulate" "is clear"`. All but one are just the source code; the remaining one is this 5-year-old GitHub Gist, https://gist.github.com/nomadlogic/ba58e8fd01267fbf7a2fa4fcee29e2f7, from FreeBSD 12.0-CURRENT on amd64 using the old freebsd-base-graphics tree. Looking at vm_reserv itself, whilst a caller can do stupid things and potentially cause crashes, my initial reading is that this KASSERT should be impossible to trip no matter what the caller is doing.
The vmcore file on its own isn't useful without a copy of the corresponding kernel (/boot/kernel) and debug files (/usr/lib/debug/boot/kernel). It would be useful to see a dump of the vm_page and reservation in question:

(kgdb) frame 14
(kgdb) p/x *m
(kgdb) p/x *(vm_reserv_t)0xffffffd3e672c560
I will get on this ... but it might be tomorrow. I will run those commands _and_ I will upload those files.
(kgdb) frame 14
#14 0xffffffc0005a8da0 in vm_page_free_prep (m=0xffffffd3f1012168) at /usr/src/sys/vm/vm_page.c:3842
warning: Source file is more recent than executable.
3842            if ((m->flags & PG_PCPU_CACHE) == 0 && vm_reserv_free_page(m))
(kgdb) p/x *m
$1 = {plinks = {q = {tqe_next = 0xffffffd3f10121d0, tqe_prev = 0xffffffd3f1012100}, s = {ss = {
        sle_next = 0xffffffd3f10121d0}}, memguard = {p = 0xffffffd3f10121d0, v = 0xffffffd3f1012100}, uma = {
      slab = 0xffffffd3f10121d0, zone = 0xffffffd3f1012100}},
  listq = {tqe_next = 0xffffffd3f10121d0, tqe_prev = 0xffffffd3f1012110},
  object = 0x0, pindex = 0x2d0, phys_addr = 0x21f2d0000,
  md = {pv_list = {tqh_first = 0x0, tqh_last = 0xffffffd3f10121a0}, pv_gen = 0xf, pv_memattr = 0x2},
  ref_count = 0x0, busy_lock = 0xfffffffe,
  a = {{flags = 0x18, queue = 0x1, act_count = 0x5}, _bits = 0x5010018},
  order = 0xc, pool = 0x0, flags = 0x0, oflags = 0x0, psind = 0x0, segind = 0x1, valid = 0x0, dirty = 0x0}
(kgdb) p/x *(vm_reserv_t)0xffffffd3e672c560
$2 = {lock = {lock_object = {lo_name = 0xffffffc00066006c, lo_flags = 0x1030000, lo_data = 0x0,
      lo_witness = 0xffffffd3ffd8e180}, mtx_lock = 0xffffffc2227f7100},
  partpopq = {tqe_next = 0xffffffd3e6756fe0, tqe_prev = 0xffffffd3e679c240},
  objq = {le_next = 0xffffffd3e67b04a0, le_prev = 0xffffffd0a46be0c0},
  object = 0xffffffd0a46be000, pindex = 0x200, pages = 0xffffffd3f100cce8,
  popcnt = 0xef, domain = 0x0, inpartpopq = 0x1, lasttick = 0xa9d59012,
  popmap = {0x0, 0x0, 0x0, 0xffffffffff000000, 0xfffffffffffe040f, 0x24fc0ffffc925927,
    0xffffff0847fc9249, 0x1fffffffffffffff}}
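
As a cross-check of the dump above, here is a minimal Python sketch that recomputes which popmap bit the page in frame 14 corresponds to. The constants are copied from the kgdb output. The 4 KiB page size, the 64-bit popmap words, and the idea that the page's slot in the reservation can be derived from either pindex or phys_addr are assumptions on my part, not something stated in this report.

```python
# Values copied from the kgdb output above.
M_PINDEX = 0x2d0            # m->pindex
M_PHYS_ADDR = 0x21f2d0000   # m->phys_addr
RV_PINDEX = 0x200           # rv->pindex
POPMAP = [
    0x0, 0x0, 0x0,
    0xffffffffff000000,
    0xfffffffffffe040f,
    0x24fc0ffffc925927,
    0xffffff0847fc9249,
    0x1fffffffffffffff,
]

# Assumptions (not stated in the thread): 4 KiB base pages and
# 64-bit popmap words, so 8 words cover a 512-page reservation.
PAGE_SHIFT = 12
BITS_PER_WORD = 64
PAGES_PER_RESERV = len(POPMAP) * BITS_PER_WORD

# Two independent ways to derive the page's slot in the reservation.
index_from_pindex = M_PINDEX - RV_PINDEX
index_from_phys = (M_PHYS_ADDR >> PAGE_SHIFT) % PAGES_PER_RESERV

# Locate the corresponding bit in the popmap.
word, bit = divmod(index_from_pindex, BITS_PER_WORD)
bit_is_set = bool((POPMAP[word] >> bit) & 1)

print(f"index from pindex:    {index_from_pindex}")
print(f"index from phys_addr: {index_from_phys}")
print(f"popmap word {word}, bit {bit}: {'set' if bit_is_set else 'clear'}")
```

Under those assumptions both derivations give index 208, which lands on word 3, bit 16, and that bit is clear in the dumped popmap. So the dump agrees with the panic message, and the page really does look unpopulated from the reservation's point of view.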
https://termbin.com/q8g9 for that last bit. Heh... my standard window is 120 these days.
I tarred /boot/kernel and /usr/lib/debug/boot/kernel into the nextcloud directory. You can fetch them from the same place (https://nextcloud.towernet.ca/s/wPpj7zgxgDBAZ6q)
(In reply to dgilbert from comment #6)
Thank you. Is the panic reproducible at all?

(In reply to dgilbert from comment #4)
I don't see anything obviously inconsistent, except:

popcnt(0xffffffffff000000) + popcnt(0xfffffffffffe040f) + popcnt(0x24fc0ffffc925927) + popcnt(0xffffff0847fc9249) + popcnt(0x1fffffffffffffff) = 231

and rv->popcnt = 239...
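
The population-count sum quoted above can be reproduced with a short Python sketch. The popmap words and rv->popcnt are copied from the kgdb dump in comment #4; the helper name popcount is just an illustration.

```python
# popmap words and popcnt copied from the reservation dump (comment #4).
POPMAP = [
    0x0, 0x0, 0x0,
    0xffffffffff000000,
    0xfffffffffffe040f,
    0x24fc0ffffc925927,
    0xffffff0847fc9249,
    0x1fffffffffffffff,
]
RV_POPCNT = 0xef  # rv->popcnt


def popcount(word: int) -> int:
    """Return the number of set bits in a non-negative integer."""
    return bin(word).count("1")


per_word = [popcount(w) for w in POPMAP]
bits_set = sum(per_word)
missing = RV_POPCNT - bits_set

for i, (w, n) in enumerate(zip(POPMAP, per_word)):
    print(f"popmap[{i}] = {w:#018x}  bits set: {n}")
print(f"bits set in popmap: {bits_set}")
print(f"rv->popcnt:         {RV_POPCNT}")
print(f"difference:         {missing}")
```

This gives 231 bits set against a popcnt of 239, so the counter claims 8 more populated pages than the bitmap records.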
make -j8 buildworld produced that ... but it's the only time it happened to me. make -j4 subsequently passed. There are 4 processors on the box. I can run a few more make -j8 on it. Question, tho: can I upgrade to the security patches ... or should I continue to test on this week-or-two old version?
(In reply to dgilbert from comment #8) I don't see any problem with updating first.
Here's what I've found so far. If I make -j8 with ccache full of answers, we're fine. If I make -j4, we're fine. If I make -j8 with ccache empty (but being filled), then 3 out of 4 buildworld runs (so far) have ended in a random crash and one in the panic you're looking at. The code does compile (at -j4). The system does have ZFS running on an NVMe drive. I'm going to keep trying to trigger the panic, but my feeling is that the panic is only one of the possible outcomes of the underlying error.
(In reply to dgilbert from comment #10) It would be useful to see at least the panic message and stack trace from the other panics you've hit.
The count is one panic and 3 crashes of 4 total attempts at -j8.