Created attachment 182301 [details]
dmesg.boot after panic

This is an ASRock Q2900 motherboard (http://www.asrock.com/mb/intel/q2900-itx/). I had similar panics with 10-STABLE and 11-STABLE and in the end replaced both memory banks with new SO-DIMMs (Kingston KVR16LS11K2, a 2-kit that is on their supported memory list), yet still hit this panic, so I am ruling out faulty memory sticks. The problem only seems to occur under heavy load, say compiling ports with poudriere or building world.

Panic details:

panic: vm_radix_remove: invalid key found
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0455872360
vpanic() at vpanic+0x186/frame 0xfffffe04558723e0
panic() at panic+0x43/frame 0xfffffe0455872440
vm_radix_remove() at vm_radix_remove+0x200/frame 0xfffffe0455872450
vm_page_free_toq() at vm_page_free_toq+0x56/frame 0xfffffe0455872470
vm_object_page_remove() at vm_object_page_remove+0x145/frame 0xfffffe04558724d0
vnode_pager_setsize() at vnode_pager_setsize+0xfd/frame 0xfffffe0455872520
trunc_pages() at trunc_pages+0x126/frame 0xfffffe0455872580
softdep_setup_freeblocks() at softdep_setup_freeblocks+0x9d6/frame 0xfffffe0455872600
ffs_truncate() at ffs_truncate+0x32e/frame 0xfffffe0455872760
ufs_inactive() at ufs_inactive+0x1c7/frame 0xfffffe04558727a0
VOP_INACTIVE_APV() at VOP_INACTIVE_APV+0x89/frame 0xfffffe04558727d0
vinactive() at vinactive+0xf2/frame 0xfffffe0455872830
vputx() at vputx+0x2c5/frame 0xfffffe0455872890
kern_unlinkat() at kern_unlinkat+0x22a/frame 0xfffffe0455872a60
amd64_syscall() at amd64_syscall+0x6c4/frame 0xfffffe0455872bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0455872bf0
--- syscall (10, FreeBSD ELF64, sys_unlink), rip = 0x8009756aa, rsp = 0x7fffffffe718, rbp = 0x7fffffffe7e0 ---
KDB: enter: panic

(kgdb) bt
#0  doadump (textdump=0) at pcpu.h:222
#1  0xffffffff8030b2ab in db_dump (dummy=<value optimized out>, dummy2=<value optimized out>, dummy3=<value optimized out>, dummy4=<value optimized out>) at /usr/src/sys/ddb/db_command.c:533
#2  0xffffffff8030b09f in db_command (cmd_table=<value optimized out>) at /usr/src/sys/ddb/db_command.c:440
#3  0xffffffff8030add4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:493
#4  0xffffffff8030de8f in db_trap (type=<value optimized out>, code=<value optimized out>) at /usr/src/sys/ddb/db_main.c:248
#5  0xffffffff8054b833 in kdb_trap (type=3, code=-61456, tf=<value optimized out>) at /usr/src/sys/kern/subr_kdb.c:654
#6  0xffffffff807ed79f in trap (frame=0xfffffe0455872290) at /usr/src/sys/amd64/amd64/trap.c:537
#7  0xffffffff807d3111 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236
#8  0xffffffff8054af6b in kdb_enter (why=0xffffffff808f0e3a "panic", msg=<value optimized out>) at cpufunc.h:63
#9  0xffffffff80509303 in vpanic (fmt=<value optimized out>, ap=0xfffffe0455872420) at /usr/src/sys/kern/kern_shutdown.c:752
#10 0xffffffff80509153 in panic (fmt=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:690
#11 0xffffffff807c88c0 in vm_radix_remove (rtree=0x0, index=18446741893301083792) at uma.h:364
#12 0xffffffff807bcd56 in vm_page_free_toq (m=0xfffff804385afe30) at /usr/src/sys/vm/vm_page.c:1287
#13 0xffffffff807b9055 in vm_object_page_remove (object=<value optimized out>, start=<value optimized out>, end=119283, options=<value optimized out>) at /usr/src/sys/vm/vm_object.c:1948
#14 0xffffffff807cbbcd in vnode_pager_setsize (vp=<value optimized out>, nsize=0) at /usr/src/sys/vm/vnode_pager.c:425
#15 0xffffffff8076edd6 in trunc_pages (ip=<value optimized out>, length=0, extblocks=<value optimized out>, flags=<value optimized out>) at /usr/src/sys/ufs/ffs/ffs_softdep.c:7015
#16 0xffffffff8076fb56 in softdep_setup_freeblocks (ip=0xfffff8017d859000, length=<value optimized out>, flags=<value optimized out>) at /usr/src/sys/ufs/ffs/ffs_softdep.c:6981
#17 0xffffffff807599ce in ffs_truncate (vp=<value optimized out>, length=<value optimized out>, flags=<value optimized out>, cred=<value optimized out>) at /usr/src/sys/ufs/ffs/ffs_inode.c:393
#18 0xffffffff8078d507 in ufs_inactive (ap=<value optimized out>) at /usr/src/sys/ufs/ufs/ufs_inode.c:131
#19 0xffffffff808804b9 in VOP_INACTIVE_APV (vop=<value optimized out>, a=0xfffffe04558727e8) at vnode_if.c:1955
#20 0xffffffff805d1022 in vinactive (vp=0xfffff8019851c1d8, td=0xfffff802e2097560) at vnode_if.h:807
#21 0xffffffff805d1555 in vputx (vp=0xfffff8019851c1d8, func=2) at /usr/src/sys/kern/vfs_subr.c:2715
#22 0xffffffff805db81a in kern_unlinkat (td=0xfffff802e2097560, fd=-100, path=0x800e5d908 <Address 0x800e5d908 out of bounds>, pathseg=UIO_USERSPACE, oldinum=<value optimized out>) at /usr/src/sys/kern/vfs_syscalls.c:1791
#23 0xffffffff807eea74 in amd64_syscall (td=0xfffff802e2097560, traced=0) at subr_syscall.c:135
#24 0xffffffff807d33fb in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396
#25 0x00000008009756aa in ?? ()

(kgdb) fr 11
#11 0xffffffff807c88c0 in vm_radix_remove (rtree=0x0, index=18446741893301083792) at uma.h:364
364             uma_zfree_arg(zone, item, NULL);
(kgdb) p/x index
$2 = 0xfffffe0455871e90
(kgdb) fr 12
#12 0xffffffff807bcd56 in vm_page_free_toq (m=0xfffff804385afe30) at /usr/src/sys/vm/vm_page.c:1287
1287            vm_radix_remove(&object->rtree, m->pindex);
(kgdb) p (struct vm_page)m
$7 = {plinks = {q = {tqe_next = 0xfffff804385afe30, tqe_prev = 0x0}, s = {ss = {sle_next = 0xfffff804385afe30}, pv = 0x0},
    memguard = {p = 18446735295741886000, v = 0}}, listq = {tqe_next = 0xfffff802867fed68, tqe_prev = 0xfffff804385afe98},
    object = 0xfffffe0455872440, pindex = 18446744071567348051, phys_addr = 18446741893301085264,
    md = {pv_list = {tqh_first = 0xffffffff8092c3ed, tqh_last = 0xe}, pv_gen = 979041840, pat_mode = -2044},
    wire_count = 2178760728, busy_lock = 4294967295, hold_count = 3, flags = 0, aflags = 0 '\0', oflags = 0 '\0',
    queue = 0 '\0', psind = 0 '\0', segind = 16 '\020', order = 0 '\0', pool = 0 '\0', act_count = 0 '\0',
    valid = 48 '0', dirty = 0 '\0'}
(kgdb) p/x 18446735295741886000
$8 = 0xfffff804385afe30
(kgdb) p/x 18446744071567348051
$9 = 0xffffffff80509153
(kgdb) p/x 18446741893301085264
$10 = 0xfffffe0455872450
(kgdb) fr 13
#13 0xffffffff807b9055 in vm_object_page_remove (object=<value optimized out>, start=<value optimized out>, end=119283, options=<value optimized out>) at /usr/src/sys/vm/vm_object.c:1948
1948                    vm_page_free(p);
(kgdb) print (struct vm_page)p
$5 = {plinks = {q = {tqe_next = 0xfffff804385afe98, tqe_prev = 0xfffffe0455872440}, s = {ss = {sle_next = 0xfffff804385afe98},
      pv = 0xfffffe0455872440}, memguard = {p = 18446735295741886104, v = 18446741893301085248}},
    listq = {tqe_next = 0xffffffff80509153, tqe_prev = 0xfffffe0455872450}, object = 0xffffffff8092c3ed, pindex = 14,
    phys_addr = 18446735295775440432, md = {pv_list = {tqh_first = 0xffffffff81dd4018, tqh_last = 0x3}, pv_gen = 16,
    pat_mode = 48}, wire_count = 1434920016, busy_lock = 4294966788, hold_count = 9200, flags = 21895, aflags = 4 '\004',
    oflags = 254 '', queue = 255 '', psind = -1 '', segind = 0 '\0', order = 0 '\0', pool = 0 '\0', act_count = 0 '\0',
    valid = 0 '\0', dirty = 0 '\0'}
(kgdb) p/x 18446735295741886104
$14 = 0xfffff804385afe98
The page in frame 12 looks like garbage: the wire count practically cannot be that large, and the pat mode also looks wrong. This might be the cause of the panic you see, or it might be due to gdb troubles. Did you compile your kernel with INVARIANTS and WITNESS? If not, enable these options and try to reproduce the problem; I wonder if it would trigger something else faster. Also, do you use any third-party modules?
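For anyone following along, these debugging options are enabled via a custom kernel configuration. A minimal sketch (the config name DEBUGKERNEL is made up; INVARIANT_SUPPORT and WITNESS_SKIPSPIN are the companion options commonly paired with these in GENERIC-derived configs):

```
include GENERIC
ident   DEBUGKERNEL

options INVARIANTS         # run-time consistency checks and panics on violation
options INVARIANT_SUPPORT  # support code required by INVARIANTS
options WITNESS            # lock-order verification
options WITNESS_SKIPSPIN   # skip spin-lock checks to reduce overhead
```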
(In reply to Konstantin Belousov from comment #1)

No third-party modules:

% kldstat
Id Refs Address            Size     Name
 1   25 0xffffffff80200000 1e08780  kernel
 2    1 0xffffffff82221000 593e     fdescfs.ko
 3    1 0xffffffff82227000 6683     nullfs.ko
 4    1 0xffffffff8222e000 a865     linprocfs.ko
 5    3 0xffffffff82239000 7b44     linux_common.ko
 6    1 0xffffffff82241000 bb08     tmpfs.ko
 7    1 0xffffffff8224d000 41d65    linux.ko
 8    1 0xffffffff8228f000 3bea5    linux64.ko

I had INVARIANTS and WITNESS on under 10-STABLE, but turned them off during my upgrade to 11-STABLE. I'll turn them on and run another poudriere llvm build to see if I can trigger it again.

The wire_count is the wired page count statistic, I assume?
(In reply to Jeroen Ruigrok van der Werven from comment #2)

It is a counter of the number of times the page is currently wired. Each user must eventually unwire the page.
Of course, with INVARIANTS and WITNESS now in the kernel, the system stays stable. Maybe I should remove WITNESS at some point and run with just INVARIANTS to see how it behaves, and perhaps eventually ditch INVARIANTS as well. If it stays stable then it must have been a fluke.
Will test this week. Haven't had time yet.