While testing new code with WITNESS enabled, I saw this panic: panic: acquiring blockable sleep lock with spinlock or critical section held (sleep mutex) 64 Bucket @ /usr/src/sys/vm/uma_dbg.c:217 cpuid = 0 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe000021c690 vpanic() at vpanic+0x182/frame 0xfffffe000021c710 kassert_panic() at kassert_panic+0x126/frame 0xfffffe000021c780 witness_checkorder() at witness_checkorder+0x13b/frame 0xfffffe000021c800 __mtx_lock_flags() at __mtx_lock_flags+0xa4/frame 0xfffffe000021c850 uma_dbg_getslab() at uma_dbg_getslab+0x42/frame 0xfffffe000021c880 uma_dbg_alloc() at uma_dbg_alloc+0x36/frame 0xfffffe000021c8b0 uma_zalloc_arg() at uma_zalloc_arg+0x53e/frame 0xfffffe000021c910 bucket_alloc() at bucket_alloc+0xa6/frame 0xfffffe000021c930 uma_zfree_arg() at uma_zfree_arg+0x293/frame 0xfffffe000021c980 free() at free+0x8b/frame 0xfffffe000021c9c0 pmc_hook_handler() at pmc_hook_handler+0xbbb/frame 0xfffffe000021ca50 thread_exit() at thread_exit+0x1b7/frame 0xfffffe000021ca90 kern_thr_exit() at kern_thr_exit+0x119/frame 0xfffffe000021cac0 sys_thr_exit() at sys_thr_exit+0x62/frame 0xfffffe000021cae0 amd64_syscall() at amd64_syscall+0x2de/frame 0xfffffe000021cbf0 Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe000021cbf0 --- syscall (431, FreeBSD ELF64, sys_thr_exit), rip = 0x40b70a, rsp = 0x7fffde9f2ee8, rbp = 0x7fffde9f2f00 --- KDB: enter: panic The free() man page promises that free() will not sleep. However, if INVARIANTS is enabled, free() may very well attempt to acquire a sleeping lock. In fact, it appears that malloc() may even attempt to acquire a sleeping lock if INVARIANTS is enabled. The problem is that uma_zalloc_arg() calls uma_dbg_alloc(). uma_dbg_alloc() calls uma_dbg_getslab(). uma_dbg_getslab() may acquire the zone lock, which is a normal (sleepable) mutex.
I think this indicates a bug in the caller rather than UMA. The nomenclature is a bit confusing: a sleep mutex is just a "default" mutex, i.e. a non-spin mutex. When a thread blocks on a sleep mutex, it enters bounded sleep; "sleep" in the free(9) man page refers to unbounded sleep. The assertion is failing because the thread holds a spin mutex or a critical section, in which case it is not valid to try and acquire a sleep mutex. It could probably be triggered in a non-INVARIANTS kernel too, since uma_zfree_arg() will attempt to acquire the corresponding zone lock, which is also a sleep mutex.
(In reply to Mark Johnston from comment #1) > The assertion is failing because the thread holds a spin mutex or a critical section, > in which case it is not valid to try and acquire a sleep mutex. It could probably be > triggered in a non-INVARIANTS kernel too, since uma_zfree_arg() will attempt to > acquire the corresponding zone lock, which is also a sleep mutex. Good point. That makes this very much look like intended behavior. I only saw this when I tried testing my changes with an unusually high load. Presumably, that caused the allocator to need to acquire the zone lock when it would normally not need to do so. I wonder how many other things have slipped through without enough testing to actually trigger the assert? Perhaps, I should propose a man page change to make this more clear. And/or add a check in malloc/free to catch these problems more reliably.
(In reply to Jonathan T. Looney from comment #2) I think it would be correct and reasonable to assert that curthread->td_critnest == 0 at the beginning of uma_zalloc_arg() and uma_zfree_arg(). In most cases, UMA can satisfy allocation and free requests without acquiring a lock, so the checking currently done by witness is not consistent.
> Perhaps, I should propose a man page change to make this more clear. > And/or add a check in malloc/free to catch these problems more reliably. Indeed, it would be useful to have this clarified in the man page, and an explicit check under INVARIANTS in malloc/free would be sensible.
I have a proposed change out for review. See https://reviews.freebsd.org/D4197.
A commit references this bug: Author: jtl Date: Thu Nov 19 14:04:54 UTC 2015 New revision: 291074 URL: https://svnweb.freebsd.org/changeset/base/291074 Log: Consistently enforce the restriction against calling malloc/free when in a critical section. uma_zalloc_arg()/uma_zalloc_free() may acquire a sleepable lock on the zone. The malloc() family of functions may call uma_zalloc_arg() or uma_zalloc_free(). The malloc(9) man page currently claims that free() will never sleep. It also implies that the malloc() family of functions will not sleep when called with M_NOWAIT. However, it is more correct to say that these functions will not sleep indefinitely. Indeed, they may acquire a sleepable lock. However, a developer may overlook this restriction because the WITNESS check that catches attempts to call the malloc() family of functions within a critical section is inconsistenly applied. This change clarifies the language of the malloc(9) man page to clarify the restriction against calling the malloc() family of functions while in a critical section or holding a spin lock. It also adds KASSERTs at appropriate points to make the enforcement of this restriction more consistent. PR: 204633 Differential Revision: https://reviews.freebsd.org/D4197 Reviewed by: markj Approved by: gnn (mentor) Sponsored by: Juniper Networks Changes: head/share/man/man9/malloc.9 head/sys/kern/kern_malloc.c head/sys/vm/uma_core.c
This was "working as intended". But, we used this opportunity to clarify that man page and add extra checks to catch these problems more reliably.