First observed with Lua and cqueue, and reproduced on both 12.0-RELEASE and -CURRENT. It was reported that this is a regression from 11 to 12.
Fairly minimal test case: an application that dlopen()s a .so linked against libthr and invokes a function in that .so that creates a new thread. The new thread appears to be created (see  for truss output), but at that point we stall without ever entering thread_start in the new thread.
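For readers without the attachment, the shape of such a test case can be sketched roughly as below. This is not the attached reproducer: the file name repro.c, the library name libplugin.so, and the functions start_thread and run_plugin are all hypothetical, and the build lines in the comment are an assumption about how one would arrange for only the .so (not the host) to be linked against libthr.

```c
/*
 * Sketch only; all names are hypothetical. Build twice from this one file:
 *   cc -shared -fPIC -DBUILD_PLUGIN -o libplugin.so repro.c -lpthread
 *   cc -DBUILD_MAIN -o repro repro.c     (host NOT linked against libthr)
 */
#include <stdio.h>

#ifdef BUILD_PLUGIN
#include <pthread.h>

/* Body of the new thread; on an affected system this is never reached. */
static void *worker(void *arg)
{
    (void)arg;
    puts("worker thread running");
    return NULL;
}

/* Exported entry point: create one thread and wait for it to finish. */
int start_thread(void)
{
    pthread_t tid;

    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;
    return pthread_join(tid, NULL) == 0 ? 0 : -1;
}
#else
#include <dlfcn.h>

/*
 * Load the shared object at `path` and call its int(void) function `symbol`.
 * Returns the function's result, or -1 if loading or symbol lookup fails.
 */
int run_plugin(const char *path, const char *symbol)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    int (*fn)(void) = (int (*)(void))dlsym(handle, symbol);
    if (fn == NULL) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    int rc = fn();
    dlclose(handle);
    return rc;
}

#ifdef BUILD_MAIN
int main(void)
{
    return run_plugin("./libplugin.so", "start_thread") == 0 ? 0 : 1;
}
#endif
#endif
```

With a layout like this, a hang of the kind described would be expected to show up as ./repro never returning from start_thread, since the pthread_join there waits on a thread that never starts running.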
CC kib@ because it looks like an rtld bug, perhaps.
A commit references this bug:
Date: Tue Jan 29 22:46:46 UTC 2019
New revision: 343566
Untangle jemalloc and mutexes initialization.
The need to use libc malloc(3) from some places in libthr always
caused issues. For instance, per-thread key allocation was switched to
use plain mmap(2) to get storage, because some third party mallocs
used keys for implementation of calloc(3).
Even more important, libthr calls calloc(3) during initialization of
pthread mutexes, and jemalloc uses pthread mutexes. Jemalloc provides
some way to both postpone the initialization, and to make
initialization to use specialized allocator, but this is very fragile
and often breaks. See the referenced PR for another example.
Add the small malloc implementation used by rtld, to libthr. Use it in
thr_spec.c and for mutexes initialization. This avoids the issues with
mutual dependencies between malloc and libthr in principle. The
drawback is that some more allocations are not interceptable for
alternate malloc implementations. There should be not too much memory
use from this allocator, and the alternative, direct use of mmap(2) is
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D18988
There's a report on the ARM list of a crash in /rescue/* on armv7 pointing at this new code. Offhand, isn't it a problem that handle_static_init calls atexit() which calls pthread_mutex_lock, before libthr's initialization is run?
link to message: