The CAM xpt_done_td thread is marked THREAD_NO_SLEEPING. This is a problem because the AC_FOUND_DEVICE async event handlers still call functions such as disk_alloc() and devstat_alloc(), which call malloc() with M_WAITOK. Those allocations can sleep, and sleeping in a no-sleeping context panics the kernel. The problem is easier to spot if you add an assertion to malloc() that checks for M_WAITOK together with THREAD_CAN_SLEEP(), and then remove and re-add a device at run time. At least with mps, the initial device creation works because it runs from dainit() in an intr config hook. I'll attach a patch with the assertion that highlights the problem.
Created attachment 147387: malloc M_WAITOK patch to assert THREAD_CAN_SLEEP
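The attachment itself is not reproduced in this thread. As a rough illustration of the check described above, here is a userland model in C. It is only a sketch: the model_* names and the plain counter standing in for the per-thread no-sleeping state are hypothetical, and the real patch would use KASSERT and THREAD_CAN_SLEEP() inside the kernel's malloc().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag values modeled on malloc(9); treat them as illustrative here. */
#define M_NOWAIT 0x0001
#define M_WAITOK 0x0002

/*
 * Hypothetical stand-in for the per-thread "no sleeping" nesting count.
 * In the kernel this state lives in the thread structure.
 */
int model_no_sleeping_depth;

/* Models entering a no-sleeping section (what xpt_done_td is marked with). */
void model_no_sleeping_begin(void)
{
    model_no_sleeping_depth++;
}

/* Models leaving a no-sleeping section. */
void model_no_sleeping_end(void)
{
    assert(model_no_sleeping_depth > 0);
    model_no_sleeping_depth--;
}

/* Models THREAD_CAN_SLEEP(): true only outside every no-sleeping section. */
bool model_can_sleep(void)
{
    return model_no_sleeping_depth == 0;
}

/*
 * The condition the proposed assertion would enforce:
 * an M_WAITOK request is only legal when the thread may sleep.
 */
bool model_malloc_flags_ok(int flags)
{
    return (flags & M_WAITOK) == 0 || model_can_sleep();
}
```

In this model, an M_WAITOK request made between model_no_sleeping_begin() and model_no_sleeping_end() fails the check, which corresponds to disk_alloc() or devstat_alloc() being reached from xpt_done_td. An M_NOWAIT request passes in both contexts.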
I'm testing out the patch in an 11-CURRENT VM to make sure it doesn't break the VM case at least. I'll send it out for review to alc/kib.
Actually, I'm going to put the bug back in Needs Triage state, because the patch above only makes the issue apparent. The larger issue Scott brought up still needs to be fixed.
Kib had some feedback on the assert:
1. We should also add it (and the interrupt check) to uma_zalloc_arg() (through one inline function).
2. The interrupt assert may be wrong, since it is not OK to malloc(9) in an interrupt, regardless of the flags.

Isilon's internal discussion was that we should add a debug stack output rather than an assert until all major cases are fixed.
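To make the feedback concrete, here is a hedged userland sketch in C of one shared inline check used by two allocator entry points, which warns instead of asserting. Everything named sketch_* is hypothetical and stands in for kernel state; the warning is a plain fprintf() where the kernel would print a stack trace. It assumes the interrupt check fires regardless of flags, per point 2.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Flag values modeled on malloc(9); treat them as illustrative here. */
#define M_NOWAIT 0x0001
#define M_WAITOK 0x0002

/* Hypothetical stand-ins for kernel context state. */
int sketch_no_sleeping;      /* nonzero: thread is in a no-sleeping section */
int sketch_in_interrupt;     /* nonzero: running in interrupt context */
unsigned sketch_warnings;    /* number of warnings emitted so far */

/*
 * One shared check for every allocator entry point.
 * It warns and continues rather than panicking, so that known
 * offenders do not take the system down while they are being fixed.
 */
static inline bool sketch_alloc_check(const char *who, int flags)
{
    if (sketch_in_interrupt) {
        /* Not allowed in an interrupt, whatever the flags are. */
        sketch_warnings++;
        fprintf(stderr, "%s: called in interrupt context\n", who);
        return false;
    }
    if ((flags & M_WAITOK) != 0 && sketch_no_sleeping != 0) {
        sketch_warnings++;
        fprintf(stderr, "%s(M_WAITOK) in no_sleeping context\n", who);
        return false;
    }
    return true;
}

/* Models malloc(9): run the shared check, then allocate anyway. */
void *sketch_malloc(size_t size, int flags)
{
    (void)sketch_alloc_check("malloc", flags);
    return malloc(size);
}

/* Models uma_zalloc_arg(): same check, zeroed allocation. */
void *sketch_zalloc(size_t size, int flags)
{
    (void)sketch_alloc_check("uma_zalloc_arg", flags);
    return calloc(1, size);
}
```

Routing both entry points through the same helper keeps the two checks from drifting apart, and the warn-and-continue behavior matches the suggestion to hold off on a hard assert until the major cases are fixed.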
The patch Scott provided looked OK (it doesn't panic on boot in simple cases with a VM), but I didn't get an opportunity to test it more extensively. I haven't yet tried a patch that incorporates the feedback noted in comment #4.
Trace from xpt_done_td from pulling a device out of the system:

KASSERT failed: malloc(M_WAITOK) in no_sleeping context
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe349829a340
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe349829a3f0
_kassert_panic() at _kassert_panic+0xd7/frame 0xfffffe349829a470
malloc() at malloc+0x2e4/frame 0xfffffe349829a4c0
g_post_event_x() at g_post_event_x+0x84/frame 0xfffffe349829a510
g_post_event() at g_post_event+0x5d/frame 0xfffffe349829a580
adacleanup() at adacleanup+0x62/frame 0xfffffe349829a5a0
cam_periph_release_locked_buses() at cam_periph_release_locked_buses+0xde/frame 0xfffffe349829aaa0
cam_periph_release_locked() at cam_periph_release_locked+0x1b/frame 0xfffffe349829aac0
adadone() at adadone+0x26e/frame 0xfffffe349829ab20
xpt_done_process() at xpt_done_process+0x3a4/frame 0xfffffe349829ab60
xpt_done_td() at xpt_done_td+0x136/frame 0xfffffe349829abb0
fork_exit() at fork_exit+0x84/frame 0xfffffe349829abf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe349829abf0
https://reviews.freebsd.org/D829 - KASSERT_WARN
https://reviews.freebsd.org/D830 - Use KASSERT_WARN in malloc(9) and uma_zalloc_arg(9)