Bug 248500

Summary: cam_sim_free() sleeping forever
Product: Base System
Reporter: Bjoern A. Zeeb <bz>
Component: usb
Assignee: freebsd-usb (Nobody) <usb>
Status: New ---
Severity: Affects Some People
CC: bsdimp, cem, grahamperrin, imp
Priority: ---
Keywords: cam
Version: CURRENT   
Hardware: amd64   
OS: Any   

Description Bjoern A. Zeeb freebsd_committer freebsd_triage 2020-08-06 14:33:09 UTC
Hi,

I have a system whose USB is locked up because cam_sim_free() seems to sleep forever (at least 24 hours so far):

# procstat -akk | grep -i usb
   15 100045 usb                 usbus0              mi_switch+0xc1 _cv_wait+0xf2 usb_process+0x101 fork_exit+0x7e fork_trampoline+0xe
   15 100046 usb                 usbus0              mi_switch+0xc1 _cv_wait+0xf2 usb_process+0x101 fork_exit+0x7e fork_trampoline+0xe
   15 100047 usb                 usbus0              mi_switch+0xc1 _cv_wait+0xf2 usb_process+0x101 fork_exit+0x7e fork_trampoline+0xe
   15 100048 usb                 usbus0              mi_switch+0xc1 _sleep+0x1cb cam_sim_free+0x7e umass_detach+0xd8 device_detach+0x185 device_delete_child+0x15 usb_detach_device+0x18f usb_unconfigure+0x2b usb_free_device+0x11d uhub_explore+0x2ad usb_bus_explore+0x13e usb_process+0x13b fork_exit+0x7e fork_trampoline+0xe
   15 100049 usb                 usbus0              mi_switch+0xc1 _cv_wait+0xf2 usb_process+0x101 fork_exit+0x7e fork_trampoline+0xe


0x2be is in cam_sim_free (/usr/src/head.svn/sys/cam/cam_sim.c:142).
137             if (sim->refcount > 0) {
138                     error = msleep(sim, mtx, PRIBIO, "simfree", 0);
139                     KASSERT(error == 0, ("invalid error value for msleep(9)"));
140             }
141             KASSERT(sim->refcount == 0, ("sim->refcount == 0"));
142             if (sim->mtx == NULL)
143                     mtx_unlock(mtx);
144
145             if (free_devq)
146                     cam_simq_free(sim->devq);


What should wake us up in this case?  Was there a race somewhere?  Any ideas?
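
For reference, a minimal single-threaded model of the accounting that the msleep() above appears to depend on. It is only a sketch, not the kernel code: the model_* names are hypothetical, and it assumes that the free path drops the owner's reference just before line 137 and that the matching wakeup(sim) is issued by whoever releases the last remaining hold. Under those assumptions, a hold that is never released means the sleeper is never woken.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct cam_sim; only the counter matters here. */
struct model_sim {
	int refcount;	/* outstanding references, including the owner's */
	int wakeups;	/* times a sleeper on this sim would have been woken */
};

/* Take an extra reference (assumed analogue of a hold on the sim). */
static void
model_hold(struct model_sim *sim)
{
	sim->refcount++;
}

/* Drop a reference; the last release is the only place a wakeup happens. */
static void
model_release(struct model_sim *sim)
{
	assert(sim->refcount >= 1);
	sim->refcount--;
	if (sim->refcount == 0)
		sim->wakeups++;	/* stands in for wakeup(sim) */
}

/*
 * Model the free path: drop the owner's reference, and if other holds
 * remain, "sleep" until a release delivers a wakeup.  Returns true if the
 * free path would finish, false if it would sleep forever.
 */
static bool
model_free_completes(int holds, int releases)
{
	struct model_sim sim = { .refcount = 1, .wakeups = 0 };

	for (int i = 0; i < holds; i++)
		model_hold(&sim);

	sim.refcount--;			/* owner's reference (assumption) */
	if (sim.refcount == 0)
		return (true);		/* nothing outstanding, no sleep */

	for (int i = 0; i < releases; i++)
		model_release(&sim);

	return (sim.wakeups > 0);	/* woken only by the last release */
}
```

In this model, model_free_completes(2, 2) finishes, while model_free_completes(2, 1) never does, which matches the symptom of a leaked reference.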
Comment 1 Conrad Meyer freebsd_committer freebsd_triage 2020-08-06 15:33:04 UTC
I don't remember if it was in cam_sim_free, but I've experienced a similar refcount leak when removing a USB device in the past.  Eventually something else blocks on one of the deadlocks and the system dies.

I did some cursory debugging, but didn't identify the problem: https://reviews.freebsd.org/P300
Comment 2 Warner Losh freebsd_committer freebsd_triage 2022-10-19 17:47:13 UTC
Since this bug was filed, I've fixed at least one refcount bug. Does it still happen in -current?