Bug 256296 - "No Device" error after using asynchronous API of libusb once
Summary: "No Device" error after using asynchronous API of libusb once
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: usb (show other bugs)
Version: Unspecified
Hardware: Any Any
: --- Affects Some People
Assignee: freebsd-usb (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-05-31 15:24 UTC by Sergii
Modified: 2021-06-11 16:42 UTC (History)
2 users (show)

See Also:


Attachments
Bug reproducer (1.10 KB, text/plain)
2021-05-31 21:10 UTC, Sergii
no flags Details
Better bug reproducer (1.37 KB, text/plain)
2021-06-05 22:30 UTC, Sergii
no flags Details
libusb patch that fixes the race (657 bytes, patch)
2021-06-08 15:23 UTC, Sergii
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Sergii 2021-05-31 15:24:47 UTC
Background information
----------------------

fwupd project is using libgusb, which is a Glib wrapper for libusb.

libgusb uses asynchronous API of libusb with events handled in a separate thread.

fwupd works with devices by opening them, performing operations and then closing.

How the failure occurs
----------------------

fwupd closes a device before all events of the operation have been processed. This means that when libusb10_handle_events_sub() eventually processes related event, the device gets marked as gone (device_is_gone field). On all future calls libusb_submit_transfer() fails with LIBUSB_ERROR_NO_DEVICE because that flag is set.

Effectively this means that a device can be used only once in such a way.

As I understand libgusb uses the API as intended and it is implementation of libusb which shouldn't prevent repeated use of a device. Maybe it should wait before closing or just reset the flag on opening the device.
Comment 1 Sergii 2021-05-31 15:27:30 UTC
Forgot to say that this happens on 12.2, but I don't think it has changed since.
Comment 2 Hans Petter Selasky freebsd_committer 2021-05-31 15:32:28 UTC
Hi,

I think this is a bug in the the application.

If you close the USB device while transfers are pending, the completion events won't be delivered. I guess you need to wait for the pending transfers to complete before you close the device.

--HPS
Comment 3 Sergii 2021-05-31 15:49:53 UTC
Hi,

I don't know what kind of event comes and might need to investigate callbacks more, but why mark device as gone in this case? It's not gone, it's just closed.
Comment 4 Tomasz "CeDeROM" CEDRO 2021-05-31 17:21:11 UTC
Hey Sergii, HPS :-) 

Sergii is it possible to open device again after closing? It should not be possible to transfer to a device that is not opened.. so LIBUSB_ERROR_NO_DEVICE may indicate that device is not opened and there is nowhere to tranfer to. Asynchronous transfers may be put on some sort of list and close function should verify the list before closing..? Can you verify with the application? :-)

In order to analyse situation in depth Wireshark can dump the USB traffic. You may want to compare Linux and FreeBSD behaviour if you suspect (low pobable) problem in the stack :-)
Comment 5 Sergii 2021-05-31 21:10:30 UTC
Created attachment 225431 [details]
Bug reproducer

> Sergii is it possible to open device again after closing? It should not be possible to transfer to a device that is not opened.. so LIBUSB_ERROR_NO_DEVICE may indicate that device is not opened and there is nowhere to tranfer to.

Yes, device can be opened. Transfer is done after opening the device, that's not the issue.

> Asynchronous transfers may be put on some sort of list and close function should verify the list before closing..? Can you verify with the application? :-)

Actually looks like the first time the error happens even without any transactions (no libusb_submit_transfer() calls). The device is opened, some info calls are done, it's closed, second thread wakes up, finds that 32 bytes are available to read and marks the device as gone.

I've made a simple example where the behaviour manifests itself in debugger almost every time. Setting breakpoints helps with that. In fwupd breakpoint on libusb_close() isn't necessary.
Comment 6 Sergii 2021-06-05 22:30:37 UTC
Created attachment 225579 [details]
Better bug reproducer

Made a better example that showcases the bug. No help from debugger is necessary. Demonstrates the issue 100% runs for me. Device is opened and closed twice. Second time its opened there is a transfer attempt which always fails (returns "no device" error).

The API is used properly as far as I can tell and the code "works" if you remove sleep() calls, but doesn't if you leave them. I'm pretty confident this is a bug in libusb.

Another suggestion for a fix: before handling an event check whether device is still in the polling queue and if not, ignore the event. This might be the best fix because the issue is that device is being polled after it was closed (background thread was blocked on poll() at that time), so any events from this device should be rightfully discarded as if it wasn't on the polling queue when the event has arrived.
Comment 7 Sergii 2021-06-08 15:23:31 UTC
Created attachment 225640 [details]
libusb patch that fixes the race

Adding a simple condition that the device must be opened to be considered gone fixes the issue. I might be missing something about detecting unexpectedly detached devices, but this way firmware update succeeds instead of failing right at the start.
Comment 8 commit-hook freebsd_committer 2021-06-11 15:08:01 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=6847ea50196f1a685be408a24f01cb8d407da19c

commit 6847ea50196f1a685be408a24f01cb8d407da19c
Author:     Hans Petter Selasky <hselasky@FreeBSD.org>
AuthorDate: 2021-06-11 15:06:10 +0000
Commit:     Hans Petter Selasky <hselasky@FreeBSD.org>
CommitDate: 2021-06-11 15:06:44 +0000

    Improve handling of USB device re-open in the LibUSB v1.x API.

    Make sure the "device_is_gone" flag is cleared after every successful open,
    so that the "device_is_gone" flag doesn't persist forever.

    Found by:       sergii.dmytruk@3mdeb.com
    PR:             256296
    MFC after:      1 week
    Sponsored by:   Mellanox Technologies // NVIDIA Networking

 lib/libusb/libusb10.c    |  9 +++++++++
 lib/libusb/libusb10_io.c | 12 ++++++++++--
 lib/libusb/libusb20.c    |  5 +++++
 3 files changed, 24 insertions(+), 2 deletions(-)
Comment 9 Hans Petter Selasky freebsd_committer 2021-06-11 15:08:25 UTC
OK, got it.

Please verify that the submitted patch works for you!

Thank you!

--HPS
Comment 10 Sergii 2021-06-11 16:42:06 UTC
Yes, patch from the commit works fine.

Thank you!