Bug 262882 - USB disconnects repeatedly, losing all attached devices on that USB hub
Status: Closed Feedback Timeout
Alias: None
Product: Base System
Classification: Unclassified
Component: usb
Version: 13.1-RELEASE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: Mark Linimon
Duplicates: 263661

Reported: 2022-03-28 11:13 UTC by Dave Cottlehuber
Modified: 2024-01-10 03:26 UTC

Attachments
dmesg (95.07 KB, text/plain)
2022-03-28 11:13 UTC, Dave Cottlehuber
output of usbconfig dump_all_desc (after reboot) (83.07 KB, text/plain)
2022-03-28 12:20 UTC, Dave Cottlehuber
after issue recurred, without debug flags (94.97 KB, text/plain)
2022-04-01 08:32 UTC, Dave Cottlehuber
with debugging flag enabled, & unplugging all the USB peripherals, finally only re-adding the mouse/keyboard (95.87 KB, text/plain)
2022-04-01 08:33 UTC, Dave Cottlehuber
before setting both sysctls but after keyboard went awol (95.89 KB, text/plain)
2022-04-01 22:28 UTC, Dave Cottlehuber
debug with both sysctls enabled=17 (95.93 KB, text/plain)
2022-04-01 22:28 UTC, Dave Cottlehuber
after switching debugging off again (95.93 KB, text/plain)
2022-04-01 22:29 UTC, Dave Cottlehuber
usb devices (audio this time) disconnected, switch to debug 17 per history before rebooting (95.93 KB, text/plain)
2022-04-18 19:13 UTC, Dave Cottlehuber
USB_DEBUG with hw.usb.xhci.debug=16 (593.54 KB, text/plain)
2022-05-01 06:12 UTC, Emanuel Haupt
pciconf -lv (6.24 KB, text/plain)
2022-05-01 06:13 UTC, Emanuel Haupt
usbconfig dump_device_desc of mouse/keyboard/hub (1.49 KB, text/plain)
2022-05-01 08:09 UTC, Emanuel Haupt
procstat -akk captured via ssh into the machine (35.77 KB, text/plain)
2022-05-02 06:30 UTC, Emanuel Haupt
Patch to test (1.60 KB, patch)
2022-05-02 12:49 UTC, Hans Petter Selasky
Patch to test (v2) (1.49 KB, patch)
2022-05-02 14:11 UTC, Hans Petter Selasky
Patch to test (v3) (725 bytes, patch)
2022-05-03 07:36 UTC, Hans Petter Selasky
Patch to test (v3 for 13-stable) (609 bytes, patch)
2022-05-03 08:12 UTC, Hans Petter Selasky
Patch to fix U1/U2 IOERROR issue (2.23 KB, patch)
2022-05-03 20:25 UTC, Hans Petter Selasky

Description Dave Cottlehuber freebsd_committer freebsd_triage 2022-03-28 11:13:52 UTC
Created attachment 232775 [details]
dmesg

happens repeatedly until keyboard/mouse are lost completely.

keyboard doesn't respond at all (numlock light doesn't toggle when pressed).

remote via ssh still works, I will see if I can get usb traces somehow.

reboot is required to fix.

very reproducible :-(

This did not happen under 13.0-RELEASE, only in 13.1-BETA2 and later.

...
[4556] uhub0: <(0x1b21) XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus3
[4556] uhub0: 4 ports with 4 removable, self powered
[4557] xhci2: Resetting controller
[4557] usb_alloc_device: set address 2 failed (USB_ERR_TIMEOUT, ignored)
[4584] usbd_setup_device_desc: getting device descriptor at addr 2 failed, USB_ERR_TIMEOUT
[4585] usbd_req_re_enumerate: addr=2, set address failed! (USB_ERR_TIMEOUT, ignored)
[4611] usbd_setup_device_desc: getting device descriptor at addr 2 failed, USB_ERR_TIMEOUT
[4613] usbd_req_re_enumerate: addr=2, set address failed! (USB_ERR_TIMEOUT, ignored)
[4632] usbd_setup_device_desc: getting device descriptor at addr 2 failed, USB_ERR_TIMEOUT
[4633] usbd_req_re_enumerate: addr=2, set address failed! (USB_ERR_TIMEOUT, ignored)
[4657] usbd_setup_device_desc: getting device descriptor at addr 2 failed, USB_ERR_TIMEOUT
[4659] usbd_req_re_enumerate: addr=2, set address failed! (USB_ERR_TIMEOUT, ignored)
[4683] usbd_setup_device_desc: getting device descriptor at addr 2 failed, USB_ERR_TIMEOUT
[4683] ugen3.2: <Unknown > at usbus3 (disconnected)
[4684] uhub_reattach_port: could not allocate new device
[4685] usb_alloc_device: device init 2 failed (USB_ERR_TIMEOUT, ignored)
[4685] ugen3.2: <Unknown > at usbus3 (disconnected)
[4685] uhub_reattach_port: could not allocate new device
[4685] uhub0: at usbus3, port 1, addr 1 (disconnected)
[4685] uhub0: detached
[4686] xhci2: Controller halt timeout.
[4686] uhub0 numa-domain 0 on usbus3
[4686] uhub0: <(0x1b21) XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus3
[4686] uhub0: 4 ports with 4 removable, self powered
[4687] xhci2: Resetting controller
[4687] usb_alloc_device: set address 2 failed (USB_ERR_TIMEOUT, ignored)
[4714] usbd_setup_device_desc: getting device descriptor at addr 2 failed, USB_ERR_TIMEOUT
[4715] usbd_req_re_enumerate: addr=2, set address failed! (USB_ERR_TIMEOUT, ignored)
[4741] usbd_setup_device_desc: getting device descriptor at addr 2 failed, USB_ERR_TIMEOUT
[4743] usbd_req_re_enumerate: addr=2, set address failed! (USB_ERR_TIMEOUT, ignored)
[4762] usbd_setup_device_desc: getting device descriptor at addr 2 failed, USB_ERR_TIMEOUT
[4763] usbd_req_re_enumerate: addr=2, set address failed! (USB_ERR_TIMEOUT, ignored)
[4787] usbd_setup_device_desc: getting device descriptor at addr 2 failed, USB_ERR_TIMEOUT
[4789] usbd_req_re_enumerate: addr=2, set address failed! (USB_ERR_TIMEOUT, ignored)
[4814] usbd_setup_device_desc: getting device descriptor at addr 2 failed, USB_ERR_TIMEOUT
[4814] ugen3.2: <Unknown > at usbus3 (disconnected)
[4814] uhub_reattach_port: could not allocate new device
[4815] usb_alloc_device: device init 2 failed (USB_ERR_TIMEOUT, ignored)
[4815] ugen3.2: <Unknown > at usbus3 (disconnected)
[4815] uhub_reattach_port: could not allocate new device
[4815] uhub0: at usbus3, port 1, addr 1 (disconnected)
[4815] uhub0: detached
[4816] xhci2: Controller halt timeout.
[4816] uhub0 numa-domain 0 on usbus3
[4816] uhub0: <(0x1b21) XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus3
[4816] uhub0: 4 ports with 4 removable, self powered
[4818] xhci2: Resetting controller
[4818] usb_alloc_device: set address 2 failed (USB_ERR_TIMEOUT, ignored)
Comment 1 Hans Petter Selasky freebsd_committer freebsd_triage 2022-03-28 12:02:11 UTC
Does entering the following at the loader prompt help?

set hw.usb.xhci.dcepquirk=1

--HPS
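
For anyone who wants the tunable to survive reboots instead of typing it at the loader prompt each time, it can go into /boot/loader.conf. The helper below is only a sketch: `set_tunable` is a hypothetical name, not a FreeBSD tool, and it assumes the usual name="value" line format of loader.conf.

```shell
# set_tunable FILE NAME VALUE (hypothetical helper, not part of FreeBSD)
# Appends NAME="VALUE" to FILE unless a line for NAME already exists.
set_tunable() {
    file=$1
    name=$2
    value=$3
    # The dots in NAME act as regex wildcards here, which is close enough
    # for a duplicate check on sysctl-style names.
    if [ -f "$file" ] && grep -q "^${name}=" "$file"; then
        echo "${name} is already set in ${file}; edit that line by hand" >&2
        return 1
    fi
    printf '%s="%s"\n' "$name" "$value" >> "$file"
}

# Example (run as root on the affected machine):
#   set_tunable /boot/loader.conf hw.usb.xhci.dcepquirk 1
```

After the next reboot, `sysctl hw.usb.xhci.dcepquirk` should report the value that was set.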
Comment 2 Dave Cottlehuber freebsd_committer freebsd_triage 2022-03-28 12:20:16 UTC
Created attachment 232776 [details]
output of usbconfig dump_all_desc (after reboot)

I've tried setting hw.usb.xhci.debug=1 but the system is somewhat unusable
after that, video/audio/keyboard suffer huge lag. Is there something
more granular / less intrusive I can try?
Comment 3 Dave Cottlehuber freebsd_committer freebsd_triage 2022-03-28 20:07:30 UTC
set, thanks. I'll report back tomorrow on progress.
Comment 4 Dave Cottlehuber freebsd_committer freebsd_triage 2022-03-31 12:40:33 UTC
I had a longer period of stability with the tunable, but still hit a hang soon after a video conference call in Firefox this morning.

I didn't grab the logs, but after the reboot I can still see the disconnects happening:

[2579] ugen0.4: <Apple Inc. iPhone> at usbus0 (disconnected)
[2579] ipheth0: at uhub1, port 11, addr 9 (disconnected)
[2579] ipheth0: detached
[2580] ugen0.4: <Apple Inc. iPhone> at usbus0
[2580] ipheth0 numa-domain 0 on uhub1
[2580] ipheth0: <Apple Inc. iPhone, class 0/0, rev 2.00/8.02, addr 10> on usbus0
[2580] ue0: <USB Ethernet> on ipheth0
[2580] ue0: bpf attached
[2580] ue0: Ethernet address: 82:ed:2c:45:8e:f7
[3138] ugen0.4: <Apple Inc. iPhone> at usbus0 (disconnected)
[3138] ipheth0: at uhub1, port 11, addr 10 (disconnected)
[3138] ipheth0: detached
[10567] ugen0.4: <Apple Inc. iPhone> at usbus0
[10567] ipheth0 numa-domain 0 on uhub1
[10567] ipheth0: <Apple Inc. iPhone, class 0/0, rev 2.00/8.02, addr 11> on usbus0
[10567] ue0: <USB Ethernet> on ipheth0
[10567] ue0: bpf attached
[10567] ue0: Ethernet address: 82:ed:2c:45:8e:f7
[11031] ugen0.4: <Apple Inc. iPhone> at usbus0 (disconnected)
[11031] ipheth0: at uhub1, port 11, addr 11 (disconnected)
[11031] ipheth0: detached
[11057] ugen0.4: <Apple Inc. iPhone> at usbus0
[11057] ipheth0 numa-domain 0 on uhub1
[11057] ipheth0: <Apple Inc. iPhone, class 0/0, rev 2.00/8.02, addr 12> on usbus0
[11057] ue0: <USB Ethernet> on ipheth0
[11057] ue0: bpf attached
[11057] ue0: Ethernet address: 82:ed:2c:45:8e:f7
[11132] ugen3.3: <vendor 0x04d9 USB Keyboard> at usbus3 (disconnected)
[11132] ukbd0: at uhub3, port 4, addr 2 (disconnected)
[11132] ukbd0: detached
[11132] uhid0: at uhub3, port 4, addr 2 (disconnected)
[11132] uhid0: detached
[11132] ugen3.3: <vendor 0x04d9 USB Keyboard> at usbus3
[11132] ukbd0 numa-domain 0 on uhub3
[11132] ukbd0: <vendor 0x04d9 USB Keyboard, class 0/0, rev 1.10/12.09, addr 2> on usbus3
[11132] kbd2 at ukbd0
[11132] kbd2: ukbd0, generic (0), config:0x0, flags:0x3d0000
[11132] uhid0 numa-domain 0 on uhub3
[11132] uhid0: <vendor 0x04d9 USB Keyboard, class 0/0, rev 1.10/12.09, addr 2> on usbus3
[11637] ugen3.3: <vendor 0x04d9 USB Keyboard> at usbus3 (disconnected)
[11637] ukbd0: at uhub3, port 4, addr 2 (disconnected)
[11637] ukbd0: detached
[11637] uhid0: at uhub3, port 4, addr 2 (disconnected)
[11637] uhid0: detached
[11638] ugen3.3: <vendor 0x04d9 USB Keyboard> at usbus3
[11638] ukbd0 numa-domain 0 on uhub3
[11638] ukbd0: <vendor 0x04d9 USB Keyboard, class 0/0, rev 1.10/12.09, addr 2> on usbus3
[11638] kbd2 at ukbd0
[11638] kbd2: ukbd0, generic (0), config:0x0, flags:0x3d0000
[11638] uhid0 numa-domain 0 on uhub3
[11638] uhid0: <vendor 0x04d9 USB Keyboard, class 0/0, rev 1.10/12.09, addr 2> on usbus3
[11639] ugen3.3: <vendor 0x04d9 USB Keyboard> at usbus3 (disconnected)
[11639] ukbd0: at uhub3, port 4, addr 2 (disconnected)
[11639] ukbd0: detached
[11639] uhid0: at uhub3, port 4, addr 2 (disconnected)
[11639] uhid0: detached
[11639] ugen3.3: <vendor 0x04d9 USB Keyboard> at usbus3
[11639] ukbd0 numa-domain 0 on uhub3
[11639] ukbd0: <vendor 0x04d9 USB Keyboard, class 0/0, rev 1.10/12.09, addr 2> on usbus3
[11639] kbd2 at ukbd0
[11639] kbd2: ukbd0, generic (0), config:0x0, flags:0x3d0000
[11639] uhid0 numa-domain 0 on uhub3
[11639] uhid0: <vendor 0x04d9 USB Keyboard, class 0/0, rev 1.10/12.09, addr 2> on usbus3
[11665] ugen0.4: <Apple Inc. iPhone> at usbus0 (disconnected)
[11665] ipheth0: at uhub1, port 11, addr 12 (disconnected)
[11665] ipheth0: detached
[12062] ugen3.3: <vendor 0x04d9 USB Keyboard> at usbus3 (disconnected)
[12062] ukbd0: at uhub3, port 4, addr 2 (disconnected)
[12062] ukbd0: detached
[12062] uhid0: at uhub3, port 4, addr 2 (disconnected)
[12062] uhid0: detached
[12062] ugen3.3: <vendor 0x04d9 USB Keyboard> at usbus3
[12062] ukbd0 numa-domain 0 on uhub3
[12062] ukbd0: <vendor 0x04d9 USB Keyboard, class 0/0, rev 1.10/12.09, addr 2> on usbus3
[12062] kbd2 at ukbd0
[12062] kbd2: ukbd0, generic (0), config:0x0, flags:0x3d0000
[12062] uhid0 numa-domain 0 on uhub3
[12062] uhid0: <vendor 0x04d9 USB Keyboard, class 0/0, rev 1.10/12.09, addr 2> on usbus3
[12926] ugen0.4: <Apple Inc. iPhone> at usbus0
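
A log like the one above is easier to compare between boots when it is reduced to a per-device disconnect count. The following is a rough sketch, not an established tool: `count_disconnects` is a made-up name, and it assumes the "ugenX.Y: <name> at usbusN (disconnected)" line format seen in this report.

```shell
# count_disconnects: read dmesg-style text on stdin and print
# "<count> <ugenX.Y> <device name>" for every device that logged a disconnect,
# most frequent first.
count_disconnects() {
    awk '
        /ugen[0-9]+\.[0-9]+: <.*> at usbus[0-9]+ \(disconnected\)$/ {
            # Locate the ugenX.Y token wherever it sits on the line, so the
            # function works with or without a leading [timestamp].
            match($0, /ugen[0-9]+\.[0-9]+/)
            dev = substr($0, RSTART, RLENGTH)
            start = index($0, "<")
            stop = index($0, ">")
            name = substr($0, start + 1, stop - start - 1)
            count[dev " " name]++
        }
        END { for (key in count) print count[key], key }
    ' | sort -k1,1nr -k2
}

# Example:
#   dmesg | count_disconnects
```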
Comment 5 Hans Petter Selasky freebsd_committer freebsd_triage 2022-03-31 13:19:01 UTC
When this happens, could you enable:

sysctl hw.usb.uhub.debug=17

and capture the resulting prints?

This assumes you have "options USB_DEBUG" in the kernel configuration file.

--HPS
Comment 6 Dave Cottlehuber freebsd_committer freebsd_triage 2022-04-01 08:32:54 UTC
Created attachment 232856 [details]
after issue recurred, without debug flags
Comment 7 Dave Cottlehuber freebsd_committer freebsd_triage 2022-04-01 08:33:51 UTC
Created attachment 232857 [details]
with debugging flag enabled, & unplugging all the USB peripherals, finally only re-adding the mouse/keyboard
Comment 8 Hans Petter Selasky freebsd_committer freebsd_triage 2022-04-01 08:51:01 UTC
Comment on attachment 232857 [details]
with debugging flag enabled, & unplugging all the USB peripherals, finally only re-adding the mouse/keyboard

"wPortChange=0x0020" might indicate a "Warm Port Reset Change (WRC)".

Could you also enable:

sysctl hw.usb.debug=17

and

sysctl hw.usb.xhci.debug=17


--HPS
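
When reading values such as wPortChange=0x0020 in the uhub debug output, a small decoder can save some squinting. This is a sketch under assumptions: `decode_port_change` is a made-up helper, and the bit names are my reading of the UPS_C_* definitions in sys/dev/usb/usb.h, so please check them against your own source tree. Note that 0x0020 is overloaded: it is the BH (warm) port reset change on a USB 3.0 port, but the L1 change on a USB 2.0 port.

```shell
# decode_port_change VALUE: print the names of the change bits set in VALUE.
# Bit names are assumed from the UPS_C_* macros in sys/dev/usb/usb.h.
decode_port_change() {
    value=$(( $1 ))
    out=""
    for entry in \
        1:C_CONNECT_STATUS 2:C_PORT_ENABLED 4:C_SUSPEND \
        8:C_OVERCURRENT_INDICATOR 16:C_PORT_RESET \
        32:C_BH_PORT_RESET_or_C_PORT_L1 64:C_PORT_LINK_STATE \
        128:C_PORT_CONFIG_ERROR
    do
        mask=${entry%%:*}
        name=${entry#*:}
        if [ $(( value & mask )) -ne 0 ]; then
            out="${out}${out:+ }${name}"
        fi
    done
    echo "${out:-none}"
}

# Example:
#   decode_port_change 0x0020
```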
Comment 9 Dave Cottlehuber freebsd_committer freebsd_triage 2022-04-01 09:45:05 UTC
Is this something I can do after the issue occurs? The system is already unusable with these flags enabled.
Comment 10 Hans Petter Selasky freebsd_committer freebsd_triage 2022-04-01 10:28:55 UTC
Setting:

sysctl kern.consmute=1

might also help.

Yes, you can try enabling only when the issue appears.
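
Since the debug knobs make the machine hard to use, it may help to flip them together only after the failure, over ssh. The following is a sketch, not a tested procedure: `usb_debug` and the `SYSCTL` override are hypothetical names, the three OIDs are the ones mentioned in this thread, and the output file name and the 30 second wait are arbitrary. It still requires a kernel built with "options USB_DEBUG".

```shell
# Command used to apply settings; override (e.g. SYSCTL=echo) for a dry run.
SYSCTL=${SYSCTL:-sysctl}

# usb_debug LEVEL: set the three USB debug knobs named in this thread.
usb_debug() {
    level=$1
    for oid in hw.usb.debug hw.usb.uhub.debug hw.usb.xhci.debug; do
        "$SYSCTL" "${oid}=${level}" || return 1
    done
}

# Typical session once the keyboard has gone away (over ssh, as root):
#   sysctl kern.consmute=1     # keep the flood of prints off the console
#   usb_debug 17
#   sleep 30; dmesg > /tmp/usb-debug.txt
#   usb_debug 0
#   sysctl kern.consmute=0
```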
Comment 11 Dave Cottlehuber freebsd_committer freebsd_triage 2022-04-01 22:28:29 UTC
Created attachment 232880 [details]
before setting both sysctls but after keyboard went awol
Comment 12 Dave Cottlehuber freebsd_committer freebsd_triage 2022-04-01 22:28:55 UTC
Created attachment 232881 [details]
debug with both sysctls enabled=17
Comment 13 Dave Cottlehuber freebsd_committer freebsd_triage 2022-04-01 22:29:22 UTC
Created attachment 232882 [details]
after switching debugging off again
Comment 14 Dave Cottlehuber freebsd_committer freebsd_triage 2022-04-01 22:30:35 UTC
this was all done under 13.1-RC1 already.
Comment 15 Hans Petter Selasky freebsd_committer freebsd_triage 2022-04-02 00:35:47 UTC
From my quick glimpse at the logs, I see something has gone wrong at the XHCI hardware level! Now we need to figure out which commands your XHCI controller rejects. Ouch!

> xhci_do_command: Command timeout!

--HPS
Comment 16 Dave Cottlehuber freebsd_committer freebsd_triage 2022-04-02 18:40:32 UTC
the mainboard is a supermicro http://www.supermicro.com/products/motherboard/Xeon/C600/X10SRA-F.cfm 

I do actually have a PCIe USB card I can drop in; should I try moving everything over to that and see whether the problem recurs?
Comment 17 Hans Petter Selasky freebsd_committer freebsd_triage 2022-04-02 22:36:14 UTC
Please do!
Comment 18 Dave Cottlehuber freebsd_committer freebsd_triage 2022-04-07 18:49:05 UTC
well some good news, with the additional PCI card, at least I
can move the keyboard from mainboard USB ports to the PCI card
USB ports & get the keyboard back!

I still get a lockup soon after a webrtc session starts, more
logs available if that helps.

I don't have any dmesg disconnects listed since the quirk
setting is enabled.

Of note is the "state" of the webcam is still blocked until it
is physically disconnected, a normal FreeBSD reboot of the box
itself doesn't free up the webcam again.

The PCI card has keyboard, usbaudio & webcam on it; other stuff
is on the mainboard. I will see how things go with the keyboard
on the mainboard next time round.
Comment 19 Dave Cottlehuber freebsd_committer freebsd_triage 2022-04-18 19:13:47 UTC
Created attachment 233314 [details]
usb devices (audio this time) disconnected, switch to debug 17 per history before rebooting

now on 13.1-RC3 and still full lockups requiring reboot.
Comment 20 Dave Cottlehuber freebsd_committer freebsd_triage 2022-04-29 15:08:32 UTC
still on 13.1-RC5.
Comment 21 Hans Petter Selasky freebsd_committer freebsd_triage 2022-04-29 15:29:35 UTC
Wild guess, adding John Baldwin:

Might some PCI changes related to quick system startup be causing this?

--HPS
Comment 22 Tomasz "CeDeROM" CEDRO 2022-04-29 17:02:37 UTC
Hello world :-) I have the same issue when connecting a USB 3.0 hub to a 3.0 port on my desktop. When connected to a 2.0 port it works fine. It has been this way since 13.0 (when I switched to a desktop). This is a Unitek 7-port USB 3.0 hub with an external power supply, connected with a 3.0 A-B cable.
Comment 23 John Baldwin freebsd_committer freebsd_triage 2022-04-29 23:01:50 UTC
There are very few PCI changes in 13.1 relative to 13.0 and I don't think any of them would be relevant to this.  Dave, have you tried bisecting the kernel on stable/13 to see when it starts failing?
Comment 24 Tomasz "CeDeROM" CEDRO 2022-04-29 23:21:57 UTC
After reading the whole thread, it seems the problem here is intermittent, while for me the hub fails at connect. I considered this to be a faulty hub/port. But if I could also use it on a USB 3.0 port that would be great :-) Anyway, I have another hub connected to a USB 3.0 port, so it is not a big deal for me to have another on a 2.0 port :-)
Comment 25 Emanuel Haupt freebsd_committer freebsd_triage 2022-04-30 17:07:40 UTC
I am having the same issue. For me it happened between:

releng/13.1-n250134-6b642cf5c87 # good
releng/13.1-n250141-2e9ad6042be # bad
Comment 26 Hans Petter Selasky freebsd_committer freebsd_triage 2022-04-30 19:04:49 UTC
Emanuel Haupt:

There were few relevant changes in that delta.

Can you provide the output from dmesg with "sysctl hw.usb.xhci.debug=16" set when the issue occurs?

You need "options USB_DEBUG" in the kernel configuration.

Does reverting/applying the commit below change anything?

Do you know if your device is attached via thunderbolt?

--HPS

commit 245d5a65f5805864881e2601190e7783057d2768
Author: Hans Petter Selasky <hselasky@FreeBSD.org>
Date:   Thu Apr 21 16:59:09 2022 +0200

    xhci(4): Ensure the so-called data toggle gets properly reset.
    
    Use the drop and enable endpoint context commands to force a reset of
    the data toggle for USB 2.0 and USB 3.0 after:
     - clear endpoint halt command (when the driver wishes).
     - set config command (when the kernel or user-space wants).
     - set alternate setting command (only affected endpoints).
    
    Some XHCI HW implementations may not allow the endpoint reset command when
    the endpoint context is not in the halted state.
    
    Reported by:            Juniper and Gary Jennejohn
    Approved by:            re (gjb)
    Sponsored by:           NVIDIA Networking
    
    (cherry picked from commit cda31e734925346328fd2369585ab3f6767ec225)
Comment 27 Hans Petter Selasky freebsd_committer freebsd_triage 2022-04-30 19:34:17 UTC
Please specify the XHCI PCI ID, as shown by "pciconf -lv".

--HPS
Comment 28 Emanuel Haupt freebsd_committer freebsd_triage 2022-05-01 06:12:58 UTC
Created attachment 233622 [details]
USB_DEBUG with hw.usb.xhci.debug=16
Comment 29 Emanuel Haupt freebsd_committer freebsd_triage 2022-05-01 06:13:21 UTC
Created attachment 233623 [details]
pciconf -lv
Comment 30 Emanuel Haupt freebsd_committer freebsd_triage 2022-05-01 06:18:51 UTC
Both mouse and keyboard are attached to a Level 1 KVM switch that is connected via USB3 (not thunderbolt).

It happens immediately, 100% reproducible once xorg starts.

Here is how I collected the output:

- Added hw.usb.xhci.debug=16 to /etc/sysctl.conf
- Rebooted machine
- Made sure the KVM Switch is active on this machine
- System boots up and I do not even have the chance to switch to console
  (CTRL-ALT-F1)
- ssh into machine and collect output from /var/log/messages

Since this might be nvidia related, here is the driver I am using:

nvidia-driver-510.60.02
Comment 31 Emanuel Haupt freebsd_committer freebsd_triage 2022-05-01 06:26:47 UTC
The same happens if xorg is disabled at startup. The keyboard remains 100% unresponsive.
Comment 32 Emanuel Haupt freebsd_committer freebsd_triage 2022-05-01 08:09:43 UTC
Created attachment 233625 [details]
usbconfig dump_device_desc of mouse/keyboard/hub
Comment 33 Emanuel Haupt freebsd_committer freebsd_triage 2022-05-01 08:28:00 UTC
FWIW, same thing happens with the latest nvidia driver

# https://www.nvidia.com/en-us/drivers/unix/
FreeBSD x64
Latest Production Branch Version: 510.68.02
Comment 34 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-01 14:22:10 UTC
Emanuel Haupt,

Can you also get me:

procstat -akk

When the issue happens?

I cannot find any errors in there. Maybe the log was truncated, but I can see the XHCI is working on some USB transfers. Maybe in some kind of a loop ...

Sometimes you need to do:

sysctl kern.consmute=1

To get all prints in /var/log/messages .

--HPS
Comment 35 Emanuel Haupt freebsd_committer freebsd_triage 2022-05-02 06:29:14 UTC
> When the issue happens?

Always, as in: I boot the system and the keyboard never works. I see the login prompt and the keyboard is unresponsive.
Comment 36 Emanuel Haupt freebsd_committer freebsd_triage 2022-05-02 06:30:50 UTC
Created attachment 233647 [details]
procstat -akk captured via ssh into the machine
Comment 37 Emanuel Haupt freebsd_committer freebsd_triage 2022-05-02 06:33:44 UTC
/var/log/messages after sysctl kern.consmute=1:

https://critical.ch/people/262882/messages.txt

(external link because of size)
Comment 38 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-02 06:51:11 UTC
Emanuel Haupt:

What does "usbconfig" output when this issue happens?

--HPS
Comment 39 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-02 06:55:05 UTC
Emanuel Haupt:

The only thing I see in the logs is a mass storage device. Maybe it is an auto-installer disk?

Can you try to run:

cdcontrol -f /dev/xxx eject

On this device?

--HPS
Comment 40 Emanuel Haupt freebsd_committer freebsd_triage 2022-05-02 08:45:55 UTC
The USB device you're seeing is the USB boot device I've created. It's plugged into a different USB 2.0 port. I wouldn't want to eject it :-)

You keep writing "when it happens". I want to stress this once again: it does not suddenly "happen"; the keyboard does not work from the start when I boot the system.

All the output I'm providing here is obtained via SSH into the machine:

root@pr262882:~ # usbconfig
ugen2.1: <Intel XHCI root HUB> at usbus2, cfg=0 md=HOST spd=SUPER (5.0Gbps) pwr=SAVE (0mA)
ugen4.1: <(0x1b21) XHCI root HUB> at usbus4, cfg=0 md=HOST spd=SUPER (5.0Gbps) pwr=SAVE (0mA)
ugen5.1: <Intel EHCI root HUB> at usbus5, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen1.1: <(0x1b21) XHCI root HUB> at usbus1, cfg=0 md=HOST spd=SUPER (5.0Gbps) pwr=SAVE (0mA)
ugen3.1: <Intel EHCI root HUB> at usbus3, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen0.1: <(0x10de) XHCI root HUB> at usbus0, cfg=0 md=HOST spd=SUPER (5.0Gbps) pwr=SAVE (0mA)
ugen4.2: <GenesysLogic USB3.0 Hub> at usbus4, cfg=0 md=HOST spd=SUPER (5.0Gbps) pwr=SAVE (0mA)
ugen5.2: <vendor 0x8087 product 0x0024> at usbus5, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen3.2: <vendor 0x8087 product 0x0024> at usbus3, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen4.3: <GenesysLogic USB2.0 Hub> at usbus4, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (100mA)
ugen2.2: <Genesys USB Reader> at usbus2, cfg=0 md=HOST spd=SUPER (5.0Gbps) pwr=ON (224mA)
ugen3.3: <vendor 0x05e3 USB2.0 Hub> at usbus3, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (100mA)
ugen5.3: <vendor 0x05e3 USB2.0 Hub> at usbus5, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (100mA)
ugen3.4: <USB SanDisk 3.2Gen1> at usbus3, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (224mA)
ugen5.4: <Logitech Logitech Wireless Headset> at usbus5, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (144mA)
ugen3.5: <vendor 0x05e3 USB2.0 Hub> at usbus3, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (100mA)
ugen5.5: <vendor 0x05e3 USB2.0 Hub> at usbus5, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (100mA)
ugen3.6: <Corsair Memory, Inc. Integrated USB Bridge> at usbus3, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (100mA)

root@pr262882:~ # uname -a
FreeBSD pr262882.local 13.1-RC5 FreeBSD 13.1-RC5 releng/13.1-n250141-2e9ad6042be PR262882 amd64
Comment 41 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-02 09:21:15 UTC
Hi Emanuel,

ugen4.1: <(0x1b21) XHCI root HUB> at usbus4, cfg=0 md=HOST spd=SUPER (5.0Gbps)
ugen4.2: <GenesysLogic USB3.0 Hub> at usbus4, cfg=0 md=HOST spd=SUPER (5.0Gbps)
ugen4.3: <GenesysLogic USB2.0 Hub> at usbus4, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (100mA)

I suspect the KVM USB keyboard and mouse are supposed to reside under ugen4.3.

What happens if you run this command:

usbconfig -d ugen4.3 reset

Do any more devices show up?

You can also try:

usbconfig -d ugen4.1 reset

--HPS
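
For others hitting this, the reset candidates can be pulled out of a plain usbconfig listing like the one in comment 40. This is a heuristic sketch only: `list_hubs` is a hypothetical helper, and it simply assumes that external hubs carry "Hub" at the end of their description string, which need not hold for every device.

```shell
# list_hubs: read "usbconfig" listing text on stdin and print the ugenX.Y
# name of every non-root device whose description ends in "Hub".
list_hubs() {
    awk -F': ' '
        /^ugen[0-9]+\.[0-9]+: <[^>]*[Hh][Uu][Bb]>/ && !/root HUB>/ { print $1 }
    '
}

# Example:
#   usbconfig | list_hubs
#   usbconfig -d ugen4.3 reset    # then reset one candidate at a time
```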
Comment 42 Emanuel Haupt freebsd_committer freebsd_triage 2022-05-02 10:18:59 UTC
After:

usbconfig -d ugen4.3 reset

Mouse and keyboard work again.

Here is what I captured in /var/log/messages:

https://critical.ch/people/262882/messages-after-reset
Comment 43 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-02 10:31:50 UTC
Emanuel Haupt:

Now disable the XHCI debugging.

And set:

hw.usb.uhub.debug=16

In /boot/loader.conf (I think that will work)

Then reboot and capture all messages (dmesg), and then run that usbconfig command I gave you, if no USB mouse and keyboard shows up.

I think we are seeing some kind of timing race, with regards to enumerating the USB 2.0 HUB there. I'm not sure why.

Thank you!

--HPS
Comment 44 Emanuel Haupt freebsd_committer freebsd_triage 2022-05-02 11:18:28 UTC
dmesg -a:

https://critical.ch/people/262882/dmesg_a

/var/log/messages before usb reset:

https://critical.ch/people/262882/messages-before-reset

/var/log/messages after usb reset:

https://critical.ch/people/262882/messages-after-reset
Comment 45 Emanuel Haupt freebsd_committer freebsd_triage 2022-05-02 11:54:10 UTC
Here is dmesg -a with a higher kern.msgbufsize:

https://critical.ch/people/262882/dmesg_a_big_kern_msgbufsize
Comment 46 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-02 12:49:51 UTC
Created attachment 233659 [details]
Patch to test

Hi Emanuel,

Can you test this patch?

Please provide the USB HUB debug messages whether it works or not.

--HPS
Comment 47 Emanuel Haupt freebsd_committer freebsd_triage 2022-05-02 14:04:47 UTC
Unfortunately with the patch the boot process loops forever and never gets to a login prompt.
Comment 48 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-02 14:11:00 UTC
Created attachment 233669 [details]
Patch to test (v2)

Can you try this new patch as well?
Comment 49 Emanuel Haupt freebsd_committer freebsd_triage 2022-05-02 14:28:34 UTC
Same with this patch. It loops forever.
Comment 50 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-02 15:26:49 UTC
And you reverted the previous one I guess.

I'll need to think a bit more about this.

It is possible for the kernel to reset the "Virtual HUB", but then I need a failsafe test which doesn't cause the looping.

It looks to me like trying to touch any of the ports w/o the USB reset is a no-go.

Basically the USB port status is saying there is a device there.

It would be very nice to see some looping debug prints somehow.

--HPS
Comment 51 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-02 15:33:06 UTC
Dave:

Can you try the same thing?

usbconfig show_ifdrv

Then figure out where (ugenX.Y) the uhub<N> for the keyboard is located and reset it using:

usbconfig -d ugenX.Y reset

Does it help?

--HPS
Comment 52 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-02 16:51:55 UTC
Dave:

Does loading:

/boot/kernel/uacpi.ko

from the loader make any difference?

--HPS
Comment 53 Emanuel Haupt freebsd_committer freebsd_triage 2022-05-02 17:04:48 UTC
(In reply to Hans Petter Selasky from comment #50)

> And you reverted the previous one I guess.

Correct.

> It would be very nice to see some looping debug prints somehow.

Would making a video with my phone of the scrolling messages help?
Comment 54 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-02 17:06:56 UTC
> Would making a video with my phone of the scrolling messages help?

Yes, you can send it to me privately if you like:

hselasky@freebsd.org

Try to get it from the start.

--HPS
Comment 55 Glen Barber freebsd_committer freebsd_triage 2022-05-02 19:19:37 UTC
How widespread is this issue?
Comment 56 Emanuel Haupt freebsd_committer freebsd_triage 2022-05-03 07:15:52 UTC
(In reply to Hans Petter Selasky from comment #54)

> Yes, you can send it to me privately if you like:

Sent.
Comment 57 Emanuel Haupt freebsd_committer freebsd_triage 2022-05-03 07:28:58 UTC
(In reply to Glen Barber from comment #55)

> How widespread is this issue?

It's hard to tell. I am using a fairly high quality KVM switch (https://store.level1techs.com/products/14-kvm-switch-dual-monitor-2computer).

I currently do not have another USB3 hub that I could use to test.

It might be prudent to revert 245d5a65f5805864881e2601190e7783057d2768 for the upcoming release.
Comment 58 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-03 07:36:58 UTC
Created attachment 233691 [details]
Patch to test (v3)

Emanuel,

Can you test this patch?

--HPS
Comment 59 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-03 07:50:18 UTC
> It might be prudent to revert 245d5a65f5805864881e2601190e7783057d2768 for the upcoming release.

Maybe from the release branch for now and leave it in -stable? I won't object to that, but let's first see if this issue is fixable, because the patch I made really fixes an issue, and the old behaviour is not really desirable with regards to mass storage.

Making one fix makes another issue pop up! How fun :-)

--HPS
Comment 60 Mark Millard 2022-05-03 08:02:43 UTC
(In reply to Glen Barber from comment #55)

I walked down to the ThreadRipper 1950X system and plugged
in a RPi USB keyboard into a USB3 port and got:

ugen2.2: <vendor 0x05e3 USB2.0 Hub> at usbus2
uhub6 on uhub1
uhub6: <vendor 0x05e3 USB2.0 Hub, class 9/0, rev 2.00/32.98, addr 1> on usbus2
uhub6: MTT enabled
uhub_attach: port 1 power on or off failed, USB_ERR_IOERROR
uhub_attach: port 2 power on or off failed, USB_ERR_IOERROR
uhub_attach: port 3 power on or off failed, USB_ERR_IOERROR
uhub_attach: port 4 power on or off failed, USB_ERR_IOERROR
uhub6: 4 ports with 4 removable, self powered
uhub_reattach_port: device problem (USB_ERR_IOERROR), disabling port 1
uhub_reattach_port: device problem (USB_ERR_IOERROR), disabling port 2
uhub_reattach_port: device problem (USB_ERR_IOERROR), disabling port 3
uhub_reattach_port: device problem (USB_ERR_IOERROR), disabling port 4
ugen2.2: <vendor 0x05e3 USB2.0 Hub> at usbus2 (disconnected)
uhub6: at uhub1, port 1, addr 1 (disconnected)
uhub6: detached
ugen2.2: <vendor 0x05e3 USB2.0 Hub> at usbus2
uhub6 on uhub1
uhub6: <vendor 0x05e3 USB2.0 Hub, class 9/0, rev 2.00/32.98, addr 1> on usbus2
uhub6: MTT enabled
uhub6: 4 ports with 4 removable, self powered
ugen2.3: <vendor 0x04d9 RPI Wired Keyboard 4> at usbus2
ukbd2 on uhub6
ukbd2: <vendor 0x04d9 RPI Wired Keyboard 4, class 0/0, rev 2.00/1.40, addr 2> on usbus2
kbd4 at ukbd2
uhid1 on uhub6
uhid1: <vendor 0x04d9 RPI Wired Keyboard 4, class 0/0, rev 2.00/1.40, addr 2> on usbus2

(The keyboard I normally use is not a USB one.)
Comment 61 Emanuel Haupt freebsd_committer freebsd_triage 2022-05-03 08:07:17 UTC
(In reply to Hans Petter Selasky from comment #58)

Your patch (v3) does not apply. Do I have to apply it on top of v2?
Comment 62 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-03 08:08:01 UTC
Glen:

The initial issue was reported on RC1 and the patch was made on RC5, so reverting won't solve this one.

--HPS
Comment 63 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-03 08:12:06 UTC
Created attachment 233692 [details]
Patch to test (v3 for 13-stable)
Comment 64 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-03 08:13:39 UTC
Emanuel:

> Your patch (v3) does not apply. Do I have to apply it on top of v2?

No, clean 13-stable.

I patched it on -14 and realized that there is a commit missing, which is not relevant to this bug.

I looked at your video, but can't see clearly where it is going wrong, mostly because it is scrolling very fast.
Comment 65 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-03 08:14:39 UTC
Emanuel: Are you on IRC or Slack?

--HPS
Comment 66 Mark Millard 2022-05-03 08:19:28 UTC
(In reply to Mark Millard from comment #60)

That was main [so: 14]:
(output line split some for better readability)

# uname -apKU
FreeBSD amd64_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #34
main-n255108-9fb40baf6043-dirty: Thu Apr 28 19:42:46 PDT 2022
root@amd64_ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG
amd64 amd64 1400057 1400057

Other keyboards with hubs got similar messages. One keyboard
has no hub and it got no odd messages.

For reference, the RPi keyboard plugged into a RPi4B got
normal output:

ugen0.5: <vendor 0x05e3 USB2.0 Hub> at usbus0
uhub2 on uhub1
uhub2: <vendor 0x05e3 USB2.0 Hub, class 9/0, rev 2.00/32.98, addr 4> on usbus0
uhub2: MTT enabled
uhub2: 4 ports with 4 removable, self powered
ugen0.6: <vendor 0x04d9 RPI Wired Keyboard 4> at usbus0
ukbd0 on uhub2
ukbd0: <vendor 0x04d9 RPI Wired Keyboard 4, class 0/0, rev 2.00/1.40, addr 5> on usbus0
kbd1 at ukbd0
uhid0 on uhub2
uhid0: <vendor 0x04d9 RPI Wired Keyboard 4, class 0/0, rev 2.00/1.40, addr 5> on usbus0

# uname -apKU
FreeBSD CA72_UFS 14.0-CURRENT FreeBSD 14.0-CURRENT #41
main-n255108-9fb40baf6043-dirty: Thu Apr 28 20:43:22 PDT 2022
root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
arm64 aarch64 1400057 1400057
Comment 67 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-03 08:23:04 UTC
Hi Mark:

Does applying this patch on 14-current:
https://bugs.freebsd.org/bugzilla/attachment.cgi?id=233691&action=diff

Make those USB_ERR_IOERROR go away on your threadripper?

--HPS
Comment 68 Mark Millard 2022-05-03 09:09:24 UTC
(In reply to Hans Petter Selasky from comment #67)

Sorry it took so long to get back to this.

The patched kernel results in the output for the RPi keyboard:

ugen2.2: <vendor 0x05e3 USB2.0 Hub> at usbus2
uhub6 on uhub1
uhub6: <vendor 0x05e3 USB2.0 Hub, class 9/0, rev 2.00/32.98, addr 1> on usbus2
uhub6: MTT enabled
uhub6: 4 ports with 4 removable, self powered
ugen2.3: <vendor 0x04d9 RPI Wired Keyboard 4> at usbus2
ukbd2 on uhub6
ukbd2: <vendor 0x04d9 RPI Wired Keyboard 4, class 0/0, rev 2.00/1.40, addr 2> on usbus2
kbd4 at ukbd2
uhid1 on uhub6
uhid1: <vendor 0x04d9 RPI Wired Keyboard 4, class 0/0, rev 2.00/1.40, addr 2> on usbus2

Looks good to me.

On a RPi4B the patched kernel continued to work.
Comment 69 Mark Millard 2022-05-03 09:33:47 UTC
(In reply to Hans Petter Selasky from comment #67)

FYI, I just tried plugging in a USB3 NVMe SSD and got:

uhub_reattach_port: port 2 U1 timeout failed, error=USB_ERR_IOERROR
uhub_reattach_port: port 2 U2 timeout failed, error=USB_ERR_IOERROR
usb_msc_auto_quirk: UQ_MSC_NO_GETMAXLUN set for USB mass storage device Samsung PSSD T7 Touch (0x04e8:0x4001)
ugen0.9: <Samsung PSSD T7 Touch> at usbus0
umass2 on uhub0
umass2: <Samsung PSSD T7 Touch, class 0/0, rev 3.20/1.00, addr 9> on usbus0
umass2:  SCSI over Bulk-Only; quirks = 0x0100
umass2:12:2: Attached to scbus12
da5 at umass-sim2 bus 2 scbus12 target 0 lun 0
da5: <Samsung PSSD T7 Touch 0> Fixed Direct Access SPC-4 SCSI device
da5: Serial Number REPLACED
da5: 400.000MB/s transfers
da5: 953869MB (1953525168 512 byte sectors)
da5: quirks=0x2<NO_6_BYTE>

By contrast the patched RPi4B got only:

usb_msc_auto_quirk: UQ_MSC_NO_GETMAXLUN set for USB mass storage device Samsung PSSD T7 Touch (0x04e8:0x4001)
ugen0.4: <Samsung PSSD T7 Touch> at usbus0
umass1 on uhub0
umass1: <Samsung PSSD T7 Touch, class 0/0, rev 3.20/1.00, addr 3> on usbus0
umass1:  SCSI over Bulk-Only; quirks = 0x0100
umass1:1:1: Attached to scbus1
da1 at umass-sim1 bus 1 scbus1 target 0 lun 0
da1: <Samsung PSSD T7 Touch 0> Fixed Direct Access SPC-4 SCSI device
da1: Serial Number S5K5NJ0R107444J
da1: 400.000MB/s transfers
da1: 953869MB (1953525168 512 byte sectors)
da1: quirks=0x2<NO_6_BYTE>

So the ThreadRipper 1950X gets the extra lines:

uhub_reattach_port: port 2 U1 timeout failed, error=USB_ERR_IOERROR
uhub_reattach_port: port 2 U2 timeout failed, error=USB_ERR_IOERROR
Comment 70 Mark Millard 2022-05-03 09:38:05 UTC
(In reply to Mark Millard from comment #69)

More detail: the 2 extra messages only happen
for plugging into the USB 3.1 ports, not the
USB 3.0 ports. (RPi4B's do not have 3.1.)
Comment 71 Mark Millard 2022-05-03 09:42:54 UTC
(In reply to Hans Petter Selasky from comment #67)

So I tried plugging in the RPi keyboard into a USB 3.1
port and got:

ugen0.9: <vendor 0x05e3 USB2.0 Hub> at usbus0
uhub6 on uhub0
uhub6: <vendor 0x05e3 USB2.0 Hub, class 9/0, rev 2.00/32.98, addr 11> on usbus0
uhub6: MTT enabled
uhub6: 4 ports with 4 removable, self powered
usb_alloc_device: device init 10 failed (USB_ERR_IOERROR, ignored)
ugen0.10: <Unknown > at usbus0 (disconnected)
uhub_reattach_port: could not allocate new device

So: alloc/reattach notices.
Comment 72 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-03 10:48:49 UTC
> So the ThreadRipper 1950X gets the extra lines:
> uhub_reattach_port: port 2 U1 timeout failed, error=USB_ERR_IOERROR
> uhub_reattach_port: port 2 U2 timeout failed, error=USB_ERR_IOERROR

Was this with or without the v3 patch?

--HPS
Comment 73 Nathan Whitehorn freebsd_committer freebsd_triage 2022-05-03 11:46:17 UTC
Patch doesn't help me at all (14-CURRENT), but resetting the USB hub makes the devices appear.
Comment 74 Glen Barber freebsd_committer freebsd_triage 2022-05-03 12:17:53 UTC
(In reply to Hans Petter Selasky from comment #62)
Ok, thank you for the feedback.

I understand there is still some unclear behavior, but what are we looking at, timeframe-wise, for a resolution?  I'm asking because -RC6, which is apparently warranted now, will be built in roughly 36 hours.  So, I need to plan accordingly if it needs to be delayed, or if we will have an -RC7.
Comment 75 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-03 12:29:02 UTC
I need the next 24 hours for debugging at least. I'll keep you posted Glen.

--HPS
Comment 76 Glen Barber freebsd_committer freebsd_triage 2022-05-03 12:36:26 UTC
(In reply to Hans Petter Selasky from comment #75)
Thank you very much.
Comment 77 Mark Millard 2022-05-03 14:15:40 UTC
(In reply to Hans Petter Selasky from comment #72)

The only patch was the main [so: 14] one from:

https://bugs.freebsd.org/bugzilla/attachment.cgi?id=233691&action=diff
Comment 78 Mark Millard 2022-05-03 14:21:22 UTC
(In reply to Mark Millard from comment #77)

Note: At this point I've only had about half a night's sleep.
So I may not respond quickly.
Comment 79 commit-hook freebsd_committer freebsd_triage 2022-05-03 16:15:18 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=09dd1adfa4c9bb1b49f4ef5524a308732883e132

commit 09dd1adfa4c9bb1b49f4ef5524a308732883e132
Author:     Hans Petter Selasky <hselasky@FreeBSD.org>
AuthorDate: 2022-05-03 16:10:49 +0000
Commit:     Hans Petter Selasky <hselasky@FreeBSD.org>
CommitDate: 2022-05-03 16:13:53 +0000

    xhci(4): Always add and evaluate the slot context.

    Because the maximum number of endpoint contexts is stored there.

    Tested by:      ehaupt@
    PR:             262882
    MFC after:      3 hours
    Sponsored by:   NVIDIA Networking

 sys/dev/usb/controller/xhci.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
Comment 80 commit-hook freebsd_committer freebsd_triage 2022-05-03 16:15:19 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=e276d281503160ba3648bd394cde95736ee53329

commit e276d281503160ba3648bd394cde95736ee53329
Author:     Hans Petter Selasky <hselasky@FreeBSD.org>
AuthorDate: 2022-05-03 16:09:17 +0000
Commit:     Hans Petter Selasky <hselasky@FreeBSD.org>
CommitDate: 2022-05-03 16:13:53 +0000

    xhci(4): Only drop BULK and INTERRUPT endpoints to reset data toggle.

    Only drop BULK and INTERRUPT endpoints, to reset the data toggle,
    because for other endpoint types this is not critical.

    Tested by:      ehaupt@
    PR:             262882
    MFC after:      3 hours
    Sponsored by:   NVIDIA Networking

 sys/dev/usb/controller/xhci.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
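
For context, the endpoint types named in this commit message come from the low two bits of bmAttributes in the standard USB endpoint descriptor. A minimal sketch of such a filter, assuming it is expressed directly on bmAttributes; the macro and function names are hypothetical, and the real change in xhci.c may be structured differently:

```c
#include <stdint.h>

/* Transfer type values from the USB endpoint descriptor (bmAttributes bits 1:0). */
#define SKETCH_XFER_CONTROL	0x00
#define SKETCH_XFER_ISOC	0x01
#define SKETCH_XFER_BULK	0x02
#define SKETCH_XFER_INTERRUPT	0x03
#define SKETCH_XFER_MASK	0x03

/*
 * Hypothetical helper: return 1 for the two endpoint types the commit
 * message singles out (BULK and INTERRUPT), 0 for everything else.
 */
static int
endpoint_needs_toggle_reset(uint8_t bm_attributes)
{
	switch (bm_attributes & SKETCH_XFER_MASK) {
	case SKETCH_XFER_BULK:
	case SKETCH_XFER_INTERRUPT:
		return (1);
	default:
		return (0);
	}
}
```

Only the transfer type bits are inspected, so the extra synchronization and usage bits that isochronous endpoints carry in bmAttributes do not affect the result.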
Comment 81 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-03 16:21:13 UTC
Hi Glen:

After several hours of debugging with Emanuel today, I found two issues which fixed his USB problems. I plan to merge these as soon as possible to 13-stable and ask for MFC to 13.1 as well. I will ask the people subscribed there to test those patches. They will apply cleanly on top of 13-stable and 13.1, so just do git cherry-pick xxxxx .

From what I can see, the problems are due to differences between XHCI firmware designs, which are not easy to know about in advance.

--HPS
Comment 82 Glen Barber freebsd_committer freebsd_triage 2022-05-03 16:29:32 UTC
(In reply to Hans Petter Selasky from comment #81)
Thank you for the update, and thank you and manu@ for resolving this so quickly.  Let's let this sit in -CURRENT for at least a few hours, or maybe until tomorrow, then please send a request for approval against releng/13.1 to re@.

For the early merge to stable/13, please use 'Approved by: re (gjb, early MFC)' in the commit log.

I would like to hear back from Nathan as well, if this addresses the issue he had hit.
Comment 83 Emanuel Haupt freebsd_committer freebsd_triage 2022-05-03 16:43:17 UTC
(In reply to Glen Barber from comment #82)

> Thank you for the update, and thank you and manu@ for resolving this so quickly.

Did you mean ehaupt@? I'm the other Emanuel with one 'm' :-)

Also, thank you from my side for Hans's tireless help.
Comment 84 Mark Millard 2022-05-03 16:48:42 UTC
(In reply to Hans Petter Selasky from comment #81)

Looks like the patch I tested is not being included. I'll revert
and update.
Comment 85 Glen Barber freebsd_committer freebsd_triage 2022-05-03 16:49:46 UTC
(In reply to Emanuel Haupt from comment #83)
Oops, sorry.  I thought I got all of the people involved.  Sorry.  :(

But yes, thank you as well.
Comment 86 Mark Millard 2022-05-03 17:09:13 UTC
(In reply to Hans Petter Selasky from comment #81)

The ThreadRipper 1950X is now based on main-n255153-7ac164dc8e2e .

The RPi keyboard connection tests are not producing odd messages
on USB 3.0 or USB 3.1 ports.

But the USB3 NVMe SSD USB3.1 port connection test is still
producing the reattach timeout failed notices:

uhub_reattach_port: port 2 U1 timeout failed, error=USB_ERR_IOERROR
uhub_reattach_port: port 2 U2 timeout failed, error=USB_ERR_IOERROR
usb_msc_auto_quirk: UQ_MSC_NO_GETMAXLUN set for USB mass storage device Samsung PSSD T7 Touch (0x04e8:0x4001)
ugen0.9: <Samsung PSSD T7 Touch> at usbus0
umass2 on uhub3
umass2: <Samsung PSSD T7 Touch, class 0/0, rev 3.20/1.00, addr 10> on usbus0
umass2:  SCSI over Bulk-Only; quirks = 0x0100
umass2:12:2: Attached to scbus12
da5 at umass-sim2 bus 2 scbus12 target 0 lun 0
da5: <Samsung PSSD T7 Touch 0> Fixed Direct Access SPC-4 SCSI device
da5: Serial Number S5K5NJ0R107157Z
da5: 400.000MB/s transfers
da5: 953869MB (1953525168 512 byte sectors)
da5: quirks=0x2<NO_6_BYTE>
Comment 87 Mark Millard 2022-05-03 17:29:33 UTC
(In reply to Hans Petter Selasky from comment #81)

Not that I expect it fits here, but in the spirit of
reporting all oddities during testing . . .

I got out 2 USB3 media readers. Both the updated
ThreadRipper 1950X and the non-updated HoneyComb
report things like the following when the readers
are plugged in:

usb_msc_auto_quirk: UQ_MSC_NO_TEST_UNIT_READY set for USB mass storage device Kingston Multi-Reader (0x11b0:0x6368)
usb_msc_auto_quirk: UQ_MSC_NO_PREVENT_ALLOW set for USB mass storage device Kingston Multi-Reader (0x11b0:0x6368)
usb_msc_auto_quirk: UQ_MSC_NO_SYNC_CACHE set for USB mass storage device Kingston Multi-Reader (0x11b0:0x6368)
usb_msc_auto_quirk: UQ_MSC_NO_START_STOP set for USB mass storage device Kingston Multi-Reader (0x11b0:0x6368)
ugen3.3: <Kingston Multi-Reader> at usbus3
umass2 on uhub1
umass2: <Bulk-In, Bulk-Out, Interface> on usbus3
umass2:  SCSI over Bulk-Only; quirks = 0xc005
umass2:12:2: Attached to scbus12
(probe0:umass-sim2:2:0:0): REPORT LUNS. CDB: a0 00 00 00 00 00 00 00 00 10 00 00 
(probe0:umass-sim2:2:0:0): CAM status: SCSI Status Error
(probe0:umass-sim2:2:0:0): SCSI status: Check Condition
(probe0:umass-sim2:2:0:0): SCSI sense: ILLEGAL REQUEST asc:24,0 (Invalid field in CDB)
(probe0:umass-sim2:2:0:0): Error 22, Unretryable error
da5 at umass-sim2 bus 2 scbus12 target 0 lun 0
da5: < Multi-Reader  -0 1.00> Removable Direct Access SPC-4 SCSI device
. . .

and:

umass2: detached
usb_msc_auto_quirk: UQ_MSC_NO_TEST_UNIT_READY set for USB mass storage device Kingston USB3.0 Media Reader (0x11b0:0x6348)
usb_msc_auto_quirk: UQ_MSC_NO_PREVENT_ALLOW set for USB mass storage device Kingston USB3.0 Media Reader (0x11b0:0x6348)
usb_msc_auto_quirk: UQ_MSC_NO_SYNC_CACHE set for USB mass storage device Kingston USB3.0 Media Reader (0x11b0:0x6348)
usb_msc_auto_quirk: UQ_MSC_NO_START_STOP set for USB mass storage device Kingston USB3.0 Media Reader (0x11b0:0x6348)
ugen3.3: <Kingston USB3.0 Media Reader> at usbus3
umass2 on uhub1
umass2: <Bulk-In, Bulk-Out, Interface> on usbus3
umass2:  SCSI over Bulk-Only; quirks = 0xc005
umass2:12:2: Attached to scbus12
(probe0:umass-sim2:2:0:0): REPORT LUNS. CDB: a0 00 00 00 00 00 00 00 00 10 00 00 
(probe0:umass-sim2:2:0:0): CAM status: SCSI Status Error
(probe0:umass-sim2:2:0:0): SCSI status: Check Condition
(probe0:umass-sim2:2:0:0): SCSI sense: ILLEGAL REQUEST asc:24,0 (Invalid field in CDB)
(probe0:umass-sim2:2:0:0): Error 22, Unretryable error
da5 at umass-sim2 bus 2 scbus12 target 0 lun 0
da5: < FCR-HS3       -0 1.00> Removable Direct Access SPC-4 SCSI device
. . .
Comment 88 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-03 19:30:04 UTC
Hi Glen,

I can hold the MFC + re@ e-mail till tomorrow, given that you hold the next 13.1 RC for that. It will be 10 hours from now approx. Then more people can test.

--HPS
Comment 89 Glen Barber freebsd_committer freebsd_triage 2022-05-03 19:33:44 UTC
(In reply to Hans Petter Selasky from comment #88)
Thank you for the update.  Please keep re@ informed (ideally via this ticket or direct email to re@) if anything changes, regresses, etc., and you need more time.

Your help is very much appreciated.
Comment 90 Glen Barber freebsd_committer freebsd_triage 2022-05-03 19:38:46 UTC
(In reply to Hans Petter Selasky from comment #88)
Actually, I think merging to stable/13 early would be good, if you have the time to do so.  Then we can find out if users tracking 13.1-STABLE still hit issues or not, after which we can get this into releng/13.1 before the next RC build (which is now necessary).
Comment 91 commit-hook freebsd_committer freebsd_triage 2022-05-03 19:47:01 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=6d8c6b24ee0a0416204356a98e4e7606489894c5

commit 6d8c6b24ee0a0416204356a98e4e7606489894c5
Author:     Hans Petter Selasky <hselasky@FreeBSD.org>
AuthorDate: 2022-05-03 16:10:49 +0000
Commit:     Hans Petter Selasky <hselasky@FreeBSD.org>
CommitDate: 2022-05-03 19:43:13 +0000

    xhci(4): Always add and evaluate the slot context.

    Because the maximum number of endpoint contexts is stored there.

    Tested by:      ehaupt@
    PR:             262882
    Approved by:    re (gjb, early MFC)
    Sponsored by:   NVIDIA Networking

    (cherry picked from commit 09dd1adfa4c9bb1b49f4ef5524a308732883e132)

 sys/dev/usb/controller/xhci.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
Comment 92 commit-hook freebsd_committer freebsd_triage 2022-05-03 19:47:03 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=610528736f3f0bf51f990dd93c5061a7a437e519

commit 610528736f3f0bf51f990dd93c5061a7a437e519
Author:     Hans Petter Selasky <hselasky@FreeBSD.org>
AuthorDate: 2022-05-03 16:09:17 +0000
Commit:     Hans Petter Selasky <hselasky@FreeBSD.org>
CommitDate: 2022-05-03 19:41:51 +0000

    xhci(4): Only drop BULK and INTERRUPT endpoints to reset data toggle.

    Only drop BULK and INTERRUPT endpoints, to reset the data toggle,
    because for other endpoint types this is not critical.

    While at it fix some whitespace.

    Tested by:      ehaupt@
    PR:             262882
    Approved by:    re (gjb, early MFC)
    Sponsored by:   NVIDIA Networking

    (cherry picked from commit e276d281503160ba3648bd394cde95736ee53329)

 sys/dev/usb/controller/xhci.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
Comment 93 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-03 19:47:22 UTC
Glen:

OK. stable/13 is updated now.

--HPS
Comment 94 Glen Barber freebsd_committer freebsd_triage 2022-05-03 19:50:00 UTC
Thank you.  Provided there is no obvious fallout, let's continue with your original plan to include this in releng/13.1 in 10-ish hours.

Please remember to send the request for approval to re@ following the change request guidelines.
Comment 95 Mark Millard 2022-05-03 19:56:14 UTC
(In reply to Mark Millard from comment #86)

The messages that I've reported getting for
USB 3.1 ports when the NVMe SSD USB3 devices
are plugged in:

uhub_reattach_port: port 2 U1 timeout failed, error=USB_ERR_IOERROR
uhub_reattach_port: port 2 U2 timeout failed, error=USB_ERR_IOERROR

seem to be tied to code that looks like (U1 case shown):

        case C(UR_SET_FEATURE, UT_WRITE_CLASS_OTHER):

                i = index >> 8;
                index &= 0x00FF;

                if ((index < 1) ||
                    (index > sc->sc_noport)) {
                        err = USB_ERR_IOERROR;
                        goto done;
                }

                port = XHCI_PORTSC(index);
                v = XREAD4(sc, oper, port) & ~XHCI_PS_CLEAR;

                switch (value) {
                case UHF_PORT_U1_TIMEOUT:
                        if (XHCI_PS_SPEED_GET(v) != 4) {
                                err = USB_ERR_IOERROR;
                                goto done;
                        }

So it seems not to be getting the speed mode it expects, and
it treats that as an error status. I've no clue whether the
speed mode is guaranteed, as the code suggests, at the point
of plugging an NVMe SSD into a USB 3.1 port.

But it sure looks like a distinct issue from the original
bugzilla submittal.
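
For readers following the excerpt: its first two statements split the request's wIndex field. For a hub SET_FEATURE request with PORT_U1_TIMEOUT or PORT_U2_TIMEOUT, the USB 3.x hub class definition puts the port number in the low byte and the timeout value in the high byte. A minimal standalone sketch of that split and of the port range check (the struct and function names are hypothetical and not taken from xhci.c):

```c
#include <stdint.h>

/* Hypothetical holder for the two halves of wIndex; not a FreeBSD type. */
struct port_feature_index {
	uint8_t port;     /* low byte: 1-based hub port number */
	uint8_t selector; /* high byte: timeout value for the U1/U2 features */
};

/* Split wIndex the way the excerpt does with `index >> 8` and `index & 0x00FF`. */
static struct port_feature_index
split_windex(uint16_t windex)
{
	struct port_feature_index out;

	out.selector = (uint8_t)(windex >> 8);
	out.port = (uint8_t)(windex & 0x00FF);
	return (out);
}

/* Mirror of the range check in the excerpt: valid ports are 1..noport. */
static int
port_in_range(uint8_t port, uint8_t noport)
{
	return (port >= 1 && port <= noport);
}
```

Because the "port 2" in the log messages passes this range check, the USB_ERR_IOERROR has to come from the later speed comparison, not from the port number.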
Comment 96 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-03 19:59:02 UTC
Mark:

Could you print XHCI_PS_SPEED_GET(v) ?

Likely it the check should be < 4 .
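
A minimal sketch of why that change would matter, assuming the default xHCI port speed IDs (4 for SuperSpeed, 5 for SuperSpeedPlus) and assuming a USB 3.1 port reports 5 for a 10 Gb/s link. The value on the ThreadRipper was not printed in this thread, so this is an illustration, not a diagnosis. The macro and function names are hypothetical stand-ins, not the ones in xhci.c:

```c
#include <stdint.h>

/* Assumed layout: PORTSC bits 13:10 hold the port speed ID. */
#define SKETCH_PS_SPEED_GET(x)	(((x) >> 10) & 0xF)
#define SKETCH_SPEED_SS		4	/* SuperSpeed, 5 Gb/s */
#define SKETCH_SPEED_SSP	5	/* SuperSpeedPlus, 10 Gb/s */

/* Check as quoted in comment #95: only speed ID 4 is accepted. */
static int
u1_timeout_rejected_exact(uint32_t portsc)
{
	return (SKETCH_PS_SPEED_GET(portsc) != SKETCH_SPEED_SS);
}

/* Relaxed check: any speed ID of 4 or above is accepted. */
static int
u1_timeout_rejected_relaxed(uint32_t portsc)
{
	return (SKETCH_PS_SPEED_GET(portsc) < SKETCH_SPEED_SS);
}
```

With a PORTSC value whose speed field is 5, the exact check rejects the request (the USB_ERR_IOERROR path) while the relaxed check accepts it; both still reject a high-speed port (speed ID 3).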

--HPS
Comment 97 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-03 19:59:56 UTC
s/ it //
Comment 98 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-03 20:25:51 UTC
Created attachment 233705 [details]
Patch to fix U1/U2 IOERROR issue

Mark:

Can you test this patch, and see if the U1/U2 port timeout errors go away?

--HPS
Comment 99 Mark Millard 2022-05-03 21:51:01 UTC
(In reply to Hans Petter Selasky from comment #98)

ThreadRipper 1950X updated to be based on main-n255160-9a3583bfbd17
with the U1/U2 IOERROR related patch:

I got no odd messages from testing this context. Things
look to be working.
Comment 100 Hans Petter Selasky freebsd_committer freebsd_triage 2022-05-04 07:25:28 UTC
*** Bug 263661 has been marked as a duplicate of this bug. ***
Comment 101 commit-hook freebsd_committer freebsd_triage 2022-05-04 07:30:04 UTC
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=11a732b280319e2babb5d575d14e89e12127d06a

commit 11a732b280319e2babb5d575d14e89e12127d06a
Author:     Hans Petter Selasky <hselasky@FreeBSD.org>
AuthorDate: 2022-05-03 16:10:49 +0000
Commit:     Hans Petter Selasky <hselasky@FreeBSD.org>
CommitDate: 2022-05-04 07:28:46 +0000

    xhci(4): Always add and evaluate the slot context.

    Because the maximum number of endpoint contexts is stored there.

    Tested by:      ehaupt@
    PR:             262882
    Sponsored by:   NVIDIA Networking

    (cherry picked from commit 09dd1adfa4c9bb1b49f4ef5524a308732883e132)

 sys/dev/usb/controller/xhci.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
Comment 102 commit-hook freebsd_committer freebsd_triage 2022-05-04 07:30:05 UTC
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=473c925e4359f79224374911cdeb1477bf1ef939

commit 473c925e4359f79224374911cdeb1477bf1ef939
Author:     Hans Petter Selasky <hselasky@FreeBSD.org>
AuthorDate: 2022-05-03 16:09:17 +0000
Commit:     Hans Petter Selasky <hselasky@FreeBSD.org>
CommitDate: 2022-05-04 07:28:41 +0000

    xhci(4): Only drop BULK and INTERRUPT endpoints to reset data toggle.

    Only drop BULK and INTERRUPT endpoints, to reset the data toggle,
    because for other endpoint types this is not critical.

    Tested by:      ehaupt@
    PR:             262882
    Sponsored by:   NVIDIA Networking

    (cherry picked from commit e276d281503160ba3648bd394cde95736ee53329)

 sys/dev/usb/controller/xhci.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
Comment 103 commit-hook freebsd_committer freebsd_triage 2022-05-04 07:31:07 UTC
A commit in branch stable/11 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=a1ec8baee5dd0cb8e344ab0e2feafcf49f4a802a

commit a1ec8baee5dd0cb8e344ab0e2feafcf49f4a802a
Author:     Hans Petter Selasky <hselasky@FreeBSD.org>
AuthorDate: 2022-05-03 16:09:17 +0000
Commit:     Hans Petter Selasky <hselasky@FreeBSD.org>
CommitDate: 2022-05-04 07:30:07 +0000

    xhci(4): Only drop BULK and INTERRUPT endpoints to reset data toggle.

    Only drop BULK and INTERRUPT endpoints, to reset the data toggle,
    because for other endpoint types this is not critical.

    Tested by:      ehaupt@
    PR:             262882
    Sponsored by:   NVIDIA Networking

    (cherry picked from commit e276d281503160ba3648bd394cde95736ee53329)

 sys/dev/usb/controller/xhci.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
Comment 104 commit-hook freebsd_committer freebsd_triage 2022-05-04 07:31:09 UTC
A commit in branch stable/11 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=cacb5f3ea5d39d9ee02e6f278993fb1b308ca9ba

commit cacb5f3ea5d39d9ee02e6f278993fb1b308ca9ba
Author:     Hans Petter Selasky <hselasky@FreeBSD.org>
AuthorDate: 2022-05-03 16:10:49 +0000
Commit:     Hans Petter Selasky <hselasky@FreeBSD.org>
CommitDate: 2022-05-04 07:30:12 +0000

    xhci(4): Always add and evaluate the slot context.

    Because the maximum number of endpoint contexts is stored there.

    Tested by:      ehaupt@
    PR:             262882
    Sponsored by:   NVIDIA Networking

    (cherry picked from commit 09dd1adfa4c9bb1b49f4ef5524a308732883e132)

 sys/dev/usb/controller/xhci.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
Comment 105 Mark Millard 2022-05-04 10:50:52 UTC
(In reply to Mark Millard from comment #99)

I have updated the ThreadRipper 1950X bectl environment
for main to:

# ~/fbsd-based-on-what-commit.sh -C /usr/main-src/
branch: main
merge-base: a1c0442b418b39e57d287750147b0aeae5140766
merge-base: CommitDate: 2022-05-04 07:26:39 +0000
a1c0442b418b (HEAD -> main, freebsd/main, freebsd/HEAD) xhci(4): Tweak USB port speed checks to allow newer super speed generations.
n255163 (--first-parent --count for merge-base)

in order to pick up the commits for the U1/U2 IOERROR
issue.

(Looks like it will be a week or so for the stable/13
context to have commits available.)
Comment 106 commit-hook freebsd_committer freebsd_triage 2022-05-04 12:27:59 UTC
A commit in branch releng/13.1 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=340ed8ccb576e74e0cc8e5f1e8e3bbabbe53f090

commit 340ed8ccb576e74e0cc8e5f1e8e3bbabbe53f090
Author:     Hans Petter Selasky <hselasky@FreeBSD.org>
AuthorDate: 2022-05-03 16:09:17 +0000
Commit:     Hans Petter Selasky <hselasky@FreeBSD.org>
CommitDate: 2022-05-04 07:20:46 +0000

    xhci(4): Only drop BULK and INTERRUPT endpoints to reset data toggle.

    Only drop BULK and INTERRUPT endpoints, to reset the data toggle,
    because for other endpoint types this is not critical.

    While at it fix some whitespace.

    Tested by:      ehaupt@
    PR:             262882
    Approved by:    re (gjb, early MFC)
    Sponsored by:   NVIDIA Networking

    (cherry picked from commit e276d281503160ba3648bd394cde95736ee53329)
    (cherry picked from commit 610528736f3f0bf51f990dd93c5061a7a437e519)

 sys/dev/usb/controller/xhci.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
Comment 107 commit-hook freebsd_committer freebsd_triage 2022-05-04 12:28:01 UTC
A commit in branch releng/13.1 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=465c5bd88e64852b39d711ea3e565edc90d65210

commit 465c5bd88e64852b39d711ea3e565edc90d65210
Author:     Hans Petter Selasky <hselasky@FreeBSD.org>
AuthorDate: 2022-05-03 16:10:49 +0000
Commit:     Hans Petter Selasky <hselasky@FreeBSD.org>
CommitDate: 2022-05-04 07:20:54 +0000

    xhci(4): Always add and evaluate the slot context.

    Because the maximum number of endpoint contexts is stored there.

    Tested by:      ehaupt@
    PR:             262882
    Approved by:    re (gjb, early MFC)
    Sponsored by:   NVIDIA Networking

    (cherry picked from commit 09dd1adfa4c9bb1b49f4ef5524a308732883e132)
    (cherry picked from commit 6d8c6b24ee0a0416204356a98e4e7606489894c5)

 sys/dev/usb/controller/xhci.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
Comment 108 Mark Millard 2022-05-04 21:36:44 UTC
(In reply to Hans Petter Selasky from comment #98)

Hi Hans,

Thanks for all the bug fixes.

Mark.
Comment 109 Tomasz "CeDeROM" CEDRO 2022-05-08 11:29:52 UTC
Thank You HPS!! :-)
Comment 110 Daniel Ebdrup Jensen freebsd_committer freebsd_triage 2022-05-11 20:25:46 UTC
I'm seeing the same behavior on my T480s with a ThinkPad Thunderbolt 3 Dock Gen 2.

It's running a very new 14-CURRENT:
FreeBSD geroi 14.0-CURRENT FreeBSD 14.0-CURRENT #1 main-n255577-586ed321068: Wed May 11 19:48:41 CEST 2022     debdrup@geroi:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64.

I'll be attaching a log file with USB_DEBUG in GENERIC (it's there by default), and hw.usb.debug=1, in the hopes that can help shed some light on things.

I don't mind testing patches, if it helps this get resolved. :)
Comment 111 Daniel Ebdrup Jensen freebsd_committer freebsd_triage 2022-05-11 20:28:07 UTC
Whoops, I hadn't realised there was a limit for attachments on BugZilla, so the log file is here instead:

https://people.freebsd.org/~debdrup/usb-hub-falloff.txt
Comment 112 Mark Millard 2022-05-12 00:41:51 UTC
(In reply to Daniel Ebdrup Jensen from comment #110)

It may be that anything involving Thunderbolt capable
hardware is considered a separate type of issue. It
might also be that anything involving Thunderbolt
capable hardware that appears to work is not by design
at this point for FreeBSD.

In other words, so far as I know, FreeBSD does not
claim to support Thunderbolt capable hardware at
this time, not that I'm an expert on such FreeBSD
issues or anything.
Comment 113 Daniel Ebdrup Jensen freebsd_committer freebsd_triage 2022-05-12 06:21:04 UTC
(In reply to Mark Millard from comment #112)

Thunderbolt, insofar as presenting a display device, works (both according to the log mentioned above, and Wayland/Sway automatically outputting to it when the dock is connected):
May 11 22:15:09 geroi kernel: [352.126842] <6>[drm] Connector DP-3: get mode from tunables:
May 11 22:15:09 geroi kernel: [352.126874] <6>[drm]   - kern.vt.fb.modes.DP-3
May 11 22:15:09 geroi kernel: [352.126897] <6>[drm]   - kern.vt.fb.default_mode
May 11 22:15:09 geroi kernel: [352.127023] <6>[drm] Connector DP-4: get mode from tunables:
May 11 22:15:09 geroi kernel: [352.127048] <6>[drm]   - kern.vt.fb.modes.DP-4
May 11 22:15:09 geroi kernel: [352.127066] <6>[drm]   - kern.vt.fb.default_mode

The log is also full of USB device attaches, so clearly that part is working too. I've also had several other FreeBSD developers tell me that USB and video works via thunderbolt.

It should also be mentioned that sometimes USB devices briefly appear to work (i.e. I can move the mouse and type on the keyboard, but they stop working when the USB hub disconnects), but I can't find a consistent pattern in making it work for a brief amount of time, so it's hard to replicate.
During one of these times when it was working for a little while, I was even able to set hw.snd.default_unit=3 and get audio playing on the speakers connected to the dock.

Question for those that know better than I: Does it matter that it's a series of chained hubs, and/or could this be part of the problem?
Comment 114 Mark Millard 2022-05-12 09:10:07 UTC
(In reply to Daniel Ebdrup Jensen from comment #113)

I was implicitly assuming USB4/Thunderbolt4, where
the likes of USB3.2 is tunneled, rather than direct,
if I understand right. Nothing analogous to
DisplayPort Alt Mode for USB3.2 so far as I can
tell.

It may well be that one needs to make clear distinctions
about which one(s) of Thunderbolt 1, 2, 3, or 4 is
expected as being supported or under discussion. I'm
not sure generic "Thunderbolt" references work all that
well. Sorry for not being more explicit (even if I
was wrong anyway).

If I understand right, Thunderbolt 3 is like Thunderbolt 4
for the likes of USB 3.2 (well, at the time 3.1): tunneled.

But all of this is just from a little reading, not any
implementation involvement. I've been guessing that
various issues would be visible to FreeBSD and have to be
managed somewhat explicitly for Thunderbolt 3, 4, and
USB4 relative to handling the likes of USB3.2 . But
I'd be happy to be guessing incorrectly.
Comment 115 Ivan 2022-07-21 19:05:03 UTC
Hello everyone.

Do you think my problem is related to this bug?
My usb driver is ehci, not xhci 

13.0-p11 running on Dell r720xd.
APC ups connected via usb cable.

During boot this device always fails to init:

=====
uhub4 numa-domain 0 on uhub2
uhub4: <vendor 0x0424 product 0x2512, class 9/0, rev 2.00/b.b3, addr 3> on usbus0
uhub4: MTT enabled
uhub4: 1 port with 1 removable, self powered
usb_alloc_device: set address 4 failed (USB_ERR_STALLED, ignored)
usbd_setup_device_desc: getting device descriptor at addr 4 failed, USB_ERR_STALLED
usbd_req_re_enumerate: addr=4, set address failed! (USB_ERR_STALLED, ignored)
usbd_setup_device_desc: getting device descriptor at addr 4 failed, USB_ERR_STALLED
usbd_req_re_enumerate: addr=4, set address failed! (USB_ERR_STALLED, ignored)
usbd_setup_device_desc: getting device descriptor at addr 4 failed, USB_ERR_STALLED
usbd_req_re_enumerate: addr=4, set address failed! (USB_ERR_STALLED, ignored)
usbd_setup_device_desc: getting device descriptor at addr 4 failed, USB_ERR_STALLED
usbd_req_re_enumerate: addr=4, set address failed! (USB_ERR_STALLED, ignored)
Root mount waiting for: usbus0
usbd_setup_device_desc: getting device descriptor at addr 4 failed, USB_ERR_STALLED
ugen0.4: <Unknown > at usbus0 (disconnected)
uhub_reattach_port: could not allocate new device
Root mount waiting for: usbus0
ugen0.4: <no manufacturer Gadget USB HUB> at usbus0
=====

But after boot, if I re-insert the cable, it works OK:

====
ugen0.7: <American Power Conversion Back-UPS CS 650   FW:915.R1 .I USB FW:R1> at usbus0
====
Comment 116 Hans Petter Selasky freebsd_committer freebsd_triage 2022-07-21 20:54:59 UTC
Can you try a 13-stable kernel as well?

# Use:

usbconfig -d X.Y reset

# to reset the parent USB HUB, to see if it gets recognized.

--HPS
Comment 117 Ivan 2022-07-21 21:00:09 UTC
(In reply to Hans Petter Selasky from comment #116)

No, I'm planning to try 13.1. Does that make sense? It is a production server and I'm afraid to use non-release branches on it.

Also, I see in the discussion that the problem may be related to the non-standard speed of the device. This device has a non-standard speed as well.

ugen0.7: <American Power Conversion Back-UPS CS 650   FW:915.R1 .I USB FW:R1> at usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (24mA)
Comment 118 Ivan 2022-07-21 21:03:50 UTC
(In reply to Hans Petter Selasky from comment #116)
I'm sorry, I misread your comment. I will try 13-stable tomorrow. I meant that I won't be able to put 13-stable on as the main system.
Comment 119 Hans Petter Selasky freebsd_committer freebsd_triage 2022-07-21 21:12:17 UTC
Ivan:

Just build and install a 13-stable kernel as of today. Reboot and see what happens. Then you can restore the old kernel from /boot/kernel.old . Or copy it somewhere else to be safe. There are also some debug symbols /usr/lib/debug/boot/ which might need to be restored the same way.

--HPS
Comment 120 Ivan 2022-07-22 08:20:10 UTC
(In reply to Hans Petter Selasky from comment #119)

building world&kernel

But this morning I rebooted the system without changing anything, and it worked during the boot.
Comment 121 Ivan 2022-07-22 15:09:29 UTC
(In reply to Hans Petter Selasky from comment #119)

I can't reproduce it anymore.
It failed consistently after every reboot for at least a few months.
Now it has worked fine after two reboots. Nothing changed in the system config.
Thank you for this magic :)
Comment 122 Graham Perrin freebsd_committer freebsd_triage 2022-12-29 16:09:43 UTC
Triage: de-tag the summary line.

<https://wiki.freebsd.org/Bugzilla/DosAndDonts>
Comment 123 Mark Linimon freebsd_committer freebsd_triage 2023-07-25 18:26:13 UTC
To dch: is this still a problem?