Bug 213877

Summary: xhci reset causes panic on SuperMicro A1SRi-2758F board
Product: Base System
Component: usb
Status: Open
Severity: Affects Only Me
Priority: ---
Version: 11.0-RELEASE
Hardware: Any
OS: Any
Reporter: Adam Baxter <voltagex>
Assignee: freebsd-usb mailing list <usb>
CC: bz, hselasky, voltagex
Keywords: crash, needs-qa
Flags: koobs: mfc-stable9?, koobs: mfc-stable10?, koobs: mfc-stable11?
Attachments:
Description | Flags
Panic - shown via IPMI console | none
USB disconnect/reset just before panic | none
dmesg boot | none
pciconfig | none
usbconfig | none
second crash part 1 | none
second crash, part 1 | none
second crash, part 2 | none
second crash, part 3 | none

Description Adam Baxter 2016-10-29 12:43:05 UTC
Created attachment 176266 [details]
Panic - shown via IPMI console

Hi all,
My system panics after approximately 2 hours of having an Android phone plugged in doing USB tethering.

As far as I can tell I get an xhci reset and then the system panics because da0 also happens to be on that bus.
Comment 1 Adam Baxter 2016-10-29 12:46:16 UTC
Created attachment 176267 [details]
USB disconnect/reset just before panic
Comment 2 Adam Baxter 2016-10-29 12:51:25 UTC
Created attachment 176268 [details]
dmesg boot
Comment 3 Adam Baxter 2016-10-29 12:52:52 UTC
Created attachment 176269 [details]
pciconfig
Comment 4 Adam Baxter 2016-10-29 12:55:01 UTC
Created attachment 176270 [details]
usbconfig
Comment 5 Hans Petter Selasky freebsd_committer 2016-10-29 15:30:12 UTC
Hi,

Is this reproducible?

Have you tried to reproduce on an 11-stable kernel?

Basically, the driver detects that the USB HC is not responding to a firmware command and tries to reset the XHCI, which detaches all devices. As a natural consequence, the root file system, which is on da0, also disappears, and a panic is expected.

--HPS
Comment 6 Adam Baxter 2016-11-01 13:24:21 UTC
(In reply to Hans Petter Selasky from comment #5)
Hi Hans,
I'm trying to reproduce it, however this is difficult as it takes 2-16 hours for the bug to occur. Strangely, turning off soft journaling on the root filesystem allows the system to run for a lot longer before crashing.

I've switched to another RNDIS device so that I can test for longer periods of time. I also may try to get the serial console going over the weekend so I don't have to use a screen recorder to get the trace.

To be honest, it's unlikely I'll be able to get to a root cause for this issue - the setup I have is only while my main internet connection is down, awaiting repairs.
Comment 7 Adam Baxter 2016-11-02 13:42:13 UTC
Created attachment 176428 [details]
second crash part 1

Second crash caught with soft journalling disabled.
Comment 8 Adam Baxter 2016-11-02 13:45:18 UTC
Created attachment 176429 [details]
second crash, part 1
Comment 9 Adam Baxter 2016-11-02 13:45:51 UTC
Created attachment 176430 [details]
second crash, part 2
Comment 10 Adam Baxter 2016-11-02 13:46:48 UTC
Created attachment 176431 [details]
second crash, part 3
Comment 11 Bjoern A. Zeeb freebsd_committer 2018-08-30 08:58:47 UTC
I have been getting this almost every night recently when trying to move data to an external disk:

xhci0: Resetting controller
(da1:umass-sim1:1:0:0): WRITE(10). CDB: 2a 00 00 00 e1 d0 00 00 10 00
(da1:umass-sim1:1:0:0): CAM status: CCB request completed with an error
(da1:umass-sim1:1:0:0): Retrying command
(da1:umass-sim1:1:0:0): WRITE(10). CDB: 2a 00 00 00 e1 d0 00 00 10 00
(da1:umass-sim1:1:0:0): CAM status: CCB request completed with an error
(da1:umass-sim1:1:0:0): Retrying command
uhub1: at usbus1, port 1, addr 1 (disconnected)
ugen1.2: <Seagate Backup+ Hub> at usbus1 (disconnected)
uhub2: at uhub1, port 2, addr 1 (disconnected)
ugen1.3: <Seagate Backup+  Desk> at usbus1 (disconnected)
umass1: at uhub1, port 5, addr 2 (disconnected)
(da1:umass-sim1:1:0:0): WRITE(10). CDB: 2a 00 00 00 e1 d0 00 00 10 00
(da1:umass-sim1:1:0:0): CAM status: CCB request completed with an error
(da1:umass-sim1:1:0:0): Retrying command
da1 at umass-sim1 bus 1 scbus8 target 0 lun 0
...


Neither the plugged-in media nor the kernel has changed in a few weeks, however, so I'm not sure what triggered it.

The follow-up is a geom/file system panic.

Is there a way to find out why xhci is resetting?
Comment 12 Hans Petter Selasky freebsd_committer 2018-08-30 10:06:43 UTC
> Is there a way to find out why xhci is resetting?

Try to set:
hw.usb.xhci.debug=16

Or inspect the XHCI code for the reset call. There are currently a few cases that trigger a reset.

--HPS
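
For anyone collecting console logs while waiting for the next occurrence, here is a minimal sketch (not part of the original report) that scans captured console lines for a "Resetting controller" message and lists the devices reported as disconnected afterwards, matching the pattern seen in comment 11. The function names and regular expressions are my own assumptions based only on the message formats quoted in this thread; other kernels or drivers may print different text. Console captures sometimes carry trailing `^M` carriage-return markers, so the sketch strips those first.

```python
import re

# Assumed message formats, taken only from the log excerpt quoted in this thread.
RESET_RE = re.compile(r"^(xhci\d+): Resetting controller")
DISCONNECT_RE = re.compile(r"^([\w.]+): .*\(disconnected\)$")


def clean(line):
    """Strip literal '^M' markers and real carriage returns from a console line."""
    return line.replace("^M", "").replace("\r", "").rstrip()


def resets_with_disconnects(lines):
    """Return a list of (controller, [devices disconnected after that reset])."""
    events = []
    for raw in lines:
        line = clean(raw)
        reset = RESET_RE.match(line)
        if reset:
            events.append((reset.group(1), []))
            continue
        gone = DISCONNECT_RE.match(line)
        # Only count disconnects that follow a controller reset.
        if gone and events:
            events[-1][1].append(gone.group(1))
    return events


# Sample lines copied from the excerpt in comment 11.
sample = [
    "xhci0: Resetting controller^M^M",
    "(da1:umass-sim1:1:0:0): Retrying command^M^M",
    "uhub1: at usbus1, port 1, addr 1 (disconnected)^M^M",
    "ugen1.2: <Seagate Backup+ Hub> at usbus1 (disconnected)^M^M",
    "umass1: at uhub1, port 5, addr 2 (disconnected)^M^M",
]

for controller, devices in resets_with_disconnects(sample):
    print(controller, "reset, then disconnected:", ", ".join(devices))
```

This only correlates messages that are already in the log; it does not say why the controller was reset. For that, the debug sysctl or the driver source suggested above is still needed.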