Bug 213877 - xhci reset causes panic on SuperMicro A1SRi-2758F board
Summary: xhci reset causes panic on SuperMicro A1SRi-2758F board
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: usb
Version: 11.0-RELEASE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-usb mailing list
URL:
Keywords: crash, needs-qa
Depends on:
Blocks:
 
Reported: 2016-10-29 12:43 UTC by Adam Baxter
Modified: 2018-08-30 10:06 UTC

See Also:
koobs: mfc-stable9?
koobs: mfc-stable10?
koobs: mfc-stable11?


Attachments
Panic - shown via IPMI console (95.20 KB, image/png)
2016-10-29 12:43 UTC, Adam Baxter
USB disconnect/reset just before panic (95.85 KB, image/png)
2016-10-29 12:46 UTC, Adam Baxter
dmesg boot (11.26 KB, text/plain)
2016-10-29 12:51 UTC, Adam Baxter
pciconfig (4.31 KB, text/plain)
2016-10-29 12:52 UTC, Adam Baxter
usbconfig (654 bytes, text/plain)
2016-10-29 12:55 UTC, Adam Baxter
second crash part 1 (123.97 KB, image/png)
2016-11-02 13:42 UTC, Adam Baxter
second crash, part 1 (118.47 KB, image/png)
2016-11-02 13:45 UTC, Adam Baxter
second crash, part 2 (123.97 KB, image/png)
2016-11-02 13:45 UTC, Adam Baxter
second crash, part 3 (114.12 KB, image/png)
2016-11-02 13:46 UTC, Adam Baxter

Description Adam Baxter 2016-10-29 12:43:05 UTC
Created attachment 176266
Panic - shown via IPMI console

Hi all,
My system panics after approximately 2 hours with an Android phone plugged in for USB tethering.

As far as I can tell, I get an xhci reset, and then the system panics because da0 also happens to be on that bus.
Comment 1 Adam Baxter 2016-10-29 12:46:16 UTC
Created attachment 176267
USB disconnect/reset just before panic
Comment 2 Adam Baxter 2016-10-29 12:51:25 UTC
Created attachment 176268
dmesg boot
Comment 3 Adam Baxter 2016-10-29 12:52:52 UTC
Created attachment 176269
pciconfig
Comment 4 Adam Baxter 2016-10-29 12:55:01 UTC
Created attachment 176270
usbconfig
Comment 5 Hans Petter Selasky freebsd_committer 2016-10-29 15:30:12 UTC
Hi,

Is this reproducible?

Have you tried to reproduce on an 11-stable kernel?

Basically, the driver detects that the USB host controller is not responding to a firmware command and tries to reset the XHCI controller, which detaches all devices. As a natural consequence, the root file system, which is on da0, also disappears, so a panic is expected.

--HPS
Comment 6 Adam Baxter 2016-11-01 13:24:21 UTC
(In reply to Hans Petter Selasky from comment #5)
Hi Hans,
I'm trying to reproduce it; however, this is difficult because it takes 2-16 hours for the bug to occur. Strangely, turning off soft journaling on the root filesystem allows the system to run for a lot longer before crashing.

I've switched to another RNDIS device so that I can test for longer periods of time. I also may try to get the serial console going over the weekend so I don't have to use a screen recorder to get the trace.

To be honest, it's unlikely I'll be able to get to a root cause for this issue - the setup I have is only in place while my main internet connection is down, awaiting repairs.
Comment 7 Adam Baxter 2016-11-02 13:42:13 UTC
Created attachment 176428
second crash part 1

Second crash caught with soft journaling disabled.
Comment 8 Adam Baxter 2016-11-02 13:45:18 UTC
Created attachment 176429
second crash, part 1
Comment 9 Adam Baxter 2016-11-02 13:45:51 UTC
Created attachment 176430
second crash, part 2
Comment 10 Adam Baxter 2016-11-02 13:46:48 UTC
Created attachment 176431
second crash, part 3
Comment 11 Bjoern A. Zeeb freebsd_committer 2018-08-30 08:58:47 UTC
Recently I have been getting this almost every night when trying to move data to an external disk:

xhci0: Resetting controller
(da1:umass-sim1:1:0:0): WRITE(10). CDB: 2a 00 00 00 e1 d0 00 00 10 00
(da1:umass-sim1:1:0:0): CAM status: CCB request completed with an error
(da1:umass-sim1:1:0:0): Retrying command
(da1:umass-sim1:1:0:0): WRITE(10). CDB: 2a 00 00 00 e1 d0 00 00 10 00
(da1:umass-sim1:1:0:0): CAM status: CCB request completed with an error
(da1:umass-sim1:1:0:0): Retrying command
uhub1: at usbus1, port 1, addr 1 (disconnected)
ugen1.2: <Seagate Backup+ Hub> at usbus1 (disconnected)
uhub2: at uhub1, port 2, addr 1 (disconnected)
ugen1.3: <Seagate Backup+  Desk> at usbus1 (disconnected)
umass1: at uhub1, port 5, addr 2 (disconnected)
(da1:umass-sim1:1:0:0): WRITE(10). CDB: 2a 00 00 00 e1 d0 00 00 10 00
(da1:umass-sim1:1:0:0): CAM status: CCB request completed with an error
(da1:umass-sim1:1:0:0): Retrying command
da1 at umass-sim1 bus 1 scbus8 target 0 lun 0
...


Neither the plugged-in media nor the kernel has changed in a few weeks, however, so I'm not sure what triggered it.

The follow-up is a geom/file system panic.

Is there a way to find out why xhci is resetting?
Comment 12 Hans Petter Selasky freebsd_committer 2018-08-30 10:06:43 UTC
> Is there a way to find out why xhci is resetting?

Try to set:
hw.usb.xhci.debug=16

Or inspect the XHCI code for the reset call. There are currently a few cases that trigger a reset.

--HPS
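
Since the reset can take hours to occur, it may help to capture the console to a file with hw.usb.xhci.debug=16 set and then look only at what was logged just before each reset. The following is a minimal sketch of that idea, not part of the driver or of any FreeBSD tool. It assumes the reset is announced by a line like "xhci0: Resetting controller", as in the log quoted in comment 11, and that the capture may contain trailing "^M" markers. The names clean and reset_contexts and the default of 20 context lines are hypothetical choices made for illustration.

```python
import re
from collections import deque

# Matches the reset message seen in the console log quoted in comment 11.
# The exact wording is taken from that log; other kernel versions may differ.
RESET_RE = re.compile(r"xhci\d+: Resetting controller")


def clean(line):
    """Strip newline characters and literal '^M' markers left by console capture."""
    line = line.rstrip("\r\n")
    while line.endswith("^M"):
        line = line[:-2]
    return line.rstrip()


def reset_contexts(lines, before=20):
    """Return one (line_number, preceding_lines) tuple per reset message.

    line_number is 1-based. preceding_lines holds up to `before` cleaned
    lines that were logged immediately before the reset message.
    """
    recent = deque(maxlen=before)  # rolling window of the latest lines
    found = []
    for number, raw in enumerate(lines, start=1):
        line = clean(raw)
        if RESET_RE.search(line):
            found.append((number, list(recent)))
        recent.append(line)
    return found


# Hypothetical usage with a captured console log named console.log:
#
#     with open("console.log", errors="replace") as log:
#         for number, context in reset_contexts(log):
#             print(f"--- reset at line {number} ---")
#             print("\n".join(context))
```

With debugging enabled, the lines returned for each reset should be the driver messages that led up to it, which narrows down which of the reset cases in the XHCI code was hit.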