Bug 176417 - [xhci][cam][umass] kernelpanic while removing plugged in disk
Summary: [xhci][cam][umass] kernelpanic while removing plugged in disk
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: usb
Version: 9.0-RELEASE
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-usb mailing list
Reported: 2013-02-25 11:20 UTC by Wouter Oosterveld
Modified: 2019-01-09 19:22 UTC
Description Wouter Oosterveld 2013-02-25 11:20:00 UTC
Symptoms:

Messages of USB disk getting removed and rediscovered while still plugged in. 

Kernel panic while 'removing' the device.

Hardware:

Sweex 2 Port USB 3.0 Card PCI Express (cheap)
Chip reads D720200F1 / 1119KU603 / Japan

Western Digital "My Book Essential" with large >1TB disk.

Screenshot: https://www.dropbox.com/s/bol6fxl37e7mu3g/IMAG0478.jpg

Note:

Machine is a ZFS fileserver with 16TB raidz2 array.

How-To-Repeat: Plug in disk and see it getting 'removed' and rediscovered while being plugged in.
Comment 1 Hans Petter Selasky 2013-02-25 11:33:30 UTC
Hi,

This does not look like a USB issue. It is related to the CAM/SCSI layer. mav@ CC'ed

--HPS

On Monday 25 February 2013 12:19:18 Wouter Oosterveld wrote:
> >Number:         176417
> >Category:       usb
> >Synopsis:       [xhci][cam][umass] kernelpanic while removing plugged in disk
> >Confidential:   no
> >Severity:       non-critical
> >Priority:       low
> >Responsible:    freebsd-usb
> >State:          open
> >Quarter:
> >Keywords:
> >Date-Required:
> >Class:          sw-bug
> >Submitter-Id:   current-users
> >Arrival-Date:   Mon Feb 25 11:20:00 UTC 2013
> >Closed-Date:
> >Last-Modified:
> >Originator:     Wouter Oosterveld
> >Release:        FreeBSD backup01-trimm.trimm.net 9.0-RELEASE-p3 FreeBSD
> >9.0-RELEASE-p3 #0: Tue Jun 12 02:52:29 UTC 2012    
> >root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64
> 
> >Organization:
> TriMM Multimedia
> 
> >Environment:
> FreeBSD backup01-trimm.trimm.net 9.0-RELEASE-p3 FreeBSD 9.0-RELEASE-p3 #0:
> Tue Jun 12 02:52:29 UTC 2012    
> root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64
> 
> >Description:
> Symptoms:
> 
> Messages of USB disk getting removed and rediscovered while still plugged
> in.
> 
> Kernelpanic while 'removing' device.
> 
> Hardware:
> 
> Sweex 2 Port USB 3.0 Card PCI Express (cheap)
> Chip reads D720200F1 / 1119KU603 / Japan
> 
> Western Digital "My Book Essential" with large >1TB disk.
> 
> Screenshot: https://www.dropbox.com/s/bol6fxl37e7mu3g/IMAG0478.jpg
> 
> Note:
> 
> Machine is a ZFS fileserver with 16TB raidz2 array.
> 
> >How-To-Repeat:
> Plug in disk and see it getting 'removed' and rediscovered while being
> plugged in.
> 
> >Fix:
> >
> >
> >Release-Note:
> >Audit-Trail:
> 
> >Unformatted:
Comment 2 wouter.oosterveld 2013-02-25 14:09:12 UTC
Please note that all this happened while the drive was plugged in.

Is the power-save feature of USB 3.0 (if any) not understood by the kernel or something?

-Wouter

Comment 3 Hans Petter Selasky 2013-02-25 14:19:53 UTC
On Monday 25 February 2013 15:09:12 Wouter Oosterveld wrote:
> Please note that all this happened while the drive was plugged in.
> 
> Is the power-save feature of USB 3.0 (if any) not understood by the kernel or
> something?
> 
> -Wouter

How long was the drive plugged in when this happened? Usually a sudden disconnect
means one or more of:

- USB device firmware crashed
- Cable issue
- Power loss or insufficient power

--HPS
Comment 4 wouter.oosterveld 2013-02-25 14:25:47 UTC
>On Monday 25 February 2013 15:09:12 Wouter Oosterveld wrote:
>> Please note that all this happened while the drive was plugged in.
>>
>> Is the power-save feature of USB 3.0 (if any) not understood by the
>> kernel or something?
>>
>> -Wouter

>How long was the drive plugged in when this happened? Usually a sudden disconnect means one or more of:

It was plugged in for 4 days.

>- USB device firmware crashed

Could be. We use the same type of drive on Windows servers and have not seen this there yet.

Not tested with this specific drive. Will do.

>- Cable issue

Could be.

Not tested; will do.

>- Power loss or insufficient power

The drive has its own power supply. The LED is blinking as if in "no connection" or "power save" mode (conjecture).

>--HPS

Comment 5 Eitan Adler freebsd_committer freebsd_triage 2017-12-31 08:00:33 UTC
For bugs matching the following criteria:

Status: In Progress Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped
Comment 6 Warner Losh freebsd_committer 2019-01-09 19:22:44 UTC
OK. Looking at the stack trace, we get a panic because we're trying to add 'pass16' as a device when a pass16 device already exists.
This appears to be because the CAM probe code thinks it needs to enumerate the device, but the reporter said it was being removed.

The actual traceback, in case we lose the Dropbox link, is:

da12: xxx MB (xxx 512 byte sectors: <made up geom>)
ugen3.2: <Western Digital> at usbus3 (disconnected)
umass0: at ushub3 port 1, addr 1 (disconnected)
(da12:umass-sim0:0:0:0): lost device - 1 outstanding
(da12:umass-sim0:0:0:0): outstanding 0
(da12:panic: make_dev_credv: bad si_name (error = 17, si_name=pass16)
umass-sim0:0: cpuid=1
0:KDB: stack backtrace:
0): removing device entry
kdb_backtrace
panic
make_dev_credv+0x1dc
make_dev+0x6f
passregister+0x1fe
cam_periph_alloc+0x569
passasync
...

So we have intermixed output, strongly suggesting that one thread is removing the device while another thread is adding it.

There has been a fair amount of locking tightening that would prevent the refcount from falling to 0, I think, but I can't be sure.

This is purely a CAM problem. I wonder if the original poster can
still recreate this issue, or if we can provoke it somehow.

But this CAM problem is triggered by a random failure during discovery, where the drive also goes away and we start the teardown, so from that perspective it may also be a USB issue.
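
For readers who want to see the suspected ordering in isolation, here is a minimal user-space sketch in Python. It is not the CAM or devfs code: the `DevNameRegistry` class, its methods, and `replay_race` are hypothetical names made up for illustration, and the reference-counting behaviour is an assumption based on the description above. The only fact taken from the trace is that `error = 17` is EEXIST on FreeBSD, which is what a duplicate `pass16` name would produce.

```python
import errno


class DevNameRegistry:
    """Toy stand-in for a device-name namespace (hypothetical, not devfs/CAM)."""

    def __init__(self):
        self._refs = {}  # device name -> reference count

    def create(self, name):
        # A duplicate name is rejected with EEXIST, like the error in the panic.
        if name in self._refs:
            return errno.EEXIST
        self._refs[name] = 1
        return 0

    def hold(self, name):
        # Something (e.g. an outstanding command) keeps the entry alive.
        self._refs[name] += 1

    def release(self, name):
        # The name is only freed once the last reference is dropped.
        self._refs[name] -= 1
        if self._refs[name] == 0:
            del self._refs[name]


def replay_race():
    """Replay the assumed ordering: teardown has started but not finished
    when rediscovery tries to register the same name again."""
    reg = DevNameRegistry()
    results = [reg.create("pass16")]      # initial attach succeeds
    reg.hold("pass16")                    # one outstanding reference
    reg.release("pass16")                 # teardown begins, entry still held
    results.append(reg.create("pass16"))  # rediscovery collides with old entry
    reg.release("pass16")                 # teardown finishes, name is freed
    results.append(reg.create("pass16"))  # a later attach succeeds again
    return results


if __name__ == "__main__":
    print(replay_race())
```

In this toy model the second registration fails only because it runs before the old entry's last reference is dropped; whether that is what actually happened in the 9.0 kernel is exactly the open question in this bug.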