Bug 13433

Summary: 'camcontrol defects' does not work with IBM drives
Product: Base System
Reporter: dkelly <dkelly>
Component: kern
Assignee: Kenneth D. Merry <ken>
Status: Closed FIXED    
Severity: Affects Only Me
CC: dkelly
Priority: Normal    
Version: 3.2-STABLE   
Hardware: Any   
OS: Any   

Description dkelly 1999-08-28 04:10:01 UTC
Can't read defect lists from IBM SCSI drives.
I don't have non-IBM drives to try.
Same problem with Adaptec and Symbios controllers. /var/log/messages
might have slightly different error codes. See below for Symbios/NCR
example.

Fix: 

I don't have a clue. Purchase non-IBM drives?
How-To-Repeat: 
% su -m
# camcontrol defects -f block
error reading defect list: Input/output error
# 

# tail -5 /var/log/messages
Aug 27 21:28:14 nospam su: dkelly to root on /dev/ttyp1
Aug 27 21:28:33 nospam /kernel: (pass2:ncr0:0:0:0): extraneous data discarded.
Aug 27 21:28:33 nospam /kernel: (pass2:ncr0:0:0:0): COMMAND FAILED (9 80) @0xc0a84400.
Aug 27 21:28:33 nospam /kernel: (pass2:ncr0:0:0:0): extraneous data discarded.
Aug 27 21:28:33 nospam /kernel: (pass2:ncr0:0:0:0): COMMAND FAILED (9 80) @0xc0a84400.
Comment 1 ken 1999-08-28 05:34:40 UTC
dkelly@hiwaay.net wrote...
> 
> >Number:         13433
> >Category:       kern
> >Synopsis:       'camcontrol defects' does not work with IBM drives
> >Confidential:   no
> >Severity:       serious
> >Priority:       medium
> >Responsible:    freebsd-bugs
> >State:          open
> >Quarter:        
> >Keywords:       
> >Date-Required:
> >Class:          sw-bug
> >Submitter-Id:   current-users
> >Arrival-Date:   Fri Aug 27 20:10:01 PDT 1999
> >Closed-Date:
> >Last-Modified:
> >Originator:     David Kelly
> >Release:        FreeBSD 3.2-STABLE i386
> >Organization:
> n/a
> >Environment:
> 
> 	
> 
> >Description:
> 
> Can't read defect lists from IBM SCSI drives.
> I don't have non-IBM drives to try.
> Same problem with Adaptec and Symbios controllers. /var/log/messages
> might have slightly different error codes. See below for Symbios/NCR
> example.
> 
> >How-To-Repeat:
> 
> % su -m
> # camcontrol defects -f block
> error reading defect list: Input/output error
> # 
> 
> # tail -5 /var/log/messages
> Aug 27 21:28:14 nospam su: dkelly to root on /dev/ttyp1
> Aug 27 21:28:33 nospam /kernel: (pass2:ncr0:0:0:0): extraneous data discarded.
> Aug 27 21:28:33 nospam /kernel: (pass2:ncr0:0:0:0): COMMAND FAILED (9 80) @0xc0a84400.
> Aug 27 21:28:33 nospam /kernel: (pass2:ncr0:0:0:0): extraneous data discarded.
> Aug 27 21:28:33 nospam /kernel: (pass2:ncr0:0:0:0): COMMAND FAILED (9 80) @0xc0a84400.
> 
> >Fix:
> 	
> I don't have a clue. Purchase non-IBM drives?

You can read the defects list on an IBM drive.  Here's an example:

========================================================================
# uname -rs                        
FreeBSD 3.2-STABLE
# camcontrol inquiry da1           
pass1: <IBM DDRS-39130 S97B> Fixed Direct Access SCSI-2 device 
pass1: Serial Number RE2B9804        
pass1: 20.000MB/s transfers (20.000MHz, offset 15), Tagged Queueing Enabled
# camcontrol defects da1 -f phys -G
Got 4 defects:
189:6:72
189:6:73
189:6:74
189:6:75
# camcontrol defects da1 -v -PG -f phys
(pass1:ahc0:0:1:0): READ DEFECT DATA(10). CDB: 37 0 1d 0 0 0 0 fd e8 0 
(pass1:ahc0:0:1:0): error code 0
Got 457 defects:
38:4:188
38:4:189
38:4:190
38:4:191
38:4:192
38:4:193
38:4:194
38:4:195
[ .... lots more .... ]
========================================================================
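For anyone who wants to post-process the physical sector output shown above, here is a minimal sketch in Python. It assumes each defect line is a cylinder:head:sector triple, as in the example; the names PhysDefect and parse_phys_defects are made up for illustration and are not part of camcontrol.

```python
from typing import List, NamedTuple


class PhysDefect(NamedTuple):
    """One physical sector defect, assumed to be cylinder:head:sector."""
    cylinder: int
    head: int
    sector: int


def parse_phys_defects(output: str) -> List[PhysDefect]:
    """Collect cylinder:head:sector triples from 'camcontrol defects -f phys' output."""
    defects = []
    for line in output.splitlines():
        parts = line.strip().split(":")
        # Skip the "Got N defects:" banner, -v diagnostics, and anything else
        # that is not exactly three decimal fields.
        if len(parts) == 3 and all(p.isdigit() for p in parts):
            defects.append(PhysDefect(*map(int, parts)))
    return defects


sample = """Got 4 defects:
189:6:72
189:6:73
189:6:74
189:6:75
"""
defects = parse_phys_defects(sample)
print(len(defects), defects[0])
```

The banner and the verbose (pass1:ahc0:...) lines are ignored because they do not split into exactly three numeric fields.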

There are several things to mention here:

- The NCR driver, at this point, is known to be a little flaky sometimes,
  and it's difficult to interpret the error messages from it.  It would be
  helpful if you show the output from one of your Adaptec controllers.  The
  IBM disk in the above example is on an Adaptec 2940UW (7880) board.

- The only disks I've seen that will return defects in block format are
  Quantum disks.  IBM and Seagate disks generally will not.  They will,
  however, return defects in physical sector format, as I demonstrated above.

- In your example command above, you did not specify either the GLIST or
  PLIST.  That can cause problems with some disks.  You may want to specify
  both the GLIST and PLIST (-PG).  I have a Seagate disk that doesn't seem
  to want to return any defects unless both are specified.

- You should specify the -v switch to camcontrol so you have a chance of
  getting SCSI sense information when the command fails.
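
To see how the list and format options end up on the wire, the CDB printed by -v in the example above can be decoded by hand. The following is a rough sketch, assuming the SCSI-2 READ DEFECT DATA(10) layout (opcode 0x37, PLIST and GLIST bits plus a 3-bit format code in byte 2, allocation length in bytes 7-8). The function name is hypothetical, and this is not how camcontrol itself is written.

```python
from typing import Dict

# Defect list format codes from the SCSI-2 READ DEFECT DATA(10) command,
# labelled with the matching camcontrol -f argument.
FORMATS = {0b000: "block", 0b100: "bfi", 0b101: "phys"}


def decode_read_defect_cdb(cdb_text: str) -> Dict[str, object]:
    """Decode a READ DEFECT DATA(10) CDB printed as space-separated hex bytes."""
    cdb = [int(b, 16) for b in cdb_text.split()]
    if len(cdb) != 10 or cdb[0] != 0x37:
        raise ValueError("not a 10-byte READ DEFECT DATA(10) CDB")
    flags = cdb[2]
    return {
        "plist": bool(flags & 0x10),   # bit 4: primary (factory) list requested
        "glist": bool(flags & 0x08),   # bit 3: grown list requested
        "format": FORMATS.get(flags & 0x07, "reserved"),
        "alloc_len": (cdb[7] << 8) | cdb[8],  # big-endian allocation length
    }


print(decode_read_defect_cdb("37 0 1d 0 0 0 0 fd e8 0"))
```

For the CDB in the example, byte 2 is 0x1d, so both lists are requested in physical sector format, with an allocation length of 65000 bytes.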

I won't deny that there may be a problem with getting the defect lists off
drives in some cases, but I will say that I haven't seen many problems
personally.

I'll need some more information (as outlined above) to get an idea of what
may be wrong here.
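
The suggestions above can be collected into one command line. This is only a sketch: the helper name defects_argv is hypothetical, and it uses nothing beyond the device name and the -v, -f, -P and -G options already shown in this PR.

```python
from typing import List


def defects_argv(device: str, fmt: str = "phys", plist: bool = True,
                 glist: bool = True, verbose: bool = True) -> List[str]:
    """Build an argument list for 'camcontrol defects' following the advice above."""
    if fmt not in ("block", "bfi", "phys"):
        raise ValueError("unknown defect list format: " + fmt)
    argv = ["camcontrol", "defects", device]
    if verbose:
        argv.append("-v")          # gives a chance of seeing SCSI sense data
    argv += ["-f", fmt]            # phys is the format most drives accept
    if plist:
        argv.append("-P")          # request the PLIST
    if glist:
        argv.append("-G")          # request the GLIST
    return argv


print(" ".join(defects_argv("da1")))
```

The resulting list could be handed to subprocess.run() on a machine that has camcontrol; it is only printed here.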

Ken
-- 
Kenneth Merry
ken@kdm.org
Comment 2 dkelly 1999-08-29 19:41:47 UTC
"Kenneth D. Merry" writes:
> You can read the defects list on an IBM drive.  Here's an example:
> 
> ========================================================================
> # uname -rs                        
> FreeBSD 3.2-STABLE
> # camcontrol inquiry da1           
> pass1: <IBM DDRS-39130 S97B> Fixed Direct Access SCSI-2 device 
> pass1: Serial Number RE2B9804        
> pass1: 20.000MB/s transfers (20.000MHz, offset 15), Tagged Queueing Enabled
> # camcontrol defects da1 -f phys -G
> Got 4 defects:
> 189:6:72
> 189:6:73
> 189:6:74
> 189:6:75
> # camcontrol defects da1 -v -PG -f phys
> (pass1:ahc0:0:1:0): READ DEFECT DATA(10). CDB: 37 0 1d 0 0 0 0 fd e8 0 
> (pass1:ahc0:0:1:0): error code 0
> Got 457 defects:
> 38:4:188
> 38:4:189
> 38:4:190
> 38:4:191
> 38:4:192
> 38:4:193
> 38:4:194
> 38:4:195
> [ .... lots more .... ]
> ========================================================================
> 
> There are several things to mention here:
> 
> - The NCR driver, at this point, is known to be a little flaky sometimes,
>   and it's difficult to interpret the error messages from it.  It would be
>   helpful if you show the output from one of your Adaptec controllers.  The
>   IBM disk in the above example is on an Adaptec 2940UW (7880) board.

The DCAS and DDRS drives which are connected to Adaptec controllers are 
at work, beyond my reach at the moment. I'll try the exact example you 
provide... tomorrow.

Meanwhile here is what I get when attempting the slightly different 
command lines provided (Symbios/NCR '875, DCHS):

# camcontrol inquiry
pass2: <IBM OEM DCHS09W 2222> Fixed Direct Access SCSI-2 device 
pass2: Serial Number         68210913
pass2: 20.000MB/s transfers (10.000MHz, offset 15, 16bit), Tagged Queueing Enabled
# camcontrol defects -f bfi
error reading defect list: Input/output error
# camcontrol defects -f phys
error reading defect list: Input/output error
# camcontrol defects da0 -v -PG -f phys
error reading defect list: Input/output error
CAM status is 0

tail -4 /var/log/messages says:
Aug 29 13:23:25 nospam /kernel: (pass2:ncr0:0:0:0): extraneous data discarded.
Aug 29 13:23:27 nospam /kernel: (pass2:ncr0:0:0:0): COMMAND FAILED (9 0) @0xc0a84400.
Aug 29 13:23:27 nospam /kernel: (pass2:ncr0:0:0:0): extraneous data discarded.
Aug 29 13:23:27 nospam /kernel: (pass2:ncr0:0:0:0): COMMAND FAILED (9 0) @0xc0a84400.

The "camcontrol defects da0 -v -PG -f phys" is notable because there 
was a 2 second delay not present in other attempts.

Further experiments suggest -P causes the delay.

> - The only disks I've seen that will return defects in block format are
>   Quantum disks.  IBM and Seagate disks generally will not.  They will,
>   however, return defects in physical sector format, as I demonstrated above.

I used "block" in my earlier example simply because 1) there was no 
default format, and 2) "block" was listed first in camcontrol(1).

> - In your example command above, you did not specify either the GLIST or
>   PLIST.  That can cause problems with some disks.  You may want to specify
>   both the GLIST and PLIST (-PG).  I have a Seagate disk that doesn't seem
>   to want to return any defects unless both are specified.
> 
> - You should specify the -v switch to camcontrol so you have a chance of
>   getting SCSI sense information when the command fails.

"CAM status is 0" appears to be all -v does in this case.

> I won't deny that there may be a problem with getting the defect lists off
> drives in some cases, but I will say that I haven't seen many problems
> personally.
> 
> I'll need some more information (as outlined above) to get an idea of what
> may be wrong here.

I used to have a spare '875 card that I could carry to work to try there.
As things stand now, I believe I'll be bringing home some of the HDs I keep
at work (they are mine, not work's) to try. I got a great deal on DDRS
drives a while back and bought two where only one was called for.

I do have an old narrow 2940 here at home. It's driving my tape drives.
The DCHS is wide. Not sure how it would behave on a narrow controller 
if I had cables to connect it.

to be continued.

--
David Kelly N4HHE, dkelly@nospam.hiwaay.net
=====================================================================
The human mind ordinarily operates at only ten percent of its
capacity -- the rest is overhead for the operating system.
Comment 3 ken 1999-08-30 06:28:48 UTC
David Kelly wrote...
> "Kenneth D. Merry" writes:
> > You can read the defects list on an IBM drive.  Here's an example:
[ .... ]
> > There are several things to mention here:
> > 
> > - The NCR driver, at this point, is known to be a little flaky sometimes,
> >   and it's difficult to interpret the error messages from it.  It would be
> >   helpful if you show the output from one of your Adaptec controllers.  The
> >   IBM disk in the above example is on an Adaptec 2940UW (7880) board.
> 
> The DCAS and DDRS drives which are connected to Adaptec controllers are 
> at work, beyond my reach at the moment. I'll try the exact example you 
> provide... tomorrow.
> 
> Meanwhile here is what I get when attempting the slightly different 
> command lines provided (Symbios/NCR '875, DCHS):
> 
> # camcontrol inquiry
> pass2: <IBM OEM DCHS09W 2222> Fixed Direct Access SCSI-2 device 
> pass2: Serial Number         68210913
> pass2: 20.000MB/s transfers (10.000MHz, offset 15, 16bit), Tagged Queueing Enabled
> # camcontrol defects -f bfi
> error reading defect list: Input/output error
> # camcontrol defects -f phys
> error reading defect list: Input/output error

One thing to keep in mind with the above two commands is that, since neither
-P nor -G is given, at best they will only return the number of defects.
Even that doesn't work on some drives.
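
As a rough illustration of where such a count can come from, the sketch below works it out from the 4-byte header that precedes a READ DEFECT DATA(10) defect list. It assumes the SCSI-2 layout (flags in byte 1, big-endian list length in bytes 2-3) and the usual descriptor sizes (4 bytes for block format, 8 for bfi and phys). The function name is hypothetical and this is not a copy of camcontrol's code.

```python
import struct

# Bytes per defect descriptor, keyed by the 3-bit defect list format code.
DESCRIPTOR_SIZE = {0b000: 4, 0b100: 8, 0b101: 8}


def defect_count(header: bytes) -> int:
    """Return the number of defects implied by a READ DEFECT DATA(10) header."""
    if len(header) < 4:
        raise ValueError("header must be at least 4 bytes")
    # Byte 0 is reserved, byte 1 holds the PLIST/GLIST/format bits,
    # bytes 2-3 hold the defect list length in bytes (big-endian).
    _reserved, flags, length = struct.unpack(">BBH", header[:4])
    size = DESCRIPTOR_SIZE.get(flags & 0x07)
    if size is None:
        raise ValueError("reserved defect list format")
    return length // size


# A header claiming 3656 bytes of physical sector descriptors.
print(defect_count(bytes([0x00, 0x1D, 0x0E, 0x48])))
```

A drive that rejects the request outright never returns this header, which fits the Input/output error seen above.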

> # camcontrol defects da0 -v -PG -f phys
> error reading defect list: Input/output error
> CAM status is 0
> 
> tail -4 /var/log/messages says:
> Aug 29 13:23:25 nospam /kernel: (pass2:ncr0:0:0:0): extraneous data discarded.
> Aug 29 13:23:27 nospam /kernel: (pass2:ncr0:0:0:0): COMMAND FAILED (9 0) @0xc0a84400.
> Aug 29 13:23:27 nospam /kernel: (pass2:ncr0:0:0:0): extraneous data discarded.
> Aug 29 13:23:27 nospam /kernel: (pass2:ncr0:0:0:0): COMMAND FAILED (9 0) @0xc0a84400.
> 
> The "camcontrol defects da0 -v -PG -f phys" is notable because there 
> was a 2 second delay not present in other attempts.
> 
> Further experiments suggest -P causes the delay.
> 
> > - The only disks I've seen that will return defects in block format are
> >   Quantum disks.  IBM and Seagate disks generally will not.  They will,
> >   however, return defects in physical sector format, as I demonstrated above.
> 
> I used "block" in my earlier example simply because 1) there was no 
> default format, and 2) "block" was listed first in camcontrol(1).

Hmm, maybe I should put stronger wording in the man page to warn people
that many drives don't support block format.

> > - In your example command above, you did not specify either the GLIST or
> >   PLIST.  That can cause problems with some disks.  You may want to specify
> >   both the GLIST and PLIST (-PG).  I have a Seagate disk that doesn't seem
> >   to want to return any defects unless both are specified.
> > 
> > - You should specify the -v switch to camcontrol so you have a chance of
> >   getting SCSI sense information when the command fails.
> 
> "CAM status is 0" appears to be all -v does in this case.

Okay, that means we're not getting any error codes back.

> > I won't deny that there may be a problem with getting the defect lists off
> > drives in some cases, but I will say that I haven't seen many problems
> > personally.
> > 
> > I'll need some more information (as outlined above) to get an idea of what
> > may be wrong here.
> 
> I used to have a spare '875 card that I could carry to work to try there.
> As things stand now, I believe I'll be bringing home some of the HDs I keep
> at work (they are mine, not work's) to try. I got a great deal on DDRS
> drives a while back and bought two where only one was called for.

It would be better if you could give me results from an Adaptec controller,
not more results from an NCR controller.  I have reason to suspect the NCR
driver may be the problem here.

> I do have an old narrow 2940 here at home. It's driving my tape drives.
> The DCHS is wide. Not sure how it would behave on a narrow controller 
> if I had cables to connect it.

It should behave just fine, if you connect it up properly.

Based on what other folks have said, I think there may be something with
the NCR driver that's causing the problem.  Other folks have been able to
read defect lists just fine with various drives (IBMs included) with
Adaptec, BusLogic and Advansys controllers.

Steinar Haug reports that reading defects with an Adaptec controller works
fine, but it doesn't work with an NCR controller.

Ken
-- 
Kenneth Merry
ken@kdm.org
Comment 4 Kenneth D. Merry 1999-09-27 06:58:19 UTC
Responsible Changed
From-To: freebsd-bugs->ken

I'll handle this one. 
Comment 5 Kenneth D. Merry 1999-09-27 07:19:04 UTC
State Changed
From-To: open->feedback

David - Gerard Roudier traced this down to a bug in camcontrol.  I checked 
the fix in to src/sbin/camcontrol/camcontrol.c in revision 1.15 in -current 
and revision 1.9.2.5 in -stable.  Please grab the appropriate version 
and see if this fixes your problem.  Let me know whether it works or not. 

Comment 6 Kenneth D. Merry 1999-09-28 04:30:11 UTC
State Changed
From-To: feedback->closed

The PR submitter confirms that the camcontrol fixes mentioned in the last
part of the PR fix his problem.