| Summary: | 'camcontrol defects' does not work with IBM drives | ||
|---|---|---|---|
| Product: | Base System | Reporter: | dkelly <dkelly> |
| Component: | kern | Assignee: | Kenneth D. Merry <ken> |
| Status: | Closed FIXED | ||
| Severity: | Affects Only Me | CC: | dkelly |
| Priority: | Normal | ||
| Version: | 3.2-STABLE | ||
| Hardware: | Any | ||
| OS: | Any | ||
|
Description
dkelly
1999-08-28 04:10:01 UTC
dkelly@hiwaay.net wrote...
>
> >Number: 13433
> >Category: kern
> >Synopsis: 'camcontrol defects' does not work with IBM drives
> >Confidential: no
> >Severity: serious
> >Priority: medium
> >Responsible: freebsd-bugs
> >State: open
> >Quarter:
> >Keywords:
> >Date-Required:
> >Class: sw-bug
> >Submitter-Id: current-users
> >Arrival-Date: Fri Aug 27 20:10:01 PDT 1999
> >Closed-Date:
> >Last-Modified:
> >Originator: David Kelly
> >Release: FreeBSD 3.2-STABLE i386
> >Organization:
> n/a
> >Environment:
>
>
>
> >Description:
>
> Can't read defect lists from IBM SCSI drives.
> I don't have non-IBM drives to try.
> Same problem with Adaptec and Symbios controllers. /var/log/messages
> might have slightly different error codes. See below for Symbios/NCR
> example.
>
> >How-To-Repeat:
>
> % su -m
> # camcontrol defects -f block
> error reading defect list: Input/output error
> #
>
> # tail -5 /var/log/messages
> Aug 27 21:28:14 nospam su: dkelly to root on /dev/ttyp1
> Aug 27 21:28:33 nospam /kernel: (pass2:ncr0:0:0:0): extraneous data discarded.
> Aug 27 21:28:33 nospam /kernel: (pass2:ncr0:0:0:0): COMMAND FAILED (9 80) @0xc0a84400.
> Aug 27 21:28:33 nospam /kernel: (pass2:ncr0:0:0:0): extraneous data discarded.
> Aug 27 21:28:33 nospam /kernel: (pass2:ncr0:0:0:0): COMMAND FAILED (9 80) @0xc0a84400.
>
> >Fix:
>
> I don't have a clue. Purchase non-IBM drives?

You can read the defects list on an IBM drive. Here's an example:

========================================================================
# uname -rs
FreeBSD 3.2-STABLE
# camcontrol inquiry da1
pass1: <IBM DDRS-39130 S97B> Fixed Direct Access SCSI-2 device
pass1: Serial Number RE2B9804
pass1: 20.000MB/s transfers (20.000MHz, offset 15), Tagged Queueing Enabled
# camcontrol defects da1 -f phys -G
Got 4 defects:
189:6:72
189:6:73
189:6:74
189:6:75
# camcontrol defects da1 -v -PG -f phys
(pass1:ahc0:0:1:0): READ DEFECT DATA(10). CDB: 37 0 1d 0 0 0 0 fd e8 0
(pass1:ahc0:0:1:0): error code 0
Got 457 defects:
38:4:188
38:4:189
38:4:190
38:4:191
38:4:192
38:4:193
38:4:194
38:4:195
[ .... lots more .... ]
========================================================================

There are several things to mention here:

- The NCR driver, at this point, is known to be a little flaky sometimes,
  and it's difficult to interpret the error messages from it. It would be
  helpful if you show the output from one of your Adaptec controllers. The
  IBM disk on in the above example is on an Adaptec 2940UW (7880) board.

- The only disks I've seen that will return defects in block format are
  Quantum disks. IBM and Seagate disks generally will not. They will,
  however, return defects in physical sector format, as I demonstrated above.

- In your example command above, you did not specify either the GLIST or
  PLIST. That can cause problems with some disks. You may want to specify
  both the GLIST and PLIST (-PG). I have a Seagate disk that doesn't seem
  to want to return any defects unless both are specified.

- You should specify the -v switch to camcontrol so you have a chance of
  getting SCSI sense information when the command fails.

I won't deny that there may be a problem with getting the defect lists off
drives in some cases, but I will say that I haven't seen many problems
personally.

I'll need some more information (as outlined above) to get an idea of what
may be wrong here.

Ken
--
Kenneth Merry
ken@kdm.org

"Kenneth D. Merry" writes:
> You can read the defects list on an IBM drive. Here's an example:
>
> ========================================================================
> # uname -rs
> FreeBSD 3.2-STABLE
> # camcontrol inquiry da1
> pass1: <IBM DDRS-39130 S97B> Fixed Direct Access SCSI-2 device
> pass1: Serial Number RE2B9804
> pass1: 20.000MB/s transfers (20.000MHz, offset 15), Tagged Queueing Enabled
> # camcontrol defects da1 -f phys -G
> Got 4 defects:
> 189:6:72
> 189:6:73
> 189:6:74
> 189:6:75
> # camcontrol defects da1 -v -PG -f phys
> (pass1:ahc0:0:1:0): READ DEFECT DATA(10). CDB: 37 0 1d 0 0 0 0 fd e8 0
> (pass1:ahc0:0:1:0): error code 0
> Got 457 defects:
> 38:4:188
> 38:4:189
> 38:4:190
> 38:4:191
> 38:4:192
> 38:4:193
> 38:4:194
> 38:4:195
> [ .... lots more .... ]
> ========================================================================
>
> There are several things to mention here:
>
> - The NCR driver, at this point, is known to be a little flaky sometimes,
> and it's difficult to interpret the error messages from it. It would be
> helpful if you show the output from one of your Adaptec controllers. The
> IBM disk on in the above example is on an Adaptec 2940UW (7880) board.

The DCAS and DDRS drives which are connected to Adaptec controllers are
at work, beyond my reach at the moment. I'll try the exact example you
provide... tomorrow.

Meanwhile here is what I get when attempting the slightly different
command lines provided (Symbios/NCR '875, DCHS):

# camcontrol inquiry
pass2: <IBM OEM DCHS09W 2222> Fixed Direct Access SCSI-2 device
pass2: Serial Number 68210913
pass2: 20.000MB/s transfers (10.000MHz, offset 15, 16bit), Tagged Queueing Enabled
# camcontrol defects -f bfi
error reading defect list: Input/output error
# camcontrol defects -f phys
error reading defect list: Input/output error
# camcontrol defects da0 -v -PG -f phys
error reading defect list: Input/output error
CAM status is 0

tail -4 /var/log/messages says:
Aug 29 13:23:25 nospam /kernel: (pass2:ncr0:0:0:0): extraneous data discarded.
Aug 29 13:23:27 nospam /kernel: (pass2:ncr0:0:0:0): COMMAND FAILED (9 0) @0xc0a84400.
Aug 29 13:23:27 nospam /kernel: (pass2:ncr0:0:0:0): extraneous data discarded.
Aug 29 13:23:27 nospam /kernel: (pass2:ncr0:0:0:0): COMMAND FAILED (9 0) @0xc0a84400.

The "camcontrol defects da0 -v -PG -f phys" is notable because there
was a 2 second delay not present in other attempts.

Further experiments suggest -P causes the delay.

> - The only disks I've seen that will return defects in block format are
> Quantum disks. IBM and Seagate disks generally will not. They will,
> however, return defects in physical sector format, as I demonstrated above.

I used "block" in my earlier example simply because 1) there was no
default format, and 2) "block" was listed first in camcontrol(1).

> - In your example command above, you did not specify either the GLIST or
> PLIST. That can cause problems with some disks. You may want to specify
> both the GLIST and PLIST (-PG). I have a Seagate disk that doesn't seem
> to want to return any defects unless both are specified.
>
> - You should specify the -v switch to camcontrol so you have a chance of
> getting SCSI sense information when the command fails.

"CAM status is 0" appears to be all -v does in this case.

> I won't deny that there may be a problem with getting the defect lists off
> drives in some cases, but I will say that I haven't seen many problems
> personally.
>
> I'll need some more information (as outlined above) to get an idea of what
> may be wrong here.

Used to have a spare '875 card that I could carry to work to try there.
But the way things are now I believe I'll be carrying some of my HD's
at work (they are mine, not work's) home to try. Got a great deal on
DDRS drives a while back and bought two where only one was called for.

I do have an old narrow 2940 here at home. Its driving my tape drives.
The DCHS is wide. Not sure how it would behave on a narrow controller
if I had cables to connect it.

to be continued.

--
David Kelly N4HHE, dkelly@nospam.hiwaay.net
=====================================================================
The human mind ordinarily operates at only ten percent of its
capacity -- the rest is overhead for the operating system.

David Kelly wrote...
> "Kenneth D. Merry" writes:
> > You can read the defects list on an IBM drive. Here's an example:

[ .... ]

> > There are several things to mention here:
> >
> > - The NCR driver, at this point, is known to be a little flaky sometimes,
> > and it's difficult to interpret the error messages from it. It would be
> > helpful if you show the output from one of your Adaptec controllers. The
> > IBM disk on in the above example is on an Adaptec 2940UW (7880) board.
>
> The DCAS and DDRS drives which are connected to Adaptec controllers are
> at work, beyond my reach at the moment. I'll try the exact example you
> provide... tomorrow.
>
> Meanwhile here is what I get when attempting the slightly different
> command lines provided (Symbios/NCR '875, DCHS):
>
> # camcontrol inquiry
> pass2: <IBM OEM DCHS09W 2222> Fixed Direct Access SCSI-2 device
> pass2: Serial Number 68210913
> pass2: 20.000MB/s transfers (10.000MHz, offset 15, 16bit), Tagged Queueing Enabled
> # camcontrol defects -f bfi
> error reading defect list: Input/output error
> # camcontrol defects -f phys
> error reading defect list: Input/output error

One thing to keep in mind with the above two commands is that at best, it
will only return the number of defects. It doesn't work on some drives.

> # camcontrol defects da0 -v -PG -f phys
> error reading defect list: Input/output error
> CAM status is 0
>
> tail -4 /var/log/messages says:
> Aug 29 13:23:25 nospam /kernel: (pass2:ncr0:0:0:0): extraneous data discarded.
> Aug 29 13:23:27 nospam /kernel: (pass2:ncr0:0:0:0): COMMAND FAILED (9 0) @0xc0a84400.
> Aug 29 13:23:27 nospam /kernel: (pass2:ncr0:0:0:0): extraneous data discarded.
> Aug 29 13:23:27 nospam /kernel: (pass2:ncr0:0:0:0): COMMAND FAILED (9 0) @0xc0a84400.
>
> The "camcontrol defects da0 -v -PG -f phys" is notable because there
> was a 2 second delay not present in other attempts.
>
> Further experiments suggest -P causes the delay.
>
> > - The only disks I've seen that will return defects in block format are
> > Quantum disks. IBM and Seagate disks generally will not. They will,
> > however, return defects in physical sector format, as I demonstrated above.
>
> I used "block" in my earlier example simply because 1) there was no
> default format, and 2) "block" was listed first in camcontrol(1).

Hmm, maybe I should put stronger wording in the man page to warn people
that many drives don't support block format.

> > - In your example command above, you did not specify either the GLIST or
> > PLIST. That can cause problems with some disks. You may want to specify
> > both the GLIST and PLIST (-PG). I have a Seagate disk that doesn't seem
> > to want to return any defects unless both are specified.
> >
> > - You should specify the -v switch to camcontrol so you have a chance of
> > getting SCSI sense information when the command fails.
>
> "CAM status is 0" appears to be all -v does in this case.

Okay, that means we're not getting any error codes back.

> > I won't deny that there may be a problem with getting the defect lists off
> > drives in some cases, but I will say that I haven't seen many problems
> > personally.
> >
> > I'll need some more information (as outlined above) to get an idea of what
> > may be wrong here.
>
> Used to have a spare '875 card that I could carry to work to try there.
> But the way things are now I believe I'll be carrying some of my HD's
> at work (they are mine, not work's) home to try. Got a great deal on
> DDRS drives a while back and bought two where only one was called for.

It would be better if you could give me results from an Adaptec controller,
not more results from an NCR controller. I have reason to suspect the NCR
driver may be the problem here.

> I do have an old narrow 2940 here at home. Its driving my tape drives.
> The DCHS is wide. Not sure how it would behave on a narrow controller
> if I had cables to connect it.

It should behave just fine, if you connect it up properly.

Based on what other folks have said, I think there may be something with
the NCR driver that's causing the problem. Other folks have been able to
read defect lists just fine with various drives (IBMs included) with
Adaptec, BusLogic and Advansys controllers.

Steinar Haug reports that reading defects with an Adaptec controller works
fine, but it doesn't work with an NCR controller.

Ken
--
Kenneth Merry
ken@kdm.org

Responsible Changed
From-To: freebsd-bugs->ken

I'll handle this one.

State Changed
From-To: open->feedback

David - Gerard Roudier traced this down to a bug in camcontrol. I checked
the fix in to src/sbin/camcontrol/camcontrol.c in revision 1.15 in -current
and revision 1.9.2.5 in -stable. Please grab the appropriate version and
see if this fixes your problem. Let me know whether it works or not.

State Changed
From-To: feedback->closed

PR submitter confirms that the fixes mentioned in the last part of the PR
to camcontrol fix his problem.
|
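For readers following the verbose output quoted in this thread, the CDB printed by `camcontrol defects da1 -v -PG -f phys` (`37 0 1d 0 0 0 0 fd e8 0`) can be decoded by hand. The sketch below is not part of camcontrol. It assumes the SCSI-2 READ DEFECT DATA(10) layout: opcode 0x37, PLIST in bit 4 and GLIST in bit 3 of byte 2, the defect list format in bits 2-0, and a 16-bit allocation length in bytes 7-8. The function name `decode_read_defect_cdb` and the dictionary keys are hypothetical, chosen only for illustration.

```python
# Defect list format codes, assumed from the SCSI-2 READ DEFECT DATA(10)
# command description; the labels match the -f arguments used in this thread.
DEFECT_FORMATS = {
    0b000: "block",
    0b100: "bfi",   # bytes from index
    0b101: "phys",  # physical sector
}


def decode_read_defect_cdb(cdb_text):
    """Decode a READ DEFECT DATA(10) CDB printed as space-separated hex bytes.

    Hypothetical helper for illustration; not part of camcontrol.
    """
    cdb = [int(token, 16) for token in cdb_text.split()]
    if len(cdb) != 10 or cdb[0] != 0x37:
        raise ValueError("not a 10-byte READ DEFECT DATA(10) CDB")
    flags = cdb[2]
    fmt = flags & 0x07
    return {
        "plist": bool(flags & 0x10),  # primary (factory) defect list requested
        "glist": bool(flags & 0x08),  # grown defect list requested
        "format": DEFECT_FORMATS.get(fmt, "other (%d)" % fmt),
        "allocation_length": (cdb[7] << 8) | cdb[8],
    }


if __name__ == "__main__":
    # CDB taken verbatim from the verbose output quoted in this thread.
    print(decode_read_defect_cdb("37 0 1d 0 0 0 0 fd e8 0"))
```

Under those assumptions, the CDB from the thread decodes to both lists requested (matching `-PG`), physical sector format (matching `-f phys`), and an allocation length of 65000 bytes.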