Bug 165982 - [mpt] mpt instability, drive resets, and losses on FreeBSD 9-stable r232224
Status: Closed Overcome By Events
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 9.0-STABLE
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-bugs (Nobody)
Reported: 2012-03-12 17:10 UTC by jonathan
Modified: 2018-12-19 18:06 UTC (History)
Description jonathan 2012-03-12 17:10:10 UTC
I upgraded to 9-stable and, around the same time, started seeing drive failures.
Now, under heavy I/O to drives attached to the mpt controller, I get errors
such as:
(da3:mpt0:0:14:0): WRITE(10). CDB: 2a 0 42 0 45 f0 0 0 8 0
(da3:mpt0:0:14:0): CAM status: SCSI Status Error
(da3:mpt0:0:14:0): SCSI status: Check Condition
(da3:mpt0:0:14:0): SCSI sense: MEDIUM ERROR asc:14,1 (Record not found)
and
(da7:mpt0:0:13:0): SCSI status error
(da7:mpt0:0:13:0): WRITE(10). CDB: 2a 0 2a 0 a6 10 0 0 28 0
(da7:mpt0:0:13:0): CAM status: SCSI Status Error
(da7:mpt0:0:13:0): SCSI status: Check Condition
(da7:mpt0:0:13:0): SCSI sense: ABORTED COMMAND asc:0,0 (No additional sense information)
(da7:mpt0:0:13:0): Retrying command (per sense data)
and
(da7:mpt0:0:13:0): SCSI status error
(da7:mpt0:0:13:0): READ(10). CDB: 28 0 29 e d7 0 0 0 38 0
(da7:mpt0:0:13:0): CAM status: SCSI Status Error
(da7:mpt0:0:13:0): SCSI status: Check Condition
(da7:mpt0:0:13:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da7:mpt0:0:13:0): Retrying command (per sense data)
(da7:mpt0:0:13:0): CAM status 0x18
(da7:mpt0:0:13:0): Retrying command
(da7:mpt0:0:13:0): CAM status 0x18
and
(da7:mpt0:0:13:0): SCSI status error
(da7:mpt0:0:13:0): WRITE(10). CDB: 2a 0 5c 0 2c b0 0 0 8 0
(da7:mpt0:0:13:0): CAM status: SCSI Status Error
(da7:mpt0:0:13:0): SCSI status: Check Condition
(da7:mpt0:0:13:0): SCSI sense: ABORTED COMMAND asc:0,0 (No additional sense information)
(da7:mpt0:0:13:0): Error 5, Retries exhausted
mpt0: request 0xffffff8001a68060:30428 timed out for ccb 0xfffffe00072d6800 (req->ccb 0xfffffe00072d6800)
mpt0: request 0xffffff8001a73cd0:30429 timed out for ccb 0xfffffe0007453800 (req->ccb 0xfffffe0007453800)
mpt0: attempting to abort req 0xffffff8001a68060:30428 function 0
mpt0: request 0xffffff8001a682a0:30430 timed out for ccb 0xfffffe001202c800 (req->ccb 0xfffffe001202c800)
mpt0: request 0xffffff8001a745d0:30431 timed out for ccb 0xfffffe0007408000 (req->ccb 0xfffffe0007408000)
mpt0: request 0xffffff8001a72da0:30432 timed out for ccb 0xfffffe0006842800 (req->ccb 0xfffffe0006842800)
mpt0: request 0xffffff8001a73a00:30433 timed out for ccb 0xfffffe0007fcd000 (req->ccb 0xfffffe0007fcd000)
mpt0: request 0xffffff8001a66710:30434 timed out for ccb 0xfffffe000740a000 (req->ccb 0xfffffe000740a000)
mpt0: mpt_wait_req(1) timed out
mpt0: mpt_recover_commands: abort timed-out. Resetting controller
mpt0: mpt_cam_event: 0x80
mpt0: Unhandled Event Notify Frame. Event 0xffffff80 (ACK not required).
mpt0: completing timedout/aborted req 0xffffff8001a68060:30428
mpt0: completing timedout/aborted req 0xffffff8001a73cd0:30429
mpt0: completing timedout/aborted req 0xffffff8001a682a0:30430
mpt0: completing timedout/aborted req 0xffffff8001a745d0:30431
mpt0: completing timedout/aborted req 0xffffff8001a72da0:30432
mpt0: completing timedout/aborted req 0xffffff8001a73a00:30433
mpt0: completing timedout/aborted req 0xffffff8001a66710:30434
(da7:mpt0:0:13:0): Bus Reset issued
(da7:mpt0:0:13:0): Retrying command
and finally
(da7:mpt0:0:13:0): SCSI status error
(da7:mpt0:0:13:0): READ(10). CDB: 28 0 29 e d6 f8 0 0 8 0
(da7:mpt0:0:13:0): CAM status: SCSI Status Error
(da7:mpt0:0:13:0): SCSI status: Check Condition
(da7:mpt0:0:13:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
(da7:mpt0:0:13:0): Info: 0x290ed6f8
(da7:mpt0:0:13:0): Error 5, Unretryable error
mpt0: request 0xffffff8001a6bff0:30904 timed out for ccb 0xfffffe0007453800 (req->ccb 0xfffffe0007453800)
mpt0: attempting to abort req 0xffffff8001a6bff0:30904 function 0
mpt0: mpt_send_handshake_cmd: db ignored
mpt0: soft reset failed: device not running
mpt0: WARNING - Failed hard reset! Trying to initialize anyway.
mpt0: mpt_cam_event: 0xff
mpt0: Unhandled Event Notify Frame. Event 0xffffffff (ACK not required).
mpt0: completing timedout/aborted req 0xffffff8001a6bff0:30904
(da0:mpt0:0:1:0): Bus Reset issued
(da0:mpt0:0:1:0): Retrying command

Eventually the system reaches a state where no disk I/O happens at all and
multiple drives are lost, and I have to reset the machine.
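
For anyone reading the CDB lines in the logs above, here is a minimal sketch of how a READ(10)/WRITE(10) CDB breaks down into a starting LBA and a block count. It assumes the standard 10-byte layout (opcode, flags, 4-byte big-endian LBA, group, 2-byte big-endian transfer length, control). The function name decode_rw10 is made up for illustration and is not part of any FreeBSD tool.

```python
def decode_rw10(cdb_text):
    """Decode a READ(10)/WRITE(10) CDB printed as space-separated hex bytes.

    Hypothetical helper: returns (command name, starting LBA, block count).
    """
    cdb = [int(b, 16) for b in cdb_text.split()]
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    opcodes = {0x28: "READ(10)", 0x2A: "WRITE(10)"}
    if cdb[0] not in opcodes:
        raise ValueError("not a READ(10)/WRITE(10) opcode: 0x%02x" % cdb[0])
    lba = int.from_bytes(bytes(cdb[2:6]), "big")     # bytes 2-5: LBA
    blocks = int.from_bytes(bytes(cdb[7:9]), "big")  # bytes 7-8: transfer length
    return opcodes[cdb[0]], lba, blocks


# CDB taken from the final READ(10) failure on da7 in the log above.
name, lba, blocks = decode_rw10("28 0 29 e d6 f8 0 0 8 0")
print(name, hex(lba), blocks)
```

Decoded this way, the last READ(10) on da7 starts at LBA 0x290ed6f8, which is the same value the driver reports in the "Info:" line for the unrecovered read error.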

How-To-Repeat: Place a heavy I/O load on an mpt controller with SATA drives.
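
When reproducing this, it may help to see whether the errors cluster on one drive or spread across the controller. The following is a rough sketch that tallies "SCSI sense" lines per device from captured console or dmesg output. The regular expression is an assumption based only on the line format shown in this report, and tally_sense is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches lines such as:
# (da7:mpt0:0:13:0): SCSI sense: ABORTED COMMAND asc:0,0 (No additional ...)
# Pattern is an assumption derived from the log excerpts in this report.
SENSE_RE = re.compile(
    r"^\((\w+):mpt\d+:\d+:\d+:\d+\): SCSI sense: ([A-Z ]+?) asc:"
)


def tally_sense(lines):
    """Count (device, sense key) pairs found in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        m = SENSE_RE.match(line.strip())
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts


# A few lines copied from the description above.
SAMPLE = [
    "(da3:mpt0:0:14:0): SCSI sense: MEDIUM ERROR asc:14,1 (Record not found)",
    "(da7:mpt0:0:13:0): SCSI sense: ABORTED COMMAND asc:0,0 (No additional sense information)",
    "(da7:mpt0:0:13:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)",
    "(da7:mpt0:0:13:0): SCSI sense: ABORTED COMMAND asc:0,0 (No additional sense information)",
    "(da7:mpt0:0:13:0): Retrying command (per sense data)",
]

for (dev, key), n in sorted(tally_sense(SAMPLE).items()):
    print(dev, key, n)
```

MEDIUM ERROR entries concentrated on a single device would point toward that drive, while ABORTED COMMAND and UNIT ATTENTION entries across several devices would be more consistent with resets at the controller or bus level. This is only a reading aid, not a diagnosis.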
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2012-03-14 17:21:39 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-scsi

Over to maintainer(s).
Comment 2 Eitan Adler freebsd_committer freebsd_triage 2017-12-31 07:59:43 UTC
For bugs matching the following criteria:

Status: In Progress Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped