Bug 72451 - Continuing problems with Silicon Image SATA controllers
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 5.3-BETA5
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: Søren Schmidt
Reported: 2004-10-08 18:00 UTC by Mikhail T.
Modified: 2005-04-11 19:40 UTC

Description Mikhail T. 2004-10-08 18:00:51 UTC
	Under _combined_ disk and CPU load, the following errors start
	popping up:

ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=53404031
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=54910687
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=56806527
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=61715903
ad6: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=62103999
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=176444927
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=311594591
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=196040671
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=306623743

	After a while, all disk IO starts hanging and even a graceful
	reboot becomes impossible -- the machine hangs after saying:
	"some processes would not die..."

	We replaced the disk and the cables twice already.

	Under just the disk load, the problem does not appear -- the
	box survives a full run of `iozone -a' without a hitch, for
	example.

	But when we, for example, dump databases on it (over NFS) and,
	at the same time, gzip the dump for archiving, we see this.

	Or, when a big file is being uploaded with scp over a fast link
	with ssh compression. So it looks like something inside the
	ata driver is not attended to fast enough...

How-To-Repeat: 
	Run `iozone -a' on a disk, while gzip-ing a big file off of
	the same drive.
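
A minimal sketch of the reproduction recipe above, for anyone who wants to script it. It only starts a disk-bound and a CPU-bound command at the same time and collects their exit codes. The helper names, the mount point and the file name are assumptions made for illustration; only `iozone -a' and gzip come from the report.

```python
import subprocess


def repro_commands(mount_point, big_file):
    """Build the two argv lists: a disk load and a CPU load on the same drive.

    mount_point and big_file are placeholders for a filesystem on the
    suspect disk and a large file already stored on it.
    """
    disk_load = ["iozone", "-a", "-f", f"{mount_point}/iozone.tmp"]
    cpu_load = ["gzip", "-9", "-c", big_file]  # compress to stdout, discarded below
    return [disk_load, cpu_load]


def run_combined_load(commands):
    """Start every command at once, wait for all of them, return exit codes."""
    procs = [subprocess.Popen(cmd, stdout=subprocess.DEVNULL) for cmd in commands]
    return [p.wait() for p in procs]
```

Calling run_combined_load(repro_commands("/mnt/ad6", "/mnt/ad6/big.dump")) would run both loads concurrently; the paths here are hypothetical.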
Comment 1 Simon L. B. Nielsen freebsd_committer freebsd_triage 2004-10-08 18:24:57 UTC
Responsible Changed
From-To: freebsd-bugs->sos

Over to sos for evaluation.
Comment 2 Søren Schmidt freebsd_committer freebsd_triage 2004-10-11 12:50:31 UTC
Responsible Changed
From-To: sos->feedback

There have been quite a few changes since beta5; please update to the
latest releng5 or at least beta7 (-current would be even better) and
get back to me with what that brings.
Comment 3 Søren Schmidt freebsd_committer freebsd_triage 2004-10-11 12:53:58 UTC
State Changed
From-To: open->feedback

Hmm seems vi was off by 1 line :) 


Comment 4 Søren Schmidt freebsd_committer freebsd_triage 2004-10-11 12:53:58 UTC
Responsible Changed
From-To: feedback->sos
Comment 5 Mikhail Teterin 2004-10-11 20:27:33 UTC
Well, so far, I can't even rebuild the world -- on a freshly rebooted
system :-( According to top, two C-compilers and the cap_mk are stuck
in the `nbufkv' state and the CPU is 100% idle.

I'll reboot again and try with NFS-mounted /usr/obj ...

 -mi
Comment 6 Mikhail Teterin 2004-10-11 23:44:13 UTC
I used the NFS-mounted /usr/obj and noticed the same WRITE_DMA errors
on the NFS-server itself.

[...]
Oct 11 17:48:16 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=64822936
Oct 11 17:49:58 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=65516532
Oct 11 17:54:53 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=65524832
Oct 11 17:59:19 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=65012992
Oct 11 18:03:18 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=67395724
Oct 11 18:03:25 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66939680
Oct 11 18:03:58 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=67406280
Oct 11 18:05:45 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=67715240
Oct 11 18:33:06 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66682804
Oct 11 18:33:52 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=67056960
Oct 11 18:37:13 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66563776
Oct 11 18:38:02 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66713420
Oct 11 18:38:11 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66717516
Oct 11 18:38:18 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66563904
Oct 11 18:38:30 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66717824
Oct 11 18:38:41 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66678432
Oct 11 18:38:56 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66563936
Oct 11 18:39:06 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66563936
Oct 11 18:39:19 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66716824
Oct 11 18:39:25 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66716656
Oct 11 18:39:33 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66563968
Oct 11 18:39:40 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66718804
Oct 11 18:39:46 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66722976
Oct 11 18:39:55 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66563968
Oct 11 18:40:11 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66725152
Oct 11 18:40:18 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66564000
Oct 11 18:40:30 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66721364
[...]

The server is running:

FreeBSD mi 6.0-CURRENT FreeBSD 6.0-CURRENT #1: Tue Oct  5 17:56:47 EDT 2004 root@mi:/var/obj/usr/src/sys/Gigabyte i386

with the following versions of ATA-files in the kernel:

     $FreeBSD: src/sys/dev/ata/ata-all.c,v 1.228 2004/09/26 11:48:43 sos Exp $
     $FreeBSD: src/sys/dev/ata/ata-queue.c,v 1.35 2004/09/26 11:48:43 sos Exp $
     $FreeBSD: src/sys/dev/ata/ata-lowlevel.c,v 1.48 2004/09/26 11:48:43 sos Exp $
     $FreeBSD: src/sys/dev/ata/ata-isa.c,v 1.22 2004/04/30 16:21:34 sos Exp $
     $FreeBSD: src/sys/dev/ata/ata-pci.c,v 1.89 2004/09/26 11:42:42 sos Exp $
     $FreeBSD: src/sys/dev/ata/ata-chipset.c,v 1.90 2004/10/01 09:06:22 sos Exp $
     $FreeBSD: src/sys/dev/ata/ata-dma.c,v 1.131 2004/09/10 10:31:37 sos Exp $
     $FreeBSD: src/sys/dev/ata/ata-disk.c,v 1.179 2004/09/30 20:54:59 sos Exp $
     $FreeBSD: src/sys/dev/ata/atapi-cd.c,v 1.171 2004/08/24 10:39:00 sos Exp $
     $FreeBSD: src/sys/dev/ata/atapi-fd.c,v 1.97 2004/08/05 21:11:33 sos Exp $

and the Silicon Image-3112A on-board controller with the new "Raptor"
10K RPM drive:

atapci1: <SiI 3112 SATA150 controller> port 0xc400-0xc40f,0xc000-0xc003,0xbc00-0xbc07,0xb800-0xb803,0xb400-0xb407 mem 0xf7025000-0xf70251ff irq 17 at device 16.0 on pci0
ad4: 35304MB <WDC WD360GD-00FNA0/35.06K35> [71730/16/63] at ata2-master SATA150

So, the problem was still here in Oct 5 -current and is not limited
to amd64 :-(

The NFS-server is not hanging, fortunately, just stumbles for a
while and recovers. The hangs originally reported are, probably,
due to a different issue (see nbufkv-thread on -current), but the
SATA errors are still a problem.

Yours,

 -mi
Comment 7 Mikhail T. 2004-10-12 04:36:30 UTC
Ok, the machine rebooted with today's world and kernel and -- under the
same kind of load -- shows the same behaviour: lots of messages like:

'ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=...'

the reported LBA is always different, and the errors are from 2 seconds
to a few minutes apart from each other.

 -mi
Comment 8 Mikhail Teterin 2004-10-13 23:00:33 UTC
Just noticed fresh changes in dev/ata and rebuilt the kernel.

Under heavy load machine started reporting A LOT of WRITE_DMA errors
and hung after a few minutes:

[...]
Oct 13 17:22:55 pandora kernel: ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=123751359
Oct 13 17:22:56 pandora kernel: ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=34785759
Oct 13 17:22:59 pandora kernel: ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=130147711
Oct 13 17:23:03 pandora kernel: ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=109841727
Oct 13 17:23:03 pandora kernel: ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=110594751
Oct 13 17:23:07 pandora kernel: ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=137684767
Oct 13 17:23:10 pandora kernel: ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=80481919
[...]

At the time of hanging I had the `systat 1 -vm' running -- the screen
froze with 10MB/s going to ad6...

Maybe it is possible to force the disk into a slower mode -- like SATA66 or
something? The most we ever saw sustained on it was 44MB/s read and 21MB/s
write anyway... Thanks,

	-mi
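
To put numbers on how far apart the errors are and whether the LBAs repeat, syslog lines like the ones quoted in this thread can be parsed mechanically. The following is only a sketch: it assumes each kernel message sits on a single line, that syslog timestamps carry no year (2004 is supplied as a default), and the function names are made up for illustration.

```python
import re
from datetime import datetime

# Matches e.g. "Oct 13 17:22:55 pandora kernel: ad6: FAILURE - WRITE_DMA ... LBA=123"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+"
    r"(?P<dev>ad\d+):\s+(?P<kind>FAILURE|TIMEOUT)\s+-\s+\w+"
    r".*?LBA=(?P<lba>\d+)"
)


def parse_events(lines, year=2004):
    """Return (timestamp, device, kind, lba) for every matching log line."""
    events = []
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # not an ATA FAILURE/TIMEOUT message
        when = datetime.strptime(f"{year} {m['stamp']}", "%Y %b %d %H:%M:%S")
        events.append((when, m["dev"], m["kind"], int(m["lba"])))
    return events


def gaps_seconds(events):
    """Seconds between consecutive events, in chronological order."""
    times = sorted(e[0] for e in events)
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]
```

Feeding it the contents of /var/log/messages and comparing len(events) with the number of distinct LBAs shows whether the same sectors keep failing or the errors are spread across the disk.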
Comment 9 Mikhail Teterin 2004-10-19 22:15:19 UTC
I rebuilt the kernel from today's -current sources and this time included
all the WITNESS and INVARIANT options. The machine duly hung as before,
but there were no messages from WITNESS nor INVARIANT -- only the WRITE_DMA
warnings.

The `systat 1 -vm' froze at:

    3 users    Load  1.08  0.83  0.66                  19 Oct 16:25

Mem:KB    REAL            VIRTUAL                     VN PAGER  SWAP PAGER
        Tot   Share      Tot    Share    Free         in  out     in  out
Act   27256    5308    65880    13608  405040 count
All 1700040    7616  2959492    17720         pages
                                                          zfod   Interrupts
Proc:r  p  d  s  w    Csw  Trp  Sys  Int  Sof  Flt        cow    5310 total
           5 38     11789    6  168 8051    4      248200 wire        irq1: atkb
                                                    28512 act    1028 irq0: clk
12.5%Sys   5.5%Intr  0.0%User  0.0%Nice 82.0%Idl  1336996 inact       irq6: fdc0
|    |    |    |    |    |    |    |    |    |      89408 cache   128 irq8: rtc
======+++                                          315632 free    722 irq9: acpi
                                                          daefr       irq14: ata
Namei         Name-cache    Dir-cache                     prcfr       irq15: ata
    Calls     hits    %     hits    %                     react  1352 irq16: ahc
                                                          pdwak   722 irq17: pcm
                                                          pdpgs       irq19: fwo
Disks  afd0   ad6 amrd0   sa0 pass0                       intrn  1358 irq24: bge
KB/t   0.00 16.91  0.00  0.00  0.00                218880 buf         irq26: amr
tps       0   720     0     0     0                  1365 dirtybuf
MB/s   0.00 11.88  0.00  0.00  0.00                100000 desiredvnodes
% busy    0   100     0     0     0                   849 numvnodes
Showing vmstat, refresh every 1 seconds.              514

The `systat 1 -if' froze as:

                    /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
     Load Average   |||||   

      Interface           Traffic               Peak                Total


            lo0  in      0.000 KB/s          3.197 KB/s          213.939 KB
                 out     0.000 KB/s          3.197 KB/s          213.939 KB

           bge0  in      4.938 MB/s         11.728 MB/s           10.906 GB
                 out   185.929 KB/s        450.869 KB/s          415.249 MB


It would seem to me, this is a show-stopper for 5.3 release -- amd64 is a
Tier1 platform, yet it can't write to the disk for long on popular hardware.

Yours,

 -mi
Comment 10 Mikhail Teterin 2004-10-19 23:06:48 UTC
=How much RAM is in this system?

2Gb. Single Opteron in a dual-capable Tyan K8W motherboard.

 -mi
Comment 11 Scott Long freebsd_committer freebsd_triage 2004-10-19 23:20:09 UTC
Mikhail Teterin wrote:
> =How much RAM is in this system?
> 
> 2Gb. Single Opteron in a dual-capable Tyan K8W motherboard.
> 
>  -mi

Ok, I can't think of any obvious causes.  The 5.3 BETAs have been built
on a dual Opteron with 2GB and SATA disks without any problems.

Scott
Comment 12 Mikhail Teterin 2004-10-19 23:37:54 UTC
=Mikhail Teterin wrote:
=> =How much RAM is in this system?

=> 2Gb. Single Opteron in a dual-capable Tyan K8W motherboard.

=Ok, I can't think of any obvious causes.  The 5.3 BETAs have been built
=on a dual Opteron with 2GB and SATA disks without any problems.

Were these disks attached to a Silicon Image controller? Also, my problem
seems to be somehow related to network traffic. The way we cause the
system to hang is by telling a remote Sybase server to dump its databases
onto it (over NFS) one at a time. It does not have to be an NFS server -- when I
tried to restore(8) a filesystem on this SATA drive from the dump stored on
another machine, the machine hung the same way.

I just rebuilt the kernel again with NET_WITH_GIANT (to set mpsafenet
to 0) and it looks much better -- about an hour into the process that used
to hang it within 25 minutes or so. If it survives the whole dump by
tomorrow, then -- I guess -- there is some interaction between the network
and ata. I'm convinced ata is somehow involved, because when we write onto
our raid array (amrd0) on the same machine, there are no hangs.

Could it be that ata is generally safe, but the recovery from 
WRITE_DMA/READ_DMA failures is not?

 -mi
Comment 13 Mikhail Teterin 2004-10-22 19:05:27 UTC
After putting my new workstation into heavier use, I'm seeing these
same problems on i386. This machine (current from Oct 20) has two
Raptor disks connected to the two on-board connectors of Silicon Image
3112A controller.

There are occasional WRITE_DMA errors from these drives. Once in a
while, the machine locks solid after one of such errors.

A good way of reproducing seems to be to run cvsup mirroring the
entire CVS-repository to one of the local drives, while cvs is
extracting the src tree from the local repository onto another drive.
This may succeed, but may also lead to either a hang or a ufs-panic.

I plan to change the status of this PR back to 'open' -- if you need
more 'feedback', just ask. Thanks!

	-mi
Comment 14 sos 2004-10-22 19:07:53 UTC
Mikhail Teterin wrote:
> After putting my new workstation into heavier use, I'm seeing these
> same problems on i386. This machine (current from Oct 20) has two
> Raptor disks connected to the two on-board connectors of Silicon Image
> 3112A controller.
>
> There are occasional WRITE_DMA errors from these drives. Once in a
> while, the machine locks solid after one of such errors.
>
> A good way of reproducing seems to be to run cvsup mirroring the
> entire CVS-repository to one of the local drives, while cvs is
> extracting the src tree from the local repository onto another drive.
> This may succeed, but may also lead to either a hang or a ufs-panic.
>
> I plan to change the status of this PR back to 'open' -- if you need
> more 'feedback', just ask. Thanks!

Well, I can beat on mine (sii3112 and 70G raptors) for days without a
hiccup. What's the motherboard it's sitting in?


--

-Søren
Comment 15 Mikhail Teterin 2004-10-25 17:05:49 UTC
=Well, I can beat on mine (sii3112 and 70G raptors) for days without a hiccup.
=What's the motherboard it's sitting in?

I see this in two computers now:

 i386: Pentium4 @3.06GHz in Gigabyte's GA-SINXP1394, SiI-3112A controller(s),
 amd64: single Opteron @1.8GHz in a dual-capable Tyan K8W, SiI-3114

Directing a Sybase server (on Solaris) to dump its databases onto FreeBSD 
disks will wedge the systems reliably, although not immediately. Full dump of 
all databases here takes about 15 hours (we estimate). The amd64 machine 
running 5.3-stable hangs about 6 hours into it. (Under recent -current it 
tends to panic much earlier.)

The FreeBSD machines both have gigabit cards (bge on opteron, em on i386); the
NFS client -- Sybase -- has only a 100Mb card, but manages to sustain above
11MB/s writes anyway. After some time the FreeBSD disks begin to glitch. At
a certain point, the machines hang :-(

Some hangs seem to have occurred during rwhod's updates (in parallel to the 
Sybase writes), for example -- /var is a different partition from the one 
being pounded, but resides on the same disk.

Maybe the difference is in the source of the "pounding" -- network (rwhod, cvsup
or NFS), rather than local writing/reading?

 -mi
Comment 16 Mikhail Teterin freebsd_committer freebsd_triage 2004-10-26 18:13:13 UTC
State Changed
From-To: feedback->open

Fulfill the earlier threat :-)
Comment 17 Mikhail Teterin 2004-10-26 18:38:25 UTC
Another day, another hang. Right now, the machine is up, but the disk seems
dead -- the running processes run (systat keeps me updated once per second),
but the paged-out ones can't be paged back in (vm_pager complaining about the
disk), and no new processes can start. The machine has been in this state for
about 11 hours now -- after 6 or 7 hours of hard work and a great many
WRITE_DMA failures. Can the driver, perhaps, slow the channel speed gradually
on errors -- from SATA150 down to 120, say -- either automatically, or
through a sysctl knob? Thanks!

Someone in the Linux world has similar troubles too:

 http://www.ussg.iu.edu/hypermail/linux/kernel/0407.2/0127.html

SiI's own IDE-driver for Linux can be obtained here:

 http://12.24.47.40/display/2n/kb/article.asp?aid=10485&s=1

Perhaps, someone with knowledge of ATA (hint-hint), can look at the two files 
(siimage.c, and siimage.h) to see immediately, what kind of a work-around is 
needed for SiI to work reliably? Thanks!

 -mi
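
The "slow the channel down gradually on errors" idea from this comment could look roughly like the sketch below. It is a toy policy only, not how the FreeBSD ata driver behaves: the class name, the threshold, and the list of mode labels passed in are all assumptions, and whether a given controller can actually be switched between such modes is a separate question.

```python
class SpeedLadder:
    """Step down through an ordered list of transfer modes as errors pile up."""

    def __init__(self, modes, errors_per_step=3):
        if not modes:
            raise ValueError("need at least one mode")
        self.modes = list(modes)            # fastest first, slowest last
        self.errors_per_step = errors_per_step
        self.index = 0                      # position of the current mode
        self.errors = 0                     # errors seen since the last step

    @property
    def mode(self):
        return self.modes[self.index]

    def record_error(self):
        """Count one I/O error; drop one mode once the threshold is reached."""
        self.errors += 1
        at_slowest = self.index == len(self.modes) - 1
        if self.errors >= self.errors_per_step and not at_slowest:
            self.index += 1
            self.errors = 0
        return self.mode
```

For example, SpeedLadder(["SATA150", "UDMA100", "UDMA66"]) would stay at the first label for two errors and move to the second on the third; the labels are purely illustrative.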
Comment 18 neil.hoggarth 2004-11-10 15:21:18 UTC
The Linux folk seem to have come to the conclusion that SiL controllers
issue unusual (but SATA spec compliant) writes to the drive, if the host
sends more than 15 LBAs of data to the controller in a single DMA.

These unusual SATA transfers expose problems with the firmware in some
drives, including a number of Seagate models.

The Linux sata_sil driver now has a quirks mode, which prevents writes
larger than 15 blocks if the attached drive is on a blacklist. This
workaround apparently gives stability at the cost of performance. The
only other suggested fix seems to be "don't use a blacklisted drive and
an SiL controller". :-(

Here is one thread about this in the linux-kernel mailing list archives:

  http://lkml.org/lkml/2004/8/12/8

Regards,
-- 
Neil Hoggarth                                Departmental Computing Manager
<neil.hoggarth@physiol.ox.ac.uk>                   Laboratory of Physiology
http://www.physiol.ox.ac.uk/~njh/                  University of Oxford, UK
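
As a rough illustration of the quirk described in this comment, here is a sketch that caps each write at 15 sectors when the attached drive is on a blacklist. It follows the description above rather than the actual sata_sil source; the function name is made up and the model strings are placeholders, not the real Linux blacklist.

```python
MAX_QUIRK_SECTORS = 15  # per-request cap taken from the description above

# Placeholder model strings for illustration only; NOT the real blacklist.
EXAMPLE_BLACKLIST = frozenset({"EXAMPLE-MODEL-A", "EXAMPLE-MODEL-B"})


def split_write(lba, count, model, blacklist=EXAMPLE_BLACKLIST):
    """Return the (start_lba, sector_count) requests to issue for one write.

    Drives not on the blacklist get the request unchanged; listed drives
    get it split into pieces of at most MAX_QUIRK_SECTORS sectors.
    """
    if count <= 0:
        raise ValueError("sector count must be positive")
    if model not in blacklist:
        return [(lba, count)]
    chunks = []
    while count > 0:
        n = min(count, MAX_QUIRK_SECTORS)
        chunks.append((lba, n))
        lba += n
        count -= n
    return chunks
```

A 32-sector write to a listed drive becomes three requests (15, 15 and 2 sectors), which is where the performance cost mentioned above comes from.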
Comment 19 Mikhail.Teterin 2004-11-12 00:00:21 UTC
I'm seeing these problems with different drives on different machines (i386, 
amd64). I suspected, the problem may have something to do with the disk 
overheating, so we tried to shut the machine down for a day.

Upon fresh start, the log is already full of errors -- several per minute.

I tried using ataidle (from /usr/ports/sysutils/ataidle) to alter the drive's 
power consumption and/or acoustic level, but all attempts to do so got

ad6: FAILURE - SETFEATURES 0x05 status=51<READY,DSC,ERROR> error=4<ABORTED>

ad6: FAILURE - SETFEATURES 0x42 status=51<READY,DSC,ERROR> error=4<ABORTED>

for all possible power and acoustic levels. Are these features supposed to 
work for SATA drives?

The worst part, however, is what happens when the driver gives up (FAILURE):
programs may die (segfaulting vm_pager) or the OS can panic in filesystem
code.

Maybe the driver should try to _really_ restart the drive -- by forcing it
to spin down, wait, and spin up again? If the NFS client code is willing to wait
forever for the remote server to come back up, why can't ata?

Yours,

 -mi
Comment 20 Søren Schmidt freebsd_committer freebsd_triage 2005-04-11 12:18:15 UTC
State Changed
From-To: open->closed

There is no point in spinning the drive down and then up; it doesn't accomplish
anything but drive bearing wear. The drive electronics need to be reset, and
that has been done in ATA for ages.
You should try the latest -current and check what that brings. I'm using
a sii3112 and problematic drives in a server here just to have real-life
testing, and that works, modulo the eventual timeouts that are expected.
Comment 21 Mikhail Teterin 2005-04-11 19:30:44 UTC
> modulo the eventual timeouts that are expected.

I am afraid, you misunderstood my suggestion. I think, the driver should not 
give up in case of a timeout. Some places in the kernel do not expect such a 
failure -- pager may panic if it can not write a page to swap, file systems 
may become corrupted if they can not flush cached data.

If NFS client can (and by default -- will) wait forever for the server to come 
back, the ATA code should wait forever for the writing to succeed.

It is difficult for me to try the new code, because the original motherboard
where we saw the problem has long been replaced, and the other system no
longer exhibits the problem since I tinkered with PCI timings a little bit.

	-mi
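
The difference between the two policies argued about here, giving up after a fixed number of retries versus waiting indefinitely like a hard NFS mount, can be sketched as follows. This is an illustration of the idea only: DiskTimeout, write_with_retry and the parameters are hypothetical names, and nothing here reflects the actual ATA driver code.

```python
import time


class DiskTimeout(Exception):
    """Stand-in for a timed-out or aborted write (hypothetical)."""


def write_with_retry(do_write, retries=2, hard=False, delay=0.0, sleep=time.sleep):
    """Call do_write() until it succeeds.

    Soft mode (hard=False) re-raises after `retries` extra attempts, like
    the "2 retries left" messages in the logs. Hard mode never gives up,
    mirroring an NFS hard mount.
    """
    failures = 0
    while True:
        try:
            return do_write()
        except DiskTimeout:
            failures += 1
            if not hard and failures > retries:
                raise  # caller now has to cope with a failed write
            sleep(delay)  # back off before the next attempt
```

With hard=True the caller blocks until the write goes through, which avoids the pager and filesystem failure paths but can hang the caller forever if the disk never recovers.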