Under _combined_ disk and CPU load, the following errors start popping up:

ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=53404031
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=54910687
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=56806527
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=61715903
ad6: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=62103999
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=176444927
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=311594591
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=196040671
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=306623743

After a while, all disk I/O starts hanging and even a graceful reboot becomes impossible -- the machine hangs after saying: "some processes would not die..."

We have already replaced the disk and the cables twice.

Under disk load alone, the problem does not appear -- the box survives a full run of `iozone -a' without a hitch, for example. But when we, for example, dump databases onto it (over NFS) and, at the same time, gzip the dump for archiving, we see this. The same happens when a big file is being uploaded with scp over a fast link with ssh compression.

So it looks like something inside the ata driver is not attended to fast enough...

How-To-Repeat:
Run `iozone -a' on a disk, while gzip-ing a big file off of the same drive.
Responsible Changed From-To: freebsd-bugs->sos Over to sos for evaluation.
Responsible Changed From-To: sos->feedback There have been quite a few changes since beta5; please update to the latest releng5 or at least beta7 (-current would be even better) and get back to me with what that brings.
State Changed From-To: open->feedback Hmm seems vi was off by 1 line :)
Responsible Changed From-To: feedback->sos
Well, so far, I can't even rebuild the world -- on a freshly rebooted system :-( According to top, two C-compilers and the cap_mk are stuck in the `nbufkv' state and the CPU is 100% idle. I'll reboot again and try with NFS-mounted /usr/obj ... -mi
I used the NFS-mounted /usr/obj and noticed the same WRITE_DMA errors on the NFS-server itself.

[...]
Oct 11 17:48:16 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=64822936
Oct 11 17:49:58 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=65516532
Oct 11 17:54:53 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=65524832
Oct 11 17:59:19 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=65012992
Oct 11 18:03:18 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=67395724
Oct 11 18:03:25 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66939680
Oct 11 18:03:58 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=67406280
Oct 11 18:05:45 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=67715240
Oct 11 18:33:06 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66682804
Oct 11 18:33:52 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=67056960
Oct 11 18:37:13 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66563776
Oct 11 18:38:02 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66713420
Oct 11 18:38:11 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66717516
Oct 11 18:38:18 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66563904
Oct 11 18:38:30 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66717824
Oct 11 18:38:41 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66678432
Oct 11 18:38:56 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66563936
Oct 11 18:39:06 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66563936
Oct 11 18:39:19 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66716824
Oct 11 18:39:25 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66716656
Oct 11 18:39:33 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66563968
Oct 11 18:39:40 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66718804
Oct 11 18:39:46 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66722976
Oct 11 18:39:55 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66563968
Oct 11 18:40:11 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66725152
Oct 11 18:40:18 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66564000
Oct 11 18:40:30 mi kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=66721364
[...]

The server is running:

FreeBSD mi 6.0-CURRENT FreeBSD 6.0-CURRENT #1: Tue Oct 5 17:56:47 EDT 2004 root@mi:/var/obj/usr/src/sys/Gigabyte i386

with the following versions of ATA-files in the kernel:

$FreeBSD: src/sys/dev/ata/ata-all.c,v 1.228 2004/09/26 11:48:43 sos Exp $
$FreeBSD: src/sys/dev/ata/ata-queue.c,v 1.35 2004/09/26 11:48:43 sos Exp $
$FreeBSD: src/sys/dev/ata/ata-lowlevel.c,v 1.48 2004/09/26 11:48:43 sos Exp $
$FreeBSD: src/sys/dev/ata/ata-isa.c,v 1.22 2004/04/30 16:21:34 sos Exp $
$FreeBSD: src/sys/dev/ata/ata-pci.c,v 1.89 2004/09/26 11:42:42 sos Exp $
$FreeBSD: src/sys/dev/ata/ata-chipset.c,v 1.90 2004/10/01 09:06:22 sos Exp $
$FreeBSD: src/sys/dev/ata/ata-dma.c,v 1.131 2004/09/10 10:31:37 sos Exp $
$FreeBSD: src/sys/dev/ata/ata-disk.c,v 1.179 2004/09/30 20:54:59 sos Exp $
$FreeBSD: src/sys/dev/ata/atapi-cd.c,v 1.171 2004/08/24 10:39:00 sos Exp $
$FreeBSD: src/sys/dev/ata/atapi-fd.c,v 1.97 2004/08/05 21:11:33 sos Exp $

and the Silicon Image-3112A on-board controller with the new "Raptor" 10K RPM drive:

atapci1: <SiI 3112 SATA150 controller> port 0xc400-0xc40f,0xc000-0xc003,0xbc00-0xbc07,0xb800-0xb803,0xb400-0xb407 mem 0xf7025000-0xf70251ff irq 17 at device 16.0 on pci0
ad4: 35304MB <WDC WD360GD-00FNA0/35.06K35> [71730/16/63] at ata2-master SATA150

So, the problem was still here in the Oct 5 -current and is not limited to amd64 :-(

The NFS-server is not hanging, fortunately; it just stumbles for a while and recovers. The hangs originally reported are probably due to a different issue (see the nbufkv thread on -current), but the SATA errors are still a problem.

Yours,

-mi
Ok, the machine rebooted with today's world and kernel and -- under the same kind of load -- shows the same behaviour: lots of messages like:

'ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=...'

The reported LBA is always different, and the errors are anywhere from 2 seconds to a few minutes apart.

-mi
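The spacing between errors described above can be measured from a saved kernel log. The following is a minimal Python sketch, not part of the original report. It assumes syslog-style lines of the form quoted elsewhere in this PR (timestamp, host, `kernel:`, device, message); the function names and the default year are illustrative assumptions.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the log excerpts quoted in this PR:
#   "Oct 13 17:22:55 pandora kernel: ad6: FAILURE - WRITE_DMA ... LBA=123751359"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+"
    r"(?P<dev>ad\d+):\s+(?P<kind>FAILURE|TIMEOUT)\s+-\s+(?P<cmd>\w+)"
    r".*?LBA=(?P<lba>\d+)"
)


def parse_ata_errors(lines, year=2004):
    """Return (timestamp, device, kind, command, lba) for each matching line.

    Syslog timestamps carry no year, so one has to be supplied.
    """
    events = []
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # not an ATA FAILURE/TIMEOUT line
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        events.append((ts, m["dev"], m["kind"], m["cmd"], int(m["lba"])))
    return events


def gaps_in_seconds(events):
    """Seconds elapsed between each pair of consecutive events."""
    times = [e[0] for e in events]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


# Three lines copied from the Oct 13 excerpt in this PR.
SAMPLE = [
    "Oct 13 17:22:55 pandora kernel: ad6: FAILURE - WRITE_DMA "
    "status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=123751359",
    "Oct 13 17:22:56 pandora kernel: ad6: FAILURE - WRITE_DMA "
    "status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=34785759",
    "Oct 13 17:22:59 pandora kernel: ad6: FAILURE - WRITE_DMA "
    "status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=130147711",
]
```

Feeding `/var/log/messages` through `parse_ata_errors` and then `gaps_in_seconds` would show whether the errors cluster under load or are spread evenly, which is the kind of detail the thread keeps asking about.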
Just noticed fresh changes in dev/ata and rebuilt the kernel. Under heavy load the machine started reporting A LOT of WRITE_DMA errors and hung after a few minutes:

[...]
Oct 13 17:22:55 pandora kernel: ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=123751359
Oct 13 17:22:56 pandora kernel: ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=34785759
Oct 13 17:22:59 pandora kernel: ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=130147711
Oct 13 17:23:03 pandora kernel: ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=109841727
Oct 13 17:23:03 pandora kernel: ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=110594751
Oct 13 17:23:07 pandora kernel: ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=137684767
Oct 13 17:23:10 pandora kernel: ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=80481919
[...]

At the time of the hang I had `systat 1 -vm' running -- the screen froze with 10Mb/s going to ad6...

Maybe it is possible to force the disk into a slower mode -- like SATA66 or something? The most we ever saw sustained on it was 44Mb/s read and 21Mb/s write anyway...

Thanks,

-mi
I rebuilt the kernel from today's -current sources and this time included all the WITNESS and INVARIANTS options. The machine duly hung as before, but there were no messages from either WITNESS or INVARIANTS -- only the WRITE_DMA warnings.

The `systat 1 -vm' froze at:

    3 users    Load  1.08  0.83  0.66                  19 Oct 16:25

Mem:KB    REAL            VIRTUAL                     VN PAGER  SWAP PAGER
        Tot   Share      Tot    Share    Free         in  out     in  out
Act   27256    5308    65880    13608  405040         count
All 1700040    7616  2959492    17720                 pages
                                                      zfod        Interrupts
Proc:r  p  d  s  w    Csw  Trp  Sys  Int  Sof  Flt    cow         5310 total
     5 38           11789    6  168 8051    4  248200 wire             irq1: atkb
                                                28512 act         1028 irq0: clk
12.5%Sys   5.5%Intr  0.0%User  0.0%Nice 82.0%Idl 1336996 inact         irq6: fdc0
|    |    |    |    |    |    |    |    |    |  89408 cache        128 irq8: rtc
======+++                                      315632 free         722 irq9: acpi
                                                      daefr            irq14: ata
Namei         Name-cache    Dir-cache                 prcfr            irq15: ata
    Calls     hits    %     hits    %                 react       1352 irq16: ahc
                                                      pdwak        722 irq17: pcm
                                                      pdpgs            irq19: fwo
Disks  afd0   ad6 amrd0   sa0 pass0                   intrn       1358 irq24: bge
KB/t   0.00 16.91  0.00  0.00  0.00            218880 buf              irq26: amr
tps       0   720     0     0     0              1365 dirtybuf
MB/s   0.00 11.88  0.00  0.00  0.00            100000 desiredvnodes
% busy    0   100     0     0     0               849 numvnodes
Showing vmstat, refresh every 1 seconds.          514

The `systat 1 -if' froze as:

                    /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
     Load Average   |||||

      Interface           Traffic               Peak                Total
            lo0  in      0.000 KB/s          3.197 KB/s          213.939 KB
                 out     0.000 KB/s          3.197 KB/s          213.939 KB

           bge0  in      4.938 MB/s         11.728 MB/s           10.906 GB
                 out   185.929 KB/s        450.869 KB/s          415.249 MB

It would seem to me that this is a show-stopper for the 5.3 release -- amd64 is a Tier1 platform, yet it can't write to the disk for long on popular hardware.

Yours,

-mi
=How much RAM is in this system?

2Gb. Single Opteron in a dual-capable Tyan K8W motherboard.

-mi
Mikhail Teterin wrote:
> =How much RAM is in this system?
>
> 2Gb. Single Opteron in a dual-capable Tyan K8W motherboard.
>
> -mi

Ok, I can't think of any obvious causes. The 5.3 BETAs have been built on a dual Opteron with 2GB and SATA disks without any problems.

Scott
=Mikhail Teterin wrote:
=> =How much RAM is in this system?
=> 2Gb. Single Opteron in a dual-capable Tyan K8W motherboard.

=Ok, I can't think of any obvious causes. The 5.3 BETAs have been built
=on a dual Opteron with 2GB and a SATA disks without any problems.

Were these disks attached to a Silicon Image controller?

Also, my problem seems to be somehow related to network traffic. The way we cause the system to hang is by telling a remote Sybase server to dump its databases onto it (over NFS) one at a time. It does not have to be an NFS server -- when I tried to restore(8) a filesystem on this SATA drive from a dump stored on another machine, the machine hung the same way.

I just rebuilt the kernel again with NET_WITH_GIANT (to set mpsafenet to 0) and it looks much better -- it is about an hour into the process that used to hang it within 25 minutes or so. If it survives the whole dump by tomorrow, then -- I guess -- there is some interaction between the network and ata.

I'm convinced ata is somehow involved, because when we write onto our raid array (amrd0) on the same machine, there are no hangs. Could it be that ata is generally safe, but the recovery from WRITE_DMA/READ_DMA failures is not?

-mi
After putting my new workstation into heavier use, I'm seeing these same problems on i386. This machine (current from Oct 20) has two Raptor disks connected to the two on-board connectors of the Silicon Image 3112A controller.

There are occasional WRITE_DMA errors from these drives. Once in a while, the machine locks solid after one such error.

A good way of reproducing this seems to be to run cvsup mirroring the entire CVS repository to one of the local drives, while cvs is extracting the src tree from the local repository onto another drive. This may succeed, but may also lead to either a hang or a ufs panic.

I plan to change the status of this PR back to 'open' -- if you need more 'feedback', just ask. Thanks!

-mi
Mikhail Teterin wrote:
> After putting my new workstation into heavier use, I'm seeing this
> same problems on i386. This machine (current from Oct 20) has two
> Raptor disks connected to the two on-board connectors of Silicon Image
> 3112A controller.
>
> There are occasional WRITE_DMA errors from these drives. Once in a
> while, the machine locks solid after one of such errors.
>
> A good way of reproducing seems to be to run cvsup mirroring the
> entire CVS-repository to one of the local drives, while cvs is
> extracting the src tree from the local repository onto another drive.
> This may succeed, but may also lead to either a hang or a ufs-panic.
>
> I plan to change the status of this PR back to 'open' -- if you need
> more 'feedback', just ask. Thanks!

Well, I can beat on mine (sii3112 and 70G raptors) for days without a hiccup. What's the motherboard it's sitting in?

--
-Søren
=Well I can beat on mine (sii3112 and 70G raptors) for days without hickup.
=Whats the motherboard its sitting in?

I see this in two computers now:

i386: Pentium4 @3.06GHz in Gigabyte's GA-SINXP1394, SiI-3112A controller(s),
amd64: single Opteron @1.8GHz in a dual-capable Tyan K8W, SiI-3114

Directing a Sybase server (on Solaris) to dump its databases onto FreeBSD disks will wedge the systems reliably, although not immediately. A full dump of all databases here takes about 15 hours (we estimate). The amd64 machine running 5.3-stable hangs about 6 hours into it. (Under recent -current it tends to panic much earlier.)

The FreeBSD machines both have gigabit cards (bge on the Opteron, em on i386); the NFS client -- Sybase -- has only a 100Mb card, but manages to sustain above 11Mb/s writes anyway. After some time the FreeBSD disks begin to glitch. At a certain point, the machines hang :-(

Some hangs seem to have occurred during rwhod's updates (in parallel with the Sybase writes), for example -- /var is a different partition from the one being pounded, but resides on the same disk.

Maybe the difference is in the source of the "pounding" -- network (rwhod, cvsup or NFS), rather than local writing/reading?

-mi
State Changed From-To: feedback->open Fulfill the earlier threat :-)
Another day, another hang. Right now, the machine is up, but the disk seems dead -- the running processes run (systat keeps me updated once per second), but the paged-out ones can't be paged back in (vm_pager complaining about the disk), and no new processes can start.

The machine has been in this state for about 11 hours now -- after 6 or 7 hours of hard work and a great many WRITE_DMA failures.

Can the driver, perhaps, slow the channel speed gradually on errors -- from SATA150 down to 120, say -- either automatically, or through a sysctl knob? Thanks!

Someone in the Linux world has similar troubles too:

http://www.ussg.iu.edu/hypermail/linux/kernel/0407.2/0127.html

SiI's own IDE driver for Linux can be obtained here:

http://12.24.47.40/display/2n/kb/article.asp?aid=10485&s=1

Perhaps someone with knowledge of ATA (hint-hint) can look at the two files (siimage.c and siimage.h) and see immediately what kind of work-around is needed for SiI to work reliably? Thanks!

-mi
The Linux folk seem to have come to the conclusion that SiL controllers issue unusual (but SATA spec compliant) writes to the drive, if the host sends more than 15 LBAs of data to the controller in a single DMA. These unusual SATA transfers expose problems with the firmware in some drives, including a number of Seagate models.

The Linux sata_sil driver now has a quirks mode, which prevents writes larger than 15 blocks if the attached drive is on a blacklist. This workaround apparently gives stability at the cost of performance. The only other suggested fix seems to be "don't use a blacklisted drive and an SiL controller". :-(

Here is one thread about this in the linux-kernel mailing list archives:

http://lkml.org/lkml/2004/8/12/8

Regards,
--
Neil Hoggarth
Departmental Computing Manager <neil.hoggarth@physiol.ox.ac.uk>
Laboratory of Physiology http://www.physiol.ox.ac.uk/~njh/
University of Oxford, UK
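As an illustration of the workaround described in the message above, here is a minimal Python sketch of capping each write at 15 blocks by splitting a larger request into consecutive pieces. It is not the sata_sil code, and the exact condition the real driver tests may differ from this description; the function name, the constant, and the (LBA, count) request shape are assumptions made for the example.

```python
MAX_BLOCKS_PER_WRITE = 15  # limit as described in the message above


def split_write(lba, count, max_blocks=MAX_BLOCKS_PER_WRITE):
    """Yield (start_lba, block_count) pieces, none larger than max_blocks.

    The pieces are contiguous and together cover exactly the
    original request [lba, lba + count).
    """
    if count < 0 or max_blocks <= 0:
        raise ValueError("count must be >= 0 and max_blocks must be > 0")
    while count > 0:
        n = min(count, max_blocks)
        yield lba, n
        lba += n
        count -= n
```

A 32-block write starting at LBA 100 becomes three commands instead of one, which shows where the performance cost mentioned above comes from: more commands, and more per-command overhead, for the same amount of data.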
I'm seeing these problems with different drives on different machines (i386, amd64).

I suspected the problem might have something to do with the disk overheating, so we tried shutting the machine down for a day. Upon a fresh start, the log is already full of errors -- several per minute.

I tried using ataidle (from /usr/ports/sysutils/ataidle) to alter the drive's power consumption and/or acoustic level, but all attempts to do so got

ad6: FAILURE - SETFEATURES 0x05 status=51<READY,DSC,ERROR> error=4<ABORTED>
ad6: FAILURE - SETFEATURES 0x42 status=51<READY,DSC,ERROR> error=4<ABORTED>

for all possible power and acoustic levels. Are these features supposed to work for SATA drives?

The worst part, however, is what happens when the driver gives up (FAILURE): programs may die (segfaulting vm_pager) or the OS can panic -- in filesystem code.

Maybe the driver should try to _really_ restart the drive -- by forcing it to spin down, wait, and spin up again? If the NFS client code is willing to wait forever for the remote server to come back up, why can't ata?

Yours,

-mi
State Changed From-To: open->closed There is no point in spinning the drive down and then up; it accomplishes nothing but drive bearing wear. The drive electronics need to be reset, and that has been done in ATA for ages. You should try the latest -current and check what that brings. I'm using a sii3112 and problematic drives in a server here just to have real-life testing, and that works, modulo the eventual timeouts that are expected.
> modulo the eventual timeouts thats expected.

I am afraid you misunderstood my suggestion. I think the driver should not give up in case of a timeout. Some places in the kernel do not expect such a failure -- the pager may panic if it cannot write a page to swap, and file systems may become corrupted if they cannot flush cached data.

If the NFS client can (and by default will) wait forever for the server to come back, the ATA code should wait forever for the write to succeed.

It is difficult for me to try the new code, because the original motherboard where we saw the problem has long been replaced, and the other system no longer exhibits the problem since I tinkered with PCI timings a little bit.

-mi
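The two policies being argued over here, giving up after a fixed number of retries versus waiting indefinitely in the manner of an NFS hard mount, can be contrasted in a small Python sketch. This is only a model of the argument, not ata(4) code: `write_until_done`, `FlakyDisk`, and their parameters are hypothetical names introduced for the example.

```python
import time


def write_until_done(do_write, max_retries=None, delay=0.0, sleep=time.sleep):
    """Call do_write() until it succeeds; return the number of attempts.

    max_retries=None models the "wait forever" policy suggested above.
    An integer models a bounded policy: after that many retries the
    last error is re-raised to the caller.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            do_write()
            return attempts
        except IOError:
            if max_retries is not None and attempts > max_retries:
                raise  # bounded policy: give up and report the failure
            sleep(delay)  # pause before the next attempt


class FlakyDisk:
    """Hypothetical device whose first `failures` writes time out."""

    def __init__(self, failures):
        self.failures = failures

    def write(self):
        if self.failures > 0:
            self.failures -= 1
            raise IOError("WRITE_DMA timeout")
```

With `max_retries=None` the caller never sees the error, but it also blocks for as long as the device stays broken. With a bound, the error is passed up to code that, as argued above, may not be prepared to handle it.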