A clean installation using FreeBSD media causes errors when DMA mode is used to access the IDE disks.

Messages during installation:
ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=2570528
ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=2570624
ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=2574752

If the system is installed using hw.ata.ata_dma=0, the following happens when it is booted with DMA enabled:
dc1: Ethernet address: 00:03:ba:0f:22:55
dc1: if_start running deferred for Giant
dc1: [GIANT-LOCKED]
pci0: <serial bus, USB> at device 10.0 (no driver attached)
atapci0: <AcerLabs M5229 UDMA66 controller> port 0x10200-0x10207,0x10218-0x1021b,0x10210-0x10217,0x10208-0x1020b,0x10220-0x1022f at device 13.0 on pci0
ata2: <ATA channel 0> on atapci0
ata3: <ATA channel 1> on atapci0
stray level interrupt 14
rtc0: <Real Time Clock> at port 0x70-0x71 on isa0
uart0: <16550 or compatible> at port 0x3f8-0x3ff irq 43 on isa0
uart0: console (9600,n,8,1)
uart1: <16550 or compatible> at port 0x2e8-0x2ef irq 43 on isa0
Timecounters tick every 1.000 msec
ad0: 38166MB <Seagate ST340824A 3.28> at ata2-master UDMA66
acd0: CDRW <RICOH CD-R/RW MP7200A/1.30> at ata3-master UDMA33
Trying to mount root from ufs:/dev/ad0a
/libexec/ld-elf.so.1: /lib/libncurses.so.5: invalid file format
Enter full pathname of shell or RETURN for /bin/sh:

or:
dc1: Ethernet address: 00:03:ba:0f:22:55
dc1: if_start running deferred for Giant
dc1: [GIANT-LOCKED]
pci0: <serial bus, USB> at device 10.0 (no driver attached)
atapci0: <AcerLabs M5229 UDMA66 controller> port 0x10200-0x10207,0x10218-0x1021b,0x10210-0x10217,0x10208-0x1020b,0x10220-0x1022f at device 13.0 on pci0
ata2: <ATA channel 0> on atapci0
ata3: <ATA channel 1> on atapci0
stray level interrupt 14
rtc0: <Real Time Clock> at port 0x70-0x71 on isa0
uart0: <16550 or compatible> at port 0x3f8-0x3ff irq 43 on isa0
uart0: console (9600,n,8,1)
uart1: <16550 or compatible> at port 0x2e8-0x2ef irq 43 on isa0
Timecounters tick every 1.000 msec
ad0: 38166MB <Seagate ST340824A 3.28> at ata2-master UDMA66
acd0: CDRW <RICOH CD-R/RW MP7200A/1.30> at ata3-master UDMA33
Trying to mount root from ufs:/dev/ad0a
init in malloc(): error: recursive call
init in malloc(): error: recursive call
init in malloc(): error: recursive call
init in malloc(): error: recursive call
init in malloc(): error: recursive call
init in malloc(): error: recursive call
init in malloc(): error: recursive call
init in malloc(): error: recursive call
init in malloc(): error: recursive call
init in malloc(): error: recursive call
init in malloc(): error: recursive call
init in malloc(): error: recursive call

The system works fine with hw.ata.ata_dma=0:
dc1: Ethernet address: 00:03:ba:0f:22:55
dc1: if_start running deferred for Giant
dc1: [GIANT-LOCKED]
pci0: <serial bus, USB> at device 10.0 (no driver attached)
atapci0: <AcerLabs M5229 UDMA66 controller> port 0x10200-0x10207,0x10218-0x1021b,0x10210-0x10217,0x10208-0x1020b,0x10220-0x1022f at device 13.0 on pci0
ata2: <ATA channel 0> on atapci0
ata3: <ATA channel 1> on atapci0
rtc0: <Real Time Clock> at port 0x70-0x71 on isa0
uart0: <16550 or compatible> at port 0x3f8-0x3ff irq 43 on isa0
uart0: console (9600,n,8,1)
uart1: <16550 or compatible> at port 0x2e8-0x2ef irq 43 on isa0
Timecounters tick every 1.000 msec
ad0: 38166MB <Seagate ST340824A 3.28> at ata2-master PIO4
acd0: CDRW <RICOH CD-R/RW MP7200A/1.30> at ata3-master UDMA33
Trying to mount root from ufs:/dev/ad0a
Loading configuration files.
Entropy harvesting: interrupts ethernet point_to_point kickstart.
swapon: adding /dev/ad0b as swap device
Starting file system checks:
/dev/ad0a: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad0a: clean, 102079 free (975 frags, 12638 blocks, 0.8% fragmentation)
/dev/ad0e: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad0e: clean, 127341 free (29 frags, 15914 blocks, 0.0% fragmentation)
/dev/ad0f: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad0f: clean, 17986047 free (4295 frags, 2247719 blocks, 0.0% fragmentation)
/dev/ad0d: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad0d: clean, 127258 free (42 frags, 15902 blocks, 0.0% fragmentation)

Fix:
If DMA is not used (hw.ata.ata_dma=0 in the boot loader), the messages go away and the HDD can be accessed without errors, but only in PIO4.

How-To-Repeat:
Try to access IDE drives in a Sun Netra X1 using DMA mode. Tested with FreeBSD installation media 5.3-RELEASE and 6.0-CURRENT-SNAP004. Earlier releases were not tested.
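Whether the hw.ata.ata_dma=0 workaround took effect can be read off the disk attach line in the boot messages (PIO4 versus UDMA66 for ad0). The following is only a minimal sketch of such a check: the regular expression is modelled on the attach lines shown in this report, and the names ATTACH_RE and transfer_modes are made up for illustration and are not part of any FreeBSD tool.

```python
import re

# Modelled on attach lines from this report, e.g.
# "ad0: 38166MB <Seagate ST340824A 3.28> at ata2-master UDMA66".
# Hypothetical helper, not part of FreeBSD.
ATTACH_RE = re.compile(
    r"^(?P<dev>a(?:d|cd)\d+): .*<(?P<model>[^>]*)> at "
    r"(?P<channel>ata\d+)-(?P<role>master|slave) (?P<mode>\S+)$"
)


def transfer_modes(dmesg_text):
    """Return {device: transfer mode} for every attach line in dmesg_text."""
    modes = {}
    for line in dmesg_text.splitlines():
        match = ATTACH_RE.match(line.strip())
        if match:
            modes[match.group("dev")] = match.group("mode")
    return modes


# Lines taken from the hw.ata.ata_dma=0 boot shown in this report.
sample = """\
Timecounters tick every 1.000 msec
ad0: 38166MB <Seagate ST340824A 3.28> at ata2-master PIO4
acd0: CDRW <RICOH CD-R/RW MP7200A/1.30> at ata3-master UDMA33
Trying to mount root from ufs:/dev/ad0a
"""

print(transfer_modes(sample))
```

With the sample above the disk is reported in PIO4 while the CD-RW drive still attaches in UDMA33, matching what the report describes.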
On Wed, Jun 15, 2005 at 09:14:32AM +0000, Sebastian Koehler wrote:
>
> A clean installation using FreeBSD media causes errors when DMA mode is used to access the IDE disks.
>
> Messages during installation:
> ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=2570528
> ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=2570624
> ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=2574752
>

Is this with original drives from Sun or with vanilla off-the-shelf ones?
The drive has a yellow Sun label on it with P/N 370-4419-01. It's a 40GB Seagate, model ST340824A.
Hi list,

I got the attached patch from Marius Strobl. Unfortunately, it did not fix the problem. When ata_generic_reset(dev) is placed below the if (ctlr...) in ata_ali_reset() and the system is booted with hw.ata.ata_dma=1, data corruption still occurs. See the next lines for details.

...
Additional routing options:.
Starting devd.
Mounting NFS file systems:.
Creating and/or trimming log files:.
Starting syslogd.
Checking for core dump on /dev/ad0b...
/libexec/ld-elf.so.1: /lib/libz.so.2: invalid file format
ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/local/lib
Starting local daemons:.
...

sunshine# sysctl -a | grep ata_dma
hw.ata.ata_dma: 1
sunshine#
sunshine# dmesg
...
uart1: <16550 or compatible> at port 0x2e8-0x2ef irq 43 on isa0
Timecounters tick every 1.000 msec
ad0: 38166MB <Seagate ST340824A 3.28> at ata2-master UDMA66
Trying to mount root from ufs:/dev/ad0a
ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=256
ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=394976
ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=3635040
dc0: failed to force tx and rx to idle state
dc0: failed to force tx and rx to idle state
dc0: failed to force tx and rx to idle state
dc0: failed to force tx and rx to idle state
ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=224

Sebastian
I just wanted to report that I experienced the same problem with my Sun Blade 100 system (using both the stock Sun hard drive and two off-the-shelf drives, one Seagate 7200.7 ATA100 and an older WD 8GB) running 5.4-RELEASE. I'm guessing this impacts all Blade 100/150, Netra X1 and Fire V100 servers as they all seem to use the same chipset (give or take a little bit). Setting hw.ata.ata_dma to 0 does fix the problem with my Blade 100 (and Blade 150, nearly if not identical motherboard but US-IIi instead of US-IIe). I haven't run into the same problem while running 5.4-RELEASE or 6.0-BETA on a Sun Ultra 10 using a Seagate Barracuda IV 40GB drive; probably because it uses a different ATA controller (which isn't great to begin with). -- Linh Pham question@closedsrc.org http://closedsrc.org/
On Wed, Jun 15, 2005 at 09:14:32AM +0000, Sebastian Koehler wrote:
>
> >Number: 82261
> >Category: sparc64
> >Synopsis: DMA-support on Sparc64 broken
> >Confidential: no
> >Severity: serious
> >Priority: high
> >Responsible: freebsd-sparc64
> >State: closed
> >Quarter:
> >Keywords:
> >Date-Required:
> >Class: sw-bug
> >Submitter-Id: current-users
> >Arrival-Date: Wed Jun 15 09:20:16 GMT 2005
> >Closed-Date:
> >Last-Modified:
> >Originator: Sebastian Koehler
> >Release: 6.0-CURRENT-SNAP004
> >Organization:
> >Environment:
> FreeBSD 6.0-20050601-SNAP FreeBSD 6.0-20050601-SNAP #0: Thu Jun 2 05:29:17 UTC 2005 root@u60.samsco.home:/usr/obj/usr/src/sys/GENERIC sparc64
> >Description:
> A clean installation using FreeBSD media causes errors when DMA mode is used to access the IDE disks.
>
> Messages during installation:
> ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=2570528
> ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=2570624
> ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=2574752
>
> If the system is installed using hw.ata.ata_dma=0, the following happens when it is booted with DMA enabled:
> dc1: Ethernet address: 00:03:ba:0f:22:55
> dc1: if_start running deferred for Giant
> dc1: [GIANT-LOCKED]
> pci0: <serial bus, USB> at device 10.0 (no driver attached)
> atapci0: <AcerLabs M5229 UDMA66 controller> port 0x10200-0x10207,0x10218-0x1021b,0x10210-0x10217,0x10208-0x1020b,0x10220-0x1022f at device 13.0 on pci0
> ata2: <ATA channel 0> on atapci0
> ata3: <ATA channel 1> on atapci0
> stray level interrupt 14
> rtc0: <Real Time Clock> at port 0x70-0x71 on isa0
> uart0: <16550 or compatible> at port 0x3f8-0x3ff irq 43 on isa0
> uart0: console (9600,n,8,1)
> uart1: <16550 or compatible> at port 0x2e8-0x2ef irq 43 on isa0
> Timecounters tick every 1.000 msec
> ad0: 38166MB <Seagate ST340824A 3.28> at ata2-master UDMA66
> acd0: CDRW <RICOH CD-R/RW MP7200A/1.30> at ata3-master UDMA33
> Trying to mount root from ufs:/dev/ad0a
> /libexec/ld-elf.so.1: /lib/libncurses.so.5: invalid file format
> Enter full pathname of shell or RETURN for /bin/sh:
>
> or:
> dc1: Ethernet address: 00:03:ba:0f:22:55
> dc1: if_start running deferred for Giant
> dc1: [GIANT-LOCKED]
> pci0: <serial bus, USB> at device 10.0 (no driver attached)
> atapci0: <AcerLabs M5229 UDMA66 controller> port 0x10200-0x10207,0x10218-0x1021b,0x10210-0x10217,0x10208-0x1020b,0x10220-0x1022f at device 13.0 on pci0
> ata2: <ATA channel 0> on atapci0
> ata3: <ATA channel 1> on atapci0
> stray level interrupt 14
> rtc0: <Real Time Clock> at port 0x70-0x71 on isa0
> uart0: <16550 or compatible> at port 0x3f8-0x3ff irq 43 on isa0
> uart0: console (9600,n,8,1)
> uart1: <16550 or compatible> at port 0x2e8-0x2ef irq 43 on isa0
> Timecounters tick every 1.000 msec
> ad0: 38166MB <Seagate ST340824A 3.28> at ata2-master UDMA66
> acd0: CDRW <RICOH CD-R/RW MP7200A/1.30> at ata3-master UDMA33
> Trying to mount root from ufs:/dev/ad0a
> init in malloc(): error: recursive call
> init in malloc(): error: recursive call
> init in malloc(): error: recursive call
> init in malloc(): error: recursive call
> init in malloc(): error: recursive call
> init in malloc(): error: recursive call
> init in malloc(): error: recursive call
> init in malloc(): error: recursive call
> init in malloc(): error: recursive call
> init in malloc(): error: recursive call
> init in malloc(): error: recursive call
> init in malloc(): error: recursive call
>
> The system works fine with hw.ata.ata_dma=0:
> dc1: Ethernet address: 00:03:ba:0f:22:55
> dc1: if_start running deferred for Giant
> dc1: [GIANT-LOCKED]
> pci0: <serial bus, USB> at device 10.0 (no driver attached)
> atapci0: <AcerLabs M5229 UDMA66 controller> port 0x10200-0x10207,0x10218-0x1021b,0x10210-0x10217,0x10208-0x1020b,0x10220-0x1022f at device 13.0 on pci0
> ata2: <ATA channel 0> on atapci0
> ata3: <ATA channel 1> on atapci0
> rtc0: <Real Time Clock> at port 0x70-0x71 on isa0
> uart0: <16550 or compatible> at port 0x3f8-0x3ff irq 43 on isa0
> uart0: console (9600,n,8,1)
> uart1: <16550 or compatible> at port 0x2e8-0x2ef irq 43 on isa0
> Timecounters tick every 1.000 msec
> ad0: 38166MB <Seagate ST340824A 3.28> at ata2-master PIO4
> acd0: CDRW <RICOH CD-R/RW MP7200A/1.30> at ata3-master UDMA33
> Trying to mount root from ufs:/dev/ad0a
> Loading configuration files.
> Entropy harvesting: interrupts ethernet point_to_point kickstart.
> swapon: adding /dev/ad0b as swap device
> Starting file system checks:
> /dev/ad0a: FILE SYSTEM CLEAN; SKIPPING CHECKS
> /dev/ad0a: clean, 102079 free (975 frags, 12638 blocks, 0.8% fragmentation)
> /dev/ad0e: FILE SYSTEM CLEAN; SKIPPING CHECKS
> /dev/ad0e: clean, 127341 free (29 frags, 15914 blocks, 0.0% fragmentation)
> /dev/ad0f: FILE SYSTEM CLEAN; SKIPPING CHECKS
> /dev/ad0f: clean, 17986047 free (4295 frags, 2247719 blocks, 0.0% fragmentation)
> /dev/ad0d: FILE SYSTEM CLEAN; SKIPPING CHECKS
> /dev/ad0d: clean, 127258 free (42 frags, 15902 blocks, 0.0% fragmentation)
> >How-To-Repeat:
> Try to access IDE drives in a Sun Netra X1 using DMA mode. Tested with FreeBSD installation media 5.3-RELEASE and 6.0-CURRENT-SNAP004. Earlier releases were not tested.
> >Fix:
> If DMA is not used (hw.ata.ata_dma=0 in the boot loader), the messages go away and the HDD can be accessed without errors, but only in PIO4.

Søren, could you please look into this? AFAIK you also have a Sun Netra X1. Like a couple of other Sun models, these use an onboard AcerLabs M5229 rev. 0xc3, and at least the 'TIMEOUT - WRITE_DMA retrying' warnings have been reported for pretty much all of them; it seems much less likely to experience them with the original Sun-supplied disks, though. On the other hand, there are a few reports, like <200508071916.50197.Chris@LainOS.org> on freebsd-current@ and this PR, that the ATA disks aren't usable at all. The problems seem to have started some time in the earlier 5.x days, but an exact date isn't known, and they still persist after ATA mkIII. AFAICT the problems are also limited to UDMA66 and don't happen when restricting to UDMA33. Given that this also affects a couple of other models like the AX1105, Blade 100, Fire V100, etc., and it's not possible to plug in another controller on some of them, this unfortunately is a show-stopper type of problem.

The AcerLabs M5229 rev. 0xc3 is also known to suffer from a silicon bug that can cause data corruption but which doesn't seem to be the cause of the above-mentioned problems (the workaround is to disable and re-enable the respective channel via the IDE interface control register of the accompanying ISA bridge on reset; see the audit trail of the PR for a patch; the info is from OpenSolaris, and an equivalent patch was also incorporated into NetBSD). The data corruption issue has been seen under FreeBSD in the past, before other issues like the WRITE_DMA timeouts occurred, and only seems to happen occasionally, without causing permanent problems like the other issues.

Thanks.
Responsible Changed
From-To: freebsd-sparc64->sos

Over to sos.
State Changed
From-To: open->analyzed

Hmm, my (only) SUN is a Netra T1 which has the c3 step 5229 chip as well; however, it doesn't seem to have any problems at all using ATA66 disks, so at least the problem is not "universal". Anyhow, I seem to recall a SUN engineer once telling me that they did change some HW to make it less "friendly" to non-SUN supplied disks, but I forgot the details. I guess the best fix would be to simply disallow modes beyond ATA33 on SUN hardware, as it seems that would allow at least some DMA on those affected systems, or did I get that wrong?
> Hmm, my (only) SUN is a Netra T1 which has the c3 step 5229 chip as well;
> however, it doesn't seem to have any problems at all using ATA66 disks, so
> at least the problem is not "universal".

By "ATA66 disks" you mean you also gave it a try with non-Sun supplied disks in addition to the originally supplied one?

> Anyhow, I seem to recall a SUN engineer once telling me that they did change
> some HW to make it less "friendly" to non-SUN supplied disks, but I forgot
> the details.
> I guess the best fix would be to simply disallow modes beyond ATA33 on SUN
> hardware, as it seems that would allow at least some DMA on those affected
> systems, or did I get that wrong?

From my own experience and other reports I can say that the WRITE_DMA timeouts (which in general don't seem to be fatal, they're just reported over and over again) vanish when limiting to ATA33, but I don't know about the particular problem of this PR (permanent data corruption). Sebastian, could you please test what happens if you just limit to ATA33 instead of disabling DMA completely? The simplest approach to achieve this probably is to replace the 80-pin cable with a 40-pin one.

Disallowing modes beyond ATA33 on Sun hardware in general as a "fix" for the WRITE_DMA timeouts (and maybe this PR) is pretty gross, as there don't seem to be problems with ATA66 and up when putting PCI ATA add-on cards into sparc64 machines (modulo stuff that needs the firmware on these cards to be executed). So restricting the ATA33 limitation on sparc64 to the onboard M5229 rev. 0xc3 would be desirable. Also, as most of the onboard M5229 seem to work just fine at ATA66 on sparc64 when using Sun-supplied disks, modulo the occasional data corruption due to the silicon bug, it would be nice to default to ATA33 with these controllers but make this limitation easily overridable (or maybe don't limit to ATA33 by default but throttle the DMA mode in case a WRITE_DMA timeout occurs).

Marius
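The two policies proposed above (cap the onboard M5229 rev. 0xc3 at ATA33 by default with an override, or start unrestricted and throttle after a WRITE_DMA timeout) can be sketched as follows. This is only an illustration of the idea, not ata(4) code: the mode list is reduced to the three modes named in this thread, and initial_mode, throttle_on_timeout and the allow_udma66 switch are hypothetical names.

```python
# Slowest to fastest; reduced to the modes that appear in this thread.
MODES = ["PIO4", "UDMA33", "UDMA66"]


def initial_mode(drive_max, onboard_m5229_c3, allow_udma66=False):
    """Policy 1: cap the onboard M5229 rev. 0xc3 at UDMA33 unless overridden.

    allow_udma66 stands in for a hypothetical override knob.
    """
    cap = "UDMA66"
    if onboard_m5229_c3 and not allow_udma66:
        cap = "UDMA33"
    # The slower of the drive's best mode and the controller cap wins.
    return min(drive_max, cap, key=MODES.index)


def throttle_on_timeout(current):
    """Policy 2: step down one mode after a WRITE_DMA timeout."""
    index = MODES.index(current)
    return MODES[max(index - 1, 0)]


print(initial_mode("UDMA66", onboard_m5229_c3=True))
print(initial_mode("UDMA66", onboard_m5229_c3=True, allow_udma66=True))
print(throttle_on_timeout("UDMA66"))
```

Add-on PCI ATA cards would be passed with onboard_m5229_c3=False and so keep their full speed, which is the distinction the message argues for.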
It is exactly as you described, Marius. When I use an old 40-pin IDE cable, the system works fine. Please see dmesg for details:

uhub0: 2 ports with 2 removable, self powered
atapci0: <AcerLabs M5229 UDMA66 controller> port 0x10200-0x10207,0x10218-0x1021b,0x10210-0x10217,0x10208-0x1020b,0x10220-0x1022f at device 13.0 on pci0
ata2: <ATA channel 0> on atapci0
ata3: <ATA channel 1> on atapci0
nexus0: <syscons>, type (unknown) (no driver attached)
rtc0: <Real Time Clock> at port 0x70-0x71 pnpid @@Kd041 on isa0
uart0: <16550 or compatible> at port 0x3f8-0x3ff irq 43 pnpid @HEd041 on isa0
uart0: console (9600,n,8,1)
uart1: <16550 or compatible> at port 0x2e8-0x2ef irq 43 pnpid @HEd041 on isa0
Timecounters tick every 1.000 msec
ad0: DMA limited to UDMA33, controller found non-ATA66 cable
ad0: 38166MB <Seagate ST340824A 3.28> at ata2-master UDMA33
Trying to mount root from ufs:/dev/ad0a
dc0: failed to force tx and rx to idle state
dc0: failed to force tx and rx to idle state
dc0: failed to force tx and rx to idle state
dc0: failed to force tx and rx to idle state
dc1: link state changed to UP

At this point the system is able to access disks via UDMA33 and no data gets lost. I just wonder why Solaris has no problem accessing the disk via UDMA66. I hope you can send me a patch for testing purposes or commit it into HEAD.

Best Regards,
Sebastian
Hope you can find something in the output.

# pciconf -l -v
isab0@pci0:7:0: class=0x060100 card=0x153310b9 chip=0x153310b9 rev=0x00 hdr=0x00
    vendor = 'Acer Labs Incorporated (ALi)'
    device = 'ALI M1533 Aladdin IV ISA Bridge'
    class = bridge
    subclass = PCI-ISA
none0@pci0:3:0: class=0x000000 card=0x00000000 chip=0x710110b9 rev=0x00 hdr=0x00
    vendor = 'Acer Labs Incorporated (ALi)'
    device = 'ALI M7101 Power Management Controller'
    class = old
    subclass = non-VGA display device
none1@pci0:3:0: class=0x000000 card=0x00000000 chip=0x710110b9 rev=0x00 hdr=0x00
    vendor = 'Acer Labs Incorporated (ALi)'
    device = 'ALI M7101 Power Management Controller'
    class = old
    subclass = non-VGA display device
dc0@pci0:12:0: class=0x020000 card=0x00000000 chip=0x91021282 rev=0x31 hdr=0x00
    vendor = 'Davicom Semiconductor Inc.'
    device = 'DM9102/A/AF Dell 4300S - CNET Pro200WL Ethernet Adapter'
    class = network
    subclass = ethernet
dc1@pci0:5:0: class=0x020000 card=0x00000000 chip=0x91021282 rev=0x31 hdr=0x00
    vendor = 'Davicom Semiconductor Inc.'
    device = 'DM9102/A/AF Dell 4300S - CNET Pro200WL Ethernet Adapter'
    class = network
    subclass = ethernet
ohci0@pci0:10:0: class=0x0c0310 card=0x00000000 chip=0x523710b9 rev=0x03 hdr=0x00
    vendor = 'Acer Labs Incorporated (ALi)'
    device = 'M5237 OpenHCI 1.1 USB Controller'
    class = serial bus
    subclass = USB
atapci0@pci0:13:0: class=0x0101ff card=0x00000000 chip=0x522910b9 rev=0xc3 hdr=0x00
    vendor = 'Acer Labs Incorporated (ALi)'
    device = 'M1543 Southbridge EIDE Controller'
    class = mass storage
    subclass = ATA

# pciconf -r -b atapci0@pci0:13:0: 0:255
b9 10 29 52 05 00 90 02 c3 ff 01 01 00 10 00 00
01 02 01 00 19 02 01 00 11 02 01 00 09 02 01 00
21 02 01 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 60 00 00 00 00 00 00 00 0c 01 02 04
06 00 00 7f 00 00 00 00 00 02 00 c9 00 80 ba 1a
03 00 00 81 50 55 44 44 01 00 31 00 03 00 00 00
01 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Here is the output of the registers at different modes.

PIO4:
b9 10 29 52 05 00 90 02 c3 ff 01 01 00 10 00 00
01 02 01 00 19 02 01 00 11 02 01 00 09 02 01 00
21 02 01 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 60 00 00 00 00 00 00 00 0c 01 02 04
06 00 00 7f 00 00 00 00 00 02 01 c9 00 80 ba 1a
03 00 00 81 50 55 44 44 01 00 31 00 03 00 00 00
01 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

UDMA33:
b9 10 29 52 05 00 90 02 c3 ff 01 01 00 10 00 00
01 02 01 00 19 02 01 00 11 02 01 00 09 02 01 00
21 02 01 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 60 00 00 00 00 00 00 00 0c 01 02 04
06 00 00 7f 00 00 00 00 00 02 01 c9 00 80 ba 1a
03 00 00 81 55 55 4a 44 01 00 31 00 03 00 00 00
01 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

UDMA66:
b9 10 29 52 05 00 90 02 c3 ff 01 01 00 10 00 00
01 02 01 00 19 02 01 00 11 02 01 00 09 02 01 00
21 02 01 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 60 00 00 00 00 00 00 00 0c 01 02 04
06 00 00 7f 00 00 00 00 00 02 00 c9 00 80 ba 1a
03 00 00 81 55 55 48 44 01 00 31 00 03 00 00 00
01 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
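The three dumps are easier to compare mechanically than by eye. The sketch below lists the bytes that differ between the modes. To stay short it embeds only a 32-byte excerpt of each dump, starting at the byte sequence 06 00 00 7f; treating that as configuration offset 0x40 is an assumption that rests on the standard 64-byte PCI header preceding it. The function name differing_offsets is made up for illustration, and no meaning is attached here to the individual registers.

```python
# Excerpt of each dump above, assumed to start at config offset 0x40
# (directly after the standard 64-byte PCI header).
BASE = 0x40
DUMPS = {
    "PIO4": "06 00 00 7f 00 00 00 00 00 02 01 c9 00 80 ba 1a "
            "03 00 00 81 50 55 44 44 01 00 31 00 03 00 00 00",
    "UDMA33": "06 00 00 7f 00 00 00 00 00 02 01 c9 00 80 ba 1a "
              "03 00 00 81 55 55 4a 44 01 00 31 00 03 00 00 00",
    "UDMA66": "06 00 00 7f 00 00 00 00 00 02 00 c9 00 80 ba 1a "
              "03 00 00 81 55 55 48 44 01 00 31 00 03 00 00 00",
}


def differing_offsets(dumps, base=0):
    """Return {offset: {mode: byte}} for every offset where the dumps disagree."""
    rows = {name: bytes.fromhex(text) for name, text in dumps.items()}
    length = min(len(row) for row in rows.values())
    diffs = {}
    for i in range(length):
        values = {name: row[i] for name, row in rows.items()}
        if len(set(values.values())) > 1:
            diffs[base + i] = values
    return diffs


for offset, values in differing_offsets(DUMPS, BASE).items():
    cells = ", ".join(f"{name}={value:02x}" for name, value in values.items())
    print(f"0x{offset:02x}: {cells}")
```

Under the offset assumption stated above, only three bytes change between the modes (0x4a, 0x54 and 0x56); everything else in the excerpt is identical.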
On 16/08/2005, at 21:10, Sebastian Koehler wrote:

> Here is the output of the registers at different modes.

(dumps deleted)

Looks just as they should and exactly as what I see on my (working) Netra T1, which is what I expected but better safe....

I can only assume that SUN did something "smart" to the HW that you need to know about to make things work. Someone with access to the failing HW and the proper measurement / testing equipment needs to dig in there and figure out how/what they changed.

We can lobotomize ATA to only use at max UDMA33 on this Acer chip on the sparc platform as a workaround, ugly but functional :)

- Søren
I've noticed your changes committed to HEAD. I've rebuilt world and kernel with sources from today. Your workaround did not resolve the issue completely but changed it slightly. When the 80 pin cable is used, timeouts happen only from time to time, but disk access results in error messages. Some samples are attached for you.

1st sample:
GDB: no debug ports present
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 7.0-CURRENT #5: Fri Aug 19 16:14:20 CEST 2005
root@:/usr/obj/usr/src/sys/GENERIC
WARNING: WITNESS option enabled, expect reduced performance.
Timecounter "tick" frequency 500000000 Hz quality 1000
real memory = 536870912 (512 MB)
avail memory = 505454592 (482 MB)
cpu0: Sun Microsystems UltraSparc-IIe Processor (500.00 MHz CPU)
nexus0: <Open Firmware Nexus device>
pcib0: <U2P UPA-PCI bridge> on nexus0
pcib0: Sabre, impl 0, version 0, ign 0x7c0, bus A
pcib0: [FAST]
pcib0: [GIANT-LOCKED]
pcib0: [FAST]
pcib0: [GIANT-LOCKED]
pcib0 dvma: DVMA map: 0x60000000 to 0x63ffffff
pci0: <OFW PCI bus> on pcib0
isab0: <PCI-ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
pci0: <old, non-VGA display device> at device 3.0 (no driver attached)
pci0: <old, non-VGA display device> at device 3.0 (no driver attached)
dc0: <Davicom DM9102A 10/100BaseTX> port 0x10000-0x100ff at device 12.0 on pci0
miibus0: <MII bus> on dc0
ukphy0: <Generic IEEE 802.3u media interface> on miibus0
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
dc0: Ethernet address: 00:03:ba:0f:22:55
dc1: <Davicom DM9102A 10/100BaseTX> port 0x10100-0x101ff mem 0x2000-0x20ff at device 5.0 on pci0
miibus1: <MII bus> on dc1
ukphy1: <Generic IEEE 802.3u media interface> on miibus1
ukphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
dc1: Ethernet address: 00:03:ba:0f:22:55
ohci0: <AcerLabs M5237 (Aladdin-V) USB controller> mem 0x1000000-0x1000fff at device 10.0 on pci0
ohci0: [GIANT-LOCKED]
usb0: OHCI version 1.0, legacy support
usb0: <AcerLabs M5237 (Aladdin-V) USB controller> on ohci0
usb0: USB revision 1.0
uhub0: AcerLabs OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
atapci0: <AcerLabs M5229 UDMA66 controller> port 0x10200-0x10207,0x10218-0x1021b,0x10210-0x10217,0x10208-0x1020b,0x10220-0x1022f at device 13.0 on pci0
atapci0: using PIO transfers above 137GB as workaround for 48bit DMA access bug, expect reduced performance
ata2: <ATA channel 0> on atapci0
ata3: <ATA channel 1> on atapci0
nexus0: <syscons>, type (unknown) (no driver attached)
rtc0: <Real Time Clock> at port 0x70-0x71 pnpid @@Kd041 on isa0
uart0: <16550 or compatible> at port 0x3f8-0x3ff irq 43 pnpid @HEd041 on isa0
uart0: console (9600,n,8,1)
uart1: <16550 or compatible> at port 0x2e8-0x2ef irq 43 pnpid @HEd041 on isa0
Timecounters tick every 1.000 msec
ad0: 38166MB <Seagate ST340824A 3.28> at ata2-master UDMA66
Trying to mount root from ufs:/dev/ad0a
Aug 20 01:17:33 init: can't exec /bin/sh for /etc/rc: Exec format error

2nd sample:
Timecounters tick every 1.000 msec
ad0: 38166MB <Seagate ST340824A 3.28> at ata2-master UDMA66
Trying to mount root from ufs:/dev/ad0a
...
/dev/ad0d: clean, 96562 free (154 frags, 12051 blocks, 0.1% fragmentation)
ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=384
Setting hostname: sunshine.thrillkill.lan.
dc0: failed to force tx and rx to idle state
...
Creating and/or trimming log files:.
Starting syslogd.
ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=2839808
add net default: gateway 192.168.25.254
...
Starting sshd.
g_vfs_done():ad0a[READ(offset=8070450532280647680, length=16384)]error = 5
vnode_pager_getpages: I/O read error
vm_fault: pager read error, pid 358 (sshd)
ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=47264
Segmentation fault (core dumped)
g_vfs_done():ad0a[READ(offset=8070450532280647680, length=16384)]error = 5
vnode_pager_getpages: I/O read error
vm_fault: pager read error, pid 363 (sendmail)
Segmentation fault

Best Regards,
Sebastian
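The read offset in the g_vfs_done() messages is worth a quick sanity check, because it cannot refer to a real location on this disk. The sketch below compares it against the disk size from the ad0 attach line. It assumes 512-byte sectors and treats "38166MB" as 38166 * 2^20 bytes; the offset is relative to the ad0a partition, so the whole-disk size is only a generous upper bound. The helper name offset_is_plausible is made up for illustration.

```python
SECTOR_SIZE = 512                  # assumption: 512-byte sectors
DISK_BYTES = 38166 * 1024 * 1024   # "38166MB" from the ad0 attach line


def offset_is_plausible(offset, length, disk_bytes=DISK_BYTES):
    """True if the byte range [offset, offset + length) fits on the disk."""
    return offset >= 0 and offset + length <= disk_bytes


bad_offset = 8070450532280647680   # from the g_vfs_done() READ message
print(hex(bad_offset))
print(offset_is_plausible(bad_offset, 16384))

# For comparison: the LBA of one of the timed-out writes, as a byte offset.
print(offset_is_plausible(47264 * SECTOR_SIZE, 16384))
```

The offset works out to 0x7000000001f34000, far beyond a 40GB drive, which suggests the file system was handed a corrupted block address rather than that a single sector was unreadable. That would be consistent with the data corruption described in this PR, though it is an inference from the number alone.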
On Tue, Aug 16, 2005 at 03:24:19PM +0200, Sebastian Koehler wrote:
>
> At this point the system is able to access disks via UDMA33 and no data
> gets lost. I just wonder why Solaris has no problem accessing the disk
> via UDMA66. I hope you can send me a patch for testing purposes or commit
> it into HEAD.
>

Well, looks like I found the cause of the endless "TIMEOUT - WRITE_DMA retrying" messages I'm seeing at ATA66. AFAICT the firmware initializes the M5229 to use the ATA66 byte counter (probably also in the higher modes on PATA chips newer than 0xc3) instead of triggering an interrupt at the zero count of the transfer buffer counter, which is not what we want in ata(4) as it doesn't use the ATA66 byte counter. This involves some guesswork as I don't have datasheets for these chips to back this up. Soeren, do you have them and can you confirm this?

With the attached patch (against latest -current) I can no longer provoke the mentioned messages. Sebastian, could you please give it a try to see whether it also solves your problems at ATA66?

Marius
With the patch error messages and the data corruption are gone in UDMA66 mode. I think you've got the right idea, even though without whitepapers. :) Best Regards, Sebastian
On Sat, Aug 20, 2005 at 12:47:36PM +0200, Sebastian Koehler wrote:
> With the patch error messages and the data corruption are gone in UDMA66
> mode. I think you've got the right idea, even though without whitepapers. :)
>

Excellent. Soeren, could you please approve this patch or in case you don't like something about the changes (style, comments, ...) modify them accordingly and commit them yourself?

Marius
On 20/08/2005, at 13:46, Marius Strobl wrote:

> On Sat, Aug 20, 2005 at 12:47:36PM +0200, Sebastian Koehler wrote:
>
>> With the patch error messages and the data corruption are gone in UDMA66
>> mode. I think you've got the right idea, even though without whitepapers. :)
>>
>
> Excellent. Soeren, could you please approve this patch or in case you
> don't like something about the changes (style, comments, ...) modify
> them accordingly and commit them yourself?

I'll look at it, but I'm busy this weekend. I have docs for the chip so....

I'd like you guys to test it *without* the reset hack, as that should not be needed the way we do things...

- Søren
> Yes, the above snippet was from my original version of the patch
> (the first in the audit trail in the PR) where ata_ali_reset() was
> hooked up for all flavours of Acer chips. For testing your previous
> version I just changed the chiprev to be checked for <= 0xc3 which
> then again would only apply to 0xc2 and 0xc3 as ata_ali_reset()
> is only called for ALINEW in your version...
> Anyway, your new version works fine here, thanks!

The patch is also working for me and my Netra X1. Thank you!

Best Regards,
State Changed
From-To: analyzed->closed

Fix has been committed based on the code here and Acer docs.