Bug 62440

Summary: ATA problems
Product: Base System
Reporter: Johan Pettersson <manlix>
Component: kern
Assignee: freebsd-bugs (Nobody) <bugs>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: 5.2-CURRENT   
Hardware: Any   
OS: Any   

Description Johan Pettersson 2004-02-06 16:10:14 UTC
I recompiled my world and kernel today. The sources are from -CURRENT, 5 Feb around 21:00 CET.

I installed the new kernel and rebooted to single-user mode to install the world. While booting, the kernel hangs, and the last messages printed are:

ad4: 152627MB <ST3160023AS> [310101/16/63] at ata2-master UDMA100
SMP: AP CPU #1 Launched!
ad4: TIMEOUT - READ_DMA retrying (2 retries left) LBA=1
ad4: timeout sending command=c8
ad4: error issuing DMA command

I tried setting hw.ata.ata_dma to 0, but then it couldn't mount the root file system with either the old or the new kernel. I am now running my old kernel from 29 Jan. Has anyone else seen this error? My dmesg log from the old kernel follows.

Copyright (c) 1992-2004 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 5.2-CURRENT #0: Thu Jan 29 09:55:30 CET 2004
    root@beard.demonized.net:/usr/obj/usr/src/sys/BEARD
Preloaded elf kernel "/boot/kernel/kernel" at 0xc0794000.
Preloaded elf module "/boot/kernel/vesa.ko" at 0xc079421c.
Preloaded elf module "/boot/kernel/linux.ko" at 0xc07942c8.
Preloaded elf module "/boot/kernel/if_sk.ko" at 0xc0794374.
Preloaded elf module "/boot/kernel/miibus.ko" at 0xc0794420.
Preloaded elf module "/boot/kernel/snd_emu10k1.ko" at 0xc07944cc.
Preloaded elf module "/boot/kernel/snd_pcm.ko" at 0xc079457c.
Preloaded elf module "/boot/kernel/uhid.ko" at 0xc0794628.
Preloaded elf module "/boot/kernel/ums.ko" at 0xc07946d4.
Preloaded elf module "/boot/kernel/acpi.ko" at 0xc079477c.
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 2.40GHz (2398.85-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf29  Stepping = 9
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Hyperthreading: 2 logical CPUs
real memory  = 536018944 (511 MB)
avail memory = 518975488 (494 MB)
ACPI APIC Table: <A M I  OEMAPIC >
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
ioapic0 <Version 2.0> irqs 0-23 on motherboard
Pentium Pro MTRR support enabled
VESA: v3.0, 65536k memory, flags:0x1, mode table:0xc06d19a2 (1000022)
VESA: NVidia
npx0: [FAST]
npx0: <math processor> on motherboard
npx0: INT 16 interface
acpi0: <A M I OEMXSDT> on motherboard
pcibios: BIOS version 2.10
Using $PIR table, 14 entries at 0xc00f5100
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
acpi_cpu0: <CPU> on acpi0
acpi_cpu1: <CPU> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> at device 1.0 on pci0
pcib1: could not get PCI interrupt routing table for \\_SB_.PCI0.P0P1 - AE_NOT_FOUND
pci1: <ACPI PCI bus> on pcib1
pci1: <display, VGA> at device 0.0 (no driver attached)
uhci0: <Intel 82801EB (ICH5) USB controller USB-A> port 0xef00-0xef1f irq 16 at device 29.0 on pci0
usb0: <Intel 82801EB (ICH5) USB controller USB-A> on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: <Intel 82801EB (ICH5) USB controller USB-B> port 0xef20-0xef3f irq 19 at device 29.1 on pci0
usb1: <Intel 82801EB (ICH5) USB controller USB-B> on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
ums0: Logitech USB Mouse, rev 1.10/6.20, addr 2, iclass 3/1
ums0: 3 buttons and Z dir.
uhci2: <Intel 82801EB (ICH5) USB controller USB-C> port 0xef40-0xef5f irq 18 at device 29.2 on pci0
usb2: <Intel 82801EB (ICH5) USB controller USB-C> on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhci3: <Intel 82801EB (ICH5) USB controller USB-D> port 0xef80-0xef9f irq 16 at device 29.3 on pci0
usb3: <Intel 82801EB (ICH5) USB controller USB-D> on uhci3
usb3: USB revision 1.0
uhub3: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
pci0: <serial bus, USB> at device 29.7 (no driver attached)
pcib2: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci2: <ACPI PCI bus> on pcib2
skc0: <3Com 3C940 Gigabit Ethernet> port 0xd800-0xd8ff mem 0xfeafc000-0xfeafffff irq 22 at device 5.0 on pci2
skc0: 3Com Gigabit LOM (3C940)
sk0: <Marvell Semiconductor, Inc. Yukon> on skc0
sk0: Ethernet address: 00:0c:6e:4e:e2:13
miibus0: <MII bus> on sk0
e1000phy0: <Marvell 88E1000 Gigabit PHY> on miibus0
e1000phy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX-FDX, auto
pcm0: <Creative EMU10K1> port 0xdf80-0xdf9f irq 21 at device 13.0 on pci2
pcm0: <SigmaTel STAC9708/11 AC97 Codec>
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH5 UDMA100 controller> port 0xfc00-0xfc0f,0-0x3,0-0x7,0-0x3,0-0x7 at device 31.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata0: [MPSAFE]
ata1: at 0x170 irq 15 on atapci0
ata1: [MPSAFE]
atapci1: <Intel ICH5 SATA150 controller> port 0xef60-0xef6f,0xefa8-0xefab,0xefa0-0xefa7,0xefac-0xefaf,0xefe0-0xefe7 irq 18 at device 31.2 on pci0
atapci1: [MPSAFE]
ata2: at 0xefe0 on atapci1
ata2: [MPSAFE]
ata3: at 0xefa0 on atapci1
ata3: [MPSAFE]
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
acpi_button0: <Power Button> on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0 port 0x3f8-0x3ff irq 4 on acpi0
sio0: type 16550A
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1 port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
fdc0: <Enhanced floppy controller (i82077, NE72065 or clone)> port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ppc0 port 0x778-0x77b,0x378-0x37f irq 7 drq 3 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/9 bytes threshold
ppbus0: <Parallel port bus> on ppc0
pmtimer0 on isa0
orm0: <Option ROM> at iomem 0xc0000-0xcafff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 10.000 msec
acd0: CDRW <LITE-ON LTR-24102B> at ata1-master UDMA33
ad4: 152627MB <ST3160023AS> [310101/16/63] at ata2-master UDMA100
SMP: AP CPU #1 Launched!
Mounting root from ufs:/dev/ad4s1a
Comment 1 Kris Kennaway freebsd_committer freebsd_triage 2004-02-09 04:47:26 UTC
Responsible Changed
From-To: freebsd-bugs->sos

Assign to maintainer
Comment 2 Søren Schmidt freebsd_committer freebsd_triage 2004-02-09 19:07:58 UTC
Responsible Changed
From-To: sos->bugs

This is not an ATA problem AFAICT.
Try building a kernel without APIC (does that work on SMP?), since
that's most likely the reason (i.e. bad interrupt routing).
Comment 3 bill fumerola freebsd_committer freebsd_triage 2004-02-10 01:44:30 UTC
Responsible Changed
From-To: bugs->freebsd-bugs

assign to proper catchall user
Comment 4 alo 2004-03-07 16:40:23 UTC
I decided to report my findings with this (I assume it is the same)
bug.

For me it happens only when I have a SATA drive attached to its
controller. What is more interesting, if I reboot the machine, detach
the SATA drive without switching power off, timeouts and failures keep
appearing (they appear on one particular ATA drive, not the SATA one)
until I make a proper power down/up sequence without the SATA drive
attached.

I have tried with both the 5.2.1 kernel and -CURRENT. APIC is disabled (or
is it?):

/boot/device.hints:

hint.apic.0.disabled="1"

and in the -CURRENT kernel config the apic device is commented out.

Here is the kernel printf:

Copyright (c) 1992-2004 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.2-CURRENT #0: Sat Mar  6 19:56:32 EET 2004
alo@alo99.louko.com:/u4/alo/FreeBSD/sys-src-current/src/sys/i386/compile/ALOIFP2
Preloaded elf kernel "/boot/kernel/kernel" at 0xc0a4a000.
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 2.40GHz (2400.09-MHz 686-class CPU)
Origin = "GenuineIntel"  Id = 0xf27  Stepping = 7
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
real memory  = 1073725440 (1023 MB)
avail memory = 1041203200 (992 MB)
Pentium Pro MTRR support enabled
npx0: [FAST]
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcibios: BIOS version 2.10
Found $PIR table, 11 entries at 0xc00f2320
pcib0: <Host to PCI bridge> at pcibus 0 on motherboard
pci0: <PCI bus> on pcib0
$PIR: 0:29 INTA routed to irq 11
$PIR: ROUTE_INTERRUPT failed.
$PIR: ROUTE_INTERRUPT failed.
$PIR: ROUTE_INTERRUPT failed.
$PIR: ROUTE_INTERRUPT failed.
$PIR: ROUTE_INTERRUPT failed.
agp0: <Intel Generic host to PCI bridge> mem 0xf0000000-0xf3ffffff at device 0.0 on pci0
pcib1: <PCIBIOS PCI-PCI bridge> mem 0xe8000000-0xebffffff at device 1.0 on pci0
pci1: <PCI bus> on pcib1
$PIR: 1:0 INTA routed to irq 11
pci1: <display, VGA> at device 0.0 (no driver attached)
uhci0: <Intel 82801DB (ICH4) USB controller USB-A> port 0xd800-0xd81f irq 11 at device 29.0 on pci0
usb0: <Intel 82801DB (ICH4) USB controller USB-A> on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
ums0: Logitech USB Receiver, rev 1.10/9.10, addr 2, iclass 3/1
ums0: 5 buttons and Z dir.
uhci1: <Intel 82801DB (ICH4) USB controller USB-B> port 0xd400-0xd41f irq 9 at device 29.1 on pci0
usb1: <Intel 82801DB (ICH4) USB controller USB-B> on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: <Intel 82801DB (ICH4) USB controller USB-C> port 0xd000-0xd01f irq 9 at device 29.2 on pci0
usb2: <Intel 82801DB (ICH4) USB controller USB-C> on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
pci0: <serial bus, USB> at device 29.7 (no driver attached)
pcib2: <PCIBIOS PCI-PCI bridge> at device 30.0 on pci0
pci2: <PCI bus> on pcib2
$PIR: 2:2 INTA routed to irq 9
$PIR: 2:3 INTA routed to irq 5
$PIR: 2:4 INTA routed to irq 9
$PIR: 2:5 INTA routed to irq 9
fxp0: <Intel 82550 Pro/100 Ethernet> port 0xb800-0xb83f mem 0xe3000000-0xe301ffff,0xe3800000-0xe3800fff irq 9 at device 2.0 on pci2
fxp0: Ethernet address 00:02:b3:c0:ca:3d
miibus0: <MII bus> on fxp0
inphy0: <i82555 10/100 media interface> on miibus0
inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fwohci0: <Texas Instruments TSB43AB22/A> mem 0xe2000000-0xe2003fff,0xe2800000-0xe28007ff irq 5 at device 3.0 on pci2
fwohci0: OHCI version 1.10 (ROM=1)
fwohci0: No. of Isochronous channel is 4.
fwohci0: EUI64 00:e0:18:00:00:15:a2:bb
fwohci0: Phy 1394a available S400, 2 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0: <IEEE1394(FireWire) bus> on fwohci0
fwe0: <Ethernet over FireWire> on firewire0
if_fwe0: Fake Ethernet address: 02:e0:18:15:a2:bb
sbp0: <SBP-2/SCSI over FireWire> on firewire0
fwohci0: Initiate bus reset
fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode
firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
firewire0: bus manager 0 (me)
atapci0: <SiI 3112 SATA150 controller> port 0xa000-0xa00f,0xa400-0xa403,0xa800-0xa807,0xb000-0xb003,0xb400-0xb407 mem 0xe1800000-0xe18001ff irq 9 at device 4.0 on pci2
atapci0: [MPSAFE]
ata2: at 0xe1800000 on atapci0
ata2: [MPSAFE]
ata3: at 0xe1800000 on atapci0
ata3: [MPSAFE]
bge0: <Broadcom BCM5702 Gigabit Ethernet, ASIC rev. 0x1002> mem 0xe1000000-0xe100ffff irq 9 at device 5.0 on pci2
bge0: Ethernet address: 00:0c:6e:0d:dc:e4
miibus1: <MII bus> on bge0
brgphy0: <BCM5703 10/100/1000baseTX PHY> on miibus1
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci1: <Intel ICH4 UDMA100 controller> port 0xf000-0xf00f,0-0x3,0-0x7,0-0x3,0-0x7 irq 9 at device 31.1 on pci0
ata0: at 0x1f0 irq 14 on atapci1
ata0: [MPSAFE]
ata1: at 0x170 irq 15 on atapci1
ata1: [MPSAFE]
pci0: <multimedia, audio> at device 31.5 (no driver attached)
orm0: <Option ROMs> at iomem 0xd8000-0xd97ff,0xd0000-0xd47ff,0xcc000-0xcffff,0xc0000-0xc8fff on isa0
pmtimer0 on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x64,0x60 on isa0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
fdc0: <Enhanced floppy controller (i82077, NE72065 or clone)> at port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/16 bytes threshold
ppbus0: <Parallel port bus> on ppc0
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
unknown: <PNP0401> can't assign resources (port)
unknown: <PNP0501> can't assign resources (port)
unknown: <PNP0501> can't assign resources (port)
unknown: <PNP0700> can't assign resources (port)
unknown: <PNP0303> can't assign resources (port)
unknown: <PNP0c02> can't assign resources (port)
Timecounter "TSC" frequency 2400093668 Hz quality 800
Timecounters tick every 10.000 msec
ipfw2 initialized, divert enabled, rule-based forwarding enabled, default to deny, logging disabled
IPsec: Initialized Security Association Processing.
IP Filter: v3.4.31 initialized.  Default = block all, Logging = enabled
ad0: 117246MB <Maxtor 6Y120L0> [238216/16/63] at ata0-master UDMA100
ad1: 117800MB <IC35L120AVV207-1> [239340/16/63] at ata0-slave UDMA100
acd0: CDRW <HL-DT-ST RW/DVD GCC-4480B> at ata1-master UDMA33
ad4: 114473MB <ST3120026AS> [232581/16/63] at ata2-master UDMA100
cd0 at ata1 bus 0 target 0 lun 0
cd0: <HL-DT-ST RW/DVD GCC-4480B 1.00> Removable CD-ROM SCSI-0 device 
cd0: 33.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present
Mounting root from ufs:/dev/ad0s1a
pid 521 (httpd), uid 0: exited on signal 11 (core dumped)
drm0: <Matrox G550 (AGP)> mem 0xe4800000-0xe4ffffff,0xe5000000-0xe5003fff,0xe6000000-0xe7ffffff irq 11 at device 0.0 on pci1
info: [drm] AGP at 0xf0000000 64MB
info: [drm] Initialized mga 3.1.0 20021029 on minor 0
drm0: [MPSAFE]
ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=159
ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=159
ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=159
ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=23191327
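
Editorial note, not part of the original comment: when comparing logs like the ones in this report, it can help to tally the ATA timeout messages per device and command. The following is a minimal Python sketch under stated assumptions. The regular expression is inferred only from the message format quoted in this report, and the names `tally_timeouts` and `sample` are hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from lines quoted in this report, for example:
#   ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=159
# This is an assumption; other kernel versions may word the message differently.
TIMEOUT_RE = re.compile(
    r"^(?P<dev>ad\d+): TIMEOUT - (?P<cmd>\w+) retrying "
    r"\((?P<left>\d+) retr(?:y|ies) left\) LBA=(?P<lba>\d+)"
)


def tally_timeouts(lines):
    """Count timeout messages, keyed by (device, command)."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.match(line.strip())
        if match:
            counts[(match.group("dev"), match.group("cmd"))] += 1
    return counts


# Sample input copied from the two logs in this report.
sample = [
    "ad4: TIMEOUT - READ_DMA retrying (2 retries left) LBA=1",
    "ad4: timeout sending command=c8",
    "ad4: error issuing DMA command",
    "ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=159",
    "ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=159",
    "ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=159",
    "ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=23191327",
]

if __name__ == "__main__":
    for (dev, cmd), n in sorted(tally_timeouts(sample).items()):
        print(f"{dev} {cmd}: {n} timeout(s)")
```

In practice the input could be the lines of a saved dmesg file instead of the `sample` list. A tally that concentrates on a single drive, as in comment 4, points more toward that drive or its controller than toward a general driver problem, although it does not prove either.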
Comment 5 alo 2004-03-18 21:23:53 UTC
It seems that my problems were caused by a marginal (read: faulty,
SMART-errors etc.) Maxtor disk and/or SiI 3112 SATA150 controller. Now
I use a HighPoint SATA controller without any problems.
Comment 6 Mark Linimon freebsd_committer freebsd_triage 2004-06-30 04:42:18 UTC
State Changed
From-To: open->closed

Submitter notes that problem was due to faulty hardware.