| Summary: | Second SCSI Adapter (Adaptec 39160) still hanging system | ||
|---|---|---|---|
| Product: | Base System | Reporter: | James F. Hranicky <jfh> |
| Component: | kern | Assignee: | freebsd-bugs (Nobody) <bugs> |
| Status: | Closed FIXED | ||
| Severity: | Affects Only Me | ||
| Priority: | Normal | ||
| Version: | 4.2-STABLE | ||
| Hardware: | Any | ||
| OS: | Any | ||
I've discovered a few things:
- if I disable either of the internal SCSI busses, the machine
will boot normally. Is there something interesting about 4 SCSI
busses in one machine?
- booting from -CURRENT (20001218) floppies makes it through the
boot process until after the "Waiting 15 seconds for SCSI devices
to settle" line, then gives the same timeout errors as a 4.2-STABLE
SMP kernel. In other words, it doesn't hang after probing the parallel
port like the non-SMP 4.2 kernel does.
- enabling CAMDEBUG in the 4.2-STABLE SMP kernel has given me the
following messages:
-----------------------------------------------------------------------
[ trimmed ]
(probe14:ahc3:0:15:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe14:ahc3:0:15:0): ahc_action
(probe0:ahc3:0:0:0): ahc_action
(probe1:ahc3:0:1:0): SCB 0x9 - timed out while idle, SEQADDR == 0x3e
STACK == 0x0, 0x0, 0x0, 0x1
SXFRCTL0 == 0x80
SCB count = 20
QINFIFO entries: 9 8 7 6 5 4 3 2 1 0 19 18 17 16 15
Waiting Queue entries:
Disconnected Queue entries:
QOUTFIFO entries:
Sequencer Free SCB List: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
19 20 21 22 23 24 25 26 27 28 29 30 31
Pending list: 15 16 17 18 19 0 1 2 3 4 5 6 7 8 9
Kernel Free SCB list: 13 12 11 10
Untagged Q(0): 15
Untagged Q(1): 9
Untagged Q(2): 8
Untagged Q(3): 7
Untagged Q(4): 6
Untagged Q(5): 5
Untagged Q(6): 4
Untagged Q(8): 3
Untagged Q(9): 2
Untagged Q(10): 1
Untagged Q(11): 0
Untagged Q(12): 19
Untagged Q(13): 18
Untagged Q(14): 17
Untagged Q(15): 16
sg[0] - Addr 0xd6eb284 : Length 36
(probe1:ahc3:0:1:0): SCB 9: Immediate reset. Flags = 0x6040
(probe1:ahc3:0:1:0): ahc_done - scb 9
(probe1:ahc3:0:1:0): no longer in timeout, status = 34b
(probe1:ahc3:0:1:0): xpt_done
(probe2:ahc3:0:2:0): ahc_done - scb 8
(probe2:ahc3:0:2:0): xpt_done
-----------------------------------------------------------------------
Sometimes the timeout is in xpt_release_path as well as ahc_action:
-------
[...]
(xpt0:ahc3:0:1:0): xpt_release_path
(probe2:ahc3:0:2:0): SCB 0xe - timed out while idle, SEQADDR == 0x3e
[...]
-------
- I tried #defining AHC_DEBUG in /sys/dev/aic7xxx/aic7xxx_freebsd.c,
but I got several compiler errors, a couple I could fix (with an
extern int declaration), and some I couldn't:
o in function /sys/dev/aic7xxx/aic7xxx.c:ahc_calc_residual, there is
a reference to an "ahc" variable within the #define AHC_DEBUG block
that isn't defined in the function itself
At this point, I do have a workaround: disable channel A on the internal
SCSI bus and hook the boot drive to the new card. This should also help
prevent the renumbering of my boot disk as I add drives to the external
busses (I suppose this happens because it gets probed first (why?)). For
a while, though, (probably until the second week of Jan) I do have time
to help someone do some debugging on this problem if anyone's interested.
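For anyone comparing several of these CAMDEBUG captures, the timeout lines can be pulled out mechanically. What follows is only a minimal sketch: the function names (parse_timeouts, timeouts_per_controller) are made up for illustration and are not part of any FreeBSD tool, and the pattern assumes the exact "timed out while idle" line format quoted in the messages above.

```python
import re
from collections import Counter

# Matches lines such as (format assumed from the excerpts in this PR):
# (probe1:ahc3:0:1:0): SCB 0x9 - timed out while idle, SEQADDR == 0x3e
TIMEOUT_RE = re.compile(
    r"\((?P<periph>\w+):(?P<ctrl>\w+):(?P<bus>\d+):(?P<target>\d+):(?P<lun>\d+)\): "
    r"SCB (?P<scb>0x[0-9a-fA-F]+) - timed out (?P<state>.+?), "
    r"SEQADDR == (?P<seqaddr>0x[0-9a-fA-F]+)"
)


def parse_timeouts(log_text):
    """Return one dict per SCB timeout line found in log_text."""
    events = []
    for m in TIMEOUT_RE.finditer(log_text):
        events.append({
            "controller": m["ctrl"],
            "target": int(m["target"]),
            "lun": int(m["lun"]),
            "scb": int(m["scb"], 16),
            "seqaddr": int(m["seqaddr"], 16),
        })
    return events


def timeouts_per_controller(events):
    """Count timeouts per controller name (e.g. ahc3)."""
    return Counter(e["controller"] for e in events)


# Two lines copied from the messages quoted in this PR.
sample = """\
(probe1:ahc3:0:1:0): SCB 0x9 - timed out while idle, SEQADDR == 0x3e
(probe2:ahc3:0:2:0): SCB 0xe - timed out while idle, SEQADDR == 0x3e
"""
events = parse_timeouts(sample)
print(timeouts_per_controller(events))  # prints Counter({'ahc3': 2})
```

Run against a full capture, this makes it easy to see whether every timeout lands on the same controller and sequencer address, as it does in the excerpts here (ahc3, SEQADDR 0x3e).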
>>Synopsis: Second SCSI Adapter (Adaptec 39160) still hanging system
...
>eisa0: <EISA bus> on motherboard
>eisa0: unknown card @@@0000 (0x00000000) at slot 2

Disable eisa support in your kernel config and your hang will likely go
away. The problem with EISA probes will likely be fixed in -current soon.
I'm not sure if those changes will be ported back to -stable as they may
be somewhat invasive.
--
Justin

Disabling eisa support appears to have fixed the problem. This PR can be
closed out. Thanks very much for the help.
Jim

State Changed
From-To: open->closed
Problem fixed by disabling EISA support, so closed at submitter's request.

[ Note: I sent a message to freebsd-questions earlier, but have since done
more work, and feel a PR is necessary --jfh ]

Installation of second SCSI adapter (Adaptec 39160) hangs the system.
Here's a short synopsis of things tried:

Initial installation:
- upon installing the new card and booting, it was discovered that the
system was hung part of the way through boot, after the probe of the SCSI
adapters, but before the probe of the disks. Upon further investigation
it was discovered that the first adapter was sharing IRQ 11 for both
channels, and the second adapter was taking IRQs 5 and 11 for its
channels. At first, an IRQ conflict was suspected. The boot process halts
after the probe of the parallel port:
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
At this point, scroll lock and serial break have no effect, while
<Ctrl><Alt><Del> (after plugging the keyboard back in) does boot the
machine.
- After searching the archives, I discovered that FreeBSD could use IRQs
higher than 15 by enabling APIC_IO (which appears to require SMP
support), so I compiled an SMP kernel for the machine.
Here are the (edited/annotated slightly) boot messages for that kernel:
-----------------------------------------------------------------
FreeBSD 4.2-STABLE #0: Fri Dec 15 13:29:02 EST 2000
root@palm.cise.ufl.edu:/private/freebsd-src/src/sys/compile/CISEKERN.SMP
Timecounter "i8254" frequency 1193182 Hz
CPU: Pentium III/Pentium III Xeon/Celeron (646.67-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x683 Stepping = 3
Features=0x387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE>
real memory = 268369920 (262080K bytes)
avail memory = 256520192 (250508K bytes)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
FreeBSD/SMP: Multiprocessor motherboard
cpu0 (BSP): apic id: 1, version: 0x00040011, at 0xfee00000
io0 (APIC): apic id: 0, version: 0x00170011, at 0xfec00000
Preloaded elf kernel "kernel.smp" at 0xc0452000.
Pentium Pro MTRR support enabled
md0: Malloc disk
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Intel 82443GX host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
pcib2: <Intel 82443GX (440 GX) PCI-PCI (AGP) bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib2
pcib3: <PCI to PCI bridge (vendor=1011 device=0023)> at device 15.0 on pci1
pci2: <PCI bus> on pcib3
[ Here are the lines for the new SCSI card...this card takes ahc[0-1]
when installed, normally, the internal card has ahc[0-1] ]
ahc0: <Adaptec 3960D Ultra160 SCSI adapter> port 0x2000-0x20ff mem 0xf4100000-0xf4100fff irq 18 at device 11.0 on pci0
aic7899: Wide Channel A, SCSI Id=7, 32/255 SCBs
ahc1: <Adaptec 3960D Ultra160 SCSI adapter> port 0x2400-0x24ff mem 0xf4101000-0xf4101fff irq 23 at device 11.1 on pci0
aic7899: Wide Channel B, SCSI Id=7, 32/255 SCBs
ahc2: <Adaptec aic7896/97 Ultra2 SCSI adapter> port 0x2800-0x28ff mem 0xf4102000-0xf4102fff irq 19 at device 12.0 on pci0
aic7896/97: Wide Channel A, SCSI Id=7, 32/255 SCBs
ahc3: <Adaptec aic7896/97 Ultra2 SCSI adapter> port 0x2c00-0x2cff mem 0xf4103000-0xf4103fff irq 19 at device 12.1 on pci0
aic7896/97: Wide Channel B, SCSI Id=7, 32/255 SCBs
fxp0: <Intel Pro 10/100B/100+ Ethernet> port 0x3000-0x303f mem 0xf4000000-0xf40fffff,0xf4104000-0xf4104fff irq 21 at device 14.0 on pci0
fxp0: Ethernet address 00:d0:b7:89:0e:24
isab0: <Intel 82371AB PCI to ISA bridge> at device 18.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel PIIX4 ATA33 controller> port 0x3060-0x306f at device 18.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pci0: <Intel 82371AB/EB (PIIX4) USB controller> at 18.2 irq 21
Timecounter "PIIX" frequency 3579545 Hz
chip1: <Intel 82371AB Power management controller> port 0x1040-0x104f at device 18.3 on pci0
pci0: <Cirrus Logic GD5480 SVGA controller> at 20.0
pcib1: <Intel 82443GX host to AGP bridge> on motherboard
pci3: <PCI bus> on pcib1
eisa0: <EISA bus> on motherboard
eisa0: unknown card @@@0000 (0x00000000) at slot 2
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> on isa0
sc0: VGA <16 virtual consoles, flags=0x0>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A, console
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
APIC_IO: Testing 8254 interrupt delivery
APIC_IO: routing 8254 via IOAPIC #0 intpin 2
IPsec: Initialized Security Association Processing.
ata0-slave: ata_command: timeout waiting for intr
ata0-slave: identify failed
acd0: CDROM <CD-540E> at ata0-master using PIO4
Waiting 5 seconds for SCSI devices to settle
[ The system hangs here for 60-90 seconds, then the following
messages show up ]
(probe45:ahc3:0:0:0): SCB 0x9 - timed out while idle, SEQADDR == 0x3e
[ N.B. : this is the B channel of the internal SCSI card, which works
fine when the second SCSI card is pulled out again ]
STACK == 0x0, 0x0, 0x0, 0x1
SXFRCTL0 == 0x80
SCB count = 20
QINFIFO entries: 9 8 7 6 5 4 3 2 1 0 19 18 17 16 15
Waiting Queue entries:
Disconnected Queue entries:
QOUTFIFO entries:
Sequencer Free SCB List: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Pending list: 15 16 17 18 19 0 1 2 3 4 5 6 7 8 9
Kernel Free SCB list: 13 12 11 10
Untagged Q(0): 9
Untagged Q(1): 8
Untagged Q(2): 7
Untagged Q(3): 6
Untagged Q(4): 18
Untagged Q(5): 17
Untagged Q(6): 5
Untagged Q(8): 4
Untagged Q(9): 16
Untagged Q(10): 3
Untagged Q(11): 2
Untagged Q(12): 15
Untagged Q(13): 1
Untagged Q(14): 0
Untagged Q(15): 19
sg[0] - Addr 0xd9b0684 : Length 36
(probe45:ahc3:0:0:0): SCB 9: Immediate reset. Flags = 0x6040
(probe45:ahc3:0:0:0): no longer in timeout, status = 34b
ahc3: Issued Channel A Bus Reset. 15 SCBs aborted
(probe45:ahc3:0:0:0): SCB 0xe - timed out while idle, SEQADDR == 0x3e
STACK == 0x0, 0x0, 0x1, 0x1
SXFRCTL0 == 0x80
SCB count = 20
QINFIFO entries: 14 15 16 17 18 19 0 1 2 3 4 5 6 7 8
Waiting Queue entries:
Disconnected Queue entries:
QOUTFIFO entries:
Sequencer Free SCB List: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Pending list: 8 7 6 5 4 3 2 1 0 19 18 17 16 15 14
Kernel Free SCB list: 13 12 11 10
Untagged Q(0): 14
Untagged Q(1): 15
Untagged Q(2): 16
Untagged Q(3): 17
Untagged Q(4): 5
Untagged Q(5): 6
Untagged Q(6): 18
Untagged Q(8): 19
Untagged Q(9): 7
Untagged Q(10): 0
Untagged Q(11): 1
Untagged Q(12): 8
Untagged Q(13): 2
Untagged Q(14): 3
Untagged Q(15): 4
sg[0] - Addr 0xd9b0684 : Length 36
(probe45:ahc3:0:0:0): SCB 14: Immediate reset. Flags = 0x6040
(probe45:ahc3:0:0:0): no longer in timeout, status = 34b
ahc3: Issued Channel A Bus Reset. 15 SCBs aborted
-----------------------------------------------------------------
- These probe/timeout messages continue on at this point. Interestingly
enough, scroll-lock works at this point, but serial break doesn't.
- As another data point, I disabled the parallel port in the BIOS,
compiled a new (non-smp) kernel without parallel support, and used the
L440GX+ CDROM to set the internal SCSI card's IRQ to 7, then booted. The
system hung after the probes of the serial ports. It's very possible this
is the same place it was hanging before, as the parallel port probes were
obviously absent.
- What convinced me that this is a FreeBSD problem was booting off a
Linux emergency floppy, and watching it probe the SCSI cards and then the
drives themselves (which FBSD never got to with the second card in). I
was both relieved and disturbed at the same time.

Fix: I may try debugging the kernel, though it's not something I've ever
done, and I don't really know what I'm looking for.
Getting this working is a pretty big deal, and at this point I'm willing
to try just about anything.

How-To-Repeat: I'm not completely sure, but getting the same hardware
setup and trying 4.2 STABLE might work :->
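
Since an IRQ conflict was the first suspect in this report, a quick way to tabulate IRQ assignments from boot messages may help with similar triage. This is only a sketch under stated assumptions: the helper names (irq_map, shared_irqs) are hypothetical, and the pattern assumes each device's probe message sits on a single line in the "devN: ... irq N ..." form shown in the boot messages above. Shared PCI interrupts are not in themselves a fault, and in this PR the hang went away when EISA support was disabled, so the output is a starting point rather than a diagnosis.

```python
import re
from collections import defaultdict

# Device name (letters followed by a unit number), optional colon, then
# "irq N" somewhere later on the same line. Format assumed from this PR.
IRQ_RE = re.compile(r"^([a-z]+\d+):? .*\birq (\d+)\b")


def irq_map(dmesg_lines):
    """Map each IRQ number to the devices whose probe line mentions it."""
    by_irq = defaultdict(list)
    for line in dmesg_lines:
        m = IRQ_RE.match(line.strip())
        if m:
            by_irq[int(m.group(2))].append(m.group(1))
    return dict(by_irq)


def shared_irqs(by_irq):
    """Keep only the IRQs claimed by more than one device."""
    return {irq: devs for irq, devs in by_irq.items() if len(devs) > 1}


# Lines copied from the SMP kernel boot messages in this PR.
sample = [
    "ahc0: <Adaptec 3960D Ultra160 SCSI adapter> port 0x2000-0x20ff mem 0xf4100000-0xf4100fff irq 18 at device 11.0 on pci0",
    "ahc1: <Adaptec 3960D Ultra160 SCSI adapter> port 0x2400-0x24ff mem 0xf4101000-0xf4101fff irq 23 at device 11.1 on pci0",
    "ahc2: <Adaptec aic7896/97 Ultra2 SCSI adapter> port 0x2800-0x28ff mem 0xf4102000-0xf4102fff irq 19 at device 12.0 on pci0",
    "ahc3: <Adaptec aic7896/97 Ultra2 SCSI adapter> port 0x2c00-0x2cff mem 0xf4103000-0xf4103fff irq 19 at device 12.1 on pci0",
    "eisa0: <EISA bus> on motherboard",
]
by_irq = irq_map(sample)
shared = shared_irqs(by_irq)
print(shared)  # prints {19: ['ahc2', 'ahc3']}
```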