After installing 2.2.1 my system would crash when doing tape backups. This seems to happen mostly when it is closing the tape device. Once the system did not crash and I got saved this dmesg log. I do beleive that other crashes show similar messags on the console before the reboot. Copyright (c) 1992-1996 FreeBSD Inc. Copyright (c) 1982, 1986, 1989, 1991, 1993 The Regents of the University of California. All rights reserved. FreeBSD 2.2.1-RELEASE #0: Mon Jun 23 11:13:15 PDT 1997 rootz@raindrop.seaslug.org:/usr/src/sys/compile/RAINDROP CPU: Pentium (99.87-MHz 586-class CPU) Origin = "GenuineIntel" Id = 0x525 Stepping=5 Features=0x3bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8,APIC> real memory = 33554432 (32768K bytes) avail memory = 31145984 (30416K bytes) Probing for devices on PCI bus 0: chip0 <Intel 82434NX (Neptune) PCI cache memory controller> rev 17 on pci0:0 chip1 <Intel 82375EB PCI-EISA bridge> rev 3 on pci0:2:0 vga0 <VGA-compatible display device> rev 3 on pci0:14 bt0 <Buslogic 946 SCSI host adapter> rev 0 int a irq 11 on pci0:15 bt0: Bt946C/ 0-(32bit) bus bt0: reading board settings, busmastering, int=11 bt0: version 4.28D, async only, parity, 32 mbxs, 32 ccbs bt0: targ 0 async bt0: targ 3 async bt0: targ 5 async bt0: Using Strict Round robin scheme bt0 waiting for scsi devices to settle (bt0:0:0): "SEAGATE ST31230N 0594" type 0 fixed SCSI 2 sd0(bt0:0:0): Direct-Access 1010MB (2069860 512 byte sectors) (bt0:3:0): "SONY CD-ROM CDU-76S 1.1c" type 5 removable SCSI 2 cd0(bt0:3:0): CD-ROM can't get the size (bt0:5:0): "WANGTEK 6130-FS 3.04" type 1 removable SCSI 2 st0(bt0:5:0): Sequential-Access density code 0x13, drive empty Probing for devices on the ISA bus: sc0 at 0x60-0x6f irq 1 on motherboard sc0: VGA color <16 virtual consoles, flags=0x0> sio0 at 0x3f8-0x3ff irq 4 on isa sio0: type 16550A sio1 at 0x2f8-0x2ff irq 3 on isa sio1: type 16550A lpt0 at 0x378-0x37f irq 7 on isa lpt0: Interrupt-driven port lp0: TCP/IP capable interface mse0 at 0x23c irq 5 on isa fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa fdc0: NEC 72065B fd0: 1.44MB 3.5in bt: unit number (1) too high bt1 not found at 0x330 1 3C5x9 board(s) on ISA found at 0x300 ep0 at 0x300-0x30f irq 10 on isa ep0: aui/utp/bnc[*BNC*] address 00:a0:24:2a:3c:77 npx0 on motherboard npx0: INT 16 interface bt0: Try to abort bt0: Abort Operation has timed out st0(bt0:5:0): ILLEGAL REQUEST asc:26,0 Invalid field in parameter list sks:8f,0 Reading from the tape does not seem to cause the problem. Looks similar to problem reports kern/2980 and kern/3671 Tom Unger unger@raindrop.seaslug.org Fix: No known work around. I have to plan on having my system crash after every backup. How-To-Repeat: Pretty much anytime I do a backup I see this problem.
Responsible Changed From-To: freebsd-bugs->gibbs I'll probably end up fixing this.
I'm seeing the same problem using FreeBSD 2.2.5. Here's the portion of my dmesg where the drives are detected. Note that I'm using both the wide and narrow SCSI chain (which shouldn't matter). bt0 <Buslogic 946 SCSI host adapter> rev 6 int a irq 15 on pci0:9 bt0: Bt956C/ 0-(32bit) bus bt0: reading board settings, busmastering, int=15 bt0: version 4.28A, async only, no parity, 32 mbxs, 32 ccbs bt0: targ 0 async bt0: targ 1 async bt0: targ 6 async bt0: Using Strict Round robin scheme bt0 waiting for scsi devices to settle (bt0:0:0): "SEAGATE ST31230W 0510" type 0 fixed SCSI 2 sd0(bt0:0:0): Direct-Access 1010MB (2069860 512 byte sectors) (bt0:1:0): "HP HP35480A A" type 1 removable SCSI 2 st0(bt0:1:0): Sequential-Access density code 0x13, drive empty (bt0:6:0): "SEAGATE ST34371N 0280" type 0 fixed SCSI 2 sd1(bt0:6:0): Direct-Access 4148MB (8496960 512 byte sectors) Exactly the same fault - the controller locks (SCSI chain dead, get neato console messages like "bt0: Abort Operation has timed out" and "bt0: not responding to commands") and then have to reboot the machine. Running processes that don't need to access the disk keep working. Because disk access is locked it's not logging these errors - all I see in the message log is "bt0: Try to abort". I did not have this problem when using an Adaptec 2940 and a DAT drive, it's only manifested since I went to the Buslogic card. Are there any solutions besides "try playing with the timeouts in the st driver" or going to a different OS? Thanks, Bryan
I've spent a little time working on this problem myself. I got the kernel sources and went through st.c and increased command timeouts by a factor of 4. Then I enabled debugging. Here is a trace from a 'dump 0fB /dev/rst0 9999999 /' command. Since the computer usually crashes I'm running these test in single use mode with all file systems mounted read only. [ TIMEOUTFACTOR = 4, scsi_prevent not changed.] ... ------------------------------ 000: 6f 72 74 09 31 39 39 39 2f 75 64 70 20 20 20 23 016: 63 69 73 63 6f 20 69 64 65 6e 74 69 66 69 63 61 032: 74 69 6f 6e 20 70 6f 72 74 0a 63 61 6c 6c 62 6f 048: 6f 6b 09 32 30 30 30 2f 74 63 70 0a 63 61 6c 6c ------------------------------ st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): ststrategy st0(bt0:5:0): 10240 bytes @ blk54400 st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: a,0,0,28,0,0-[10240 bytes] ------------------------------ 000: 05 00 00 00 b1 05 ce 34 00 00 00 00 01 00 00 00 016: 40 6a 00 00 ff 0e 00 00 6c ea 00 00 03 e6 04 6c 032: a4 81 01 00 00 00 00 00 9d ef 00 00 00 00 00 00 048: 80 03 ce 34 00 00 00 00 2b a1 8b 34 00 00 00 00 ------------------------------ st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): stclose: Closing device st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 10,0,0,0,1,0-[0 bytes] st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd ccb:f07a4e00 op:0 cmdlen:6 senlen:32 datlen:0 hstat:0 tstat:0 flags:1 bt0: Try to abort ccb:f07a4e00 op:0 cmdlen:6 senlen:32 datlen:0 hstat:0 tstat:0 flags:2 bt0: Abort Operation has timed out st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 1e,0,0,0,0,0-[0 bytes] st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): unmounting st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 10,0,0,0,1,0-[0 bytes] st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 1,0,0,0,0,0-[0 bytes] st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 0,0,0,0,0,0-[0 bytes] st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 0,0,0,0,0,0-[0 bytes] st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 0,0,0,0,0,0-[0 bytes] st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): mounting st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 1b,0,0,0,1,0-[0 bytes] st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 0,0,0,0,0,0-[0 bytes] st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 5,0,0,0,0,0-[6 bytes] ------------------------------ 000: 00 ff ff ff 00 01 ------------------------------ st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 1a,0,0,0,c,0-[12 bytes] ------------------------------ 000: 0b 00 10 08 13 00 00 00 00 00 00 00 ------------------------------ st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): starting block mode decision st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 15,0,0,0,c,0-[12 bytes] ------------------------------ 000: 00 00 10 08 13 00 00 00 00 00 00 00 ------------------------------ st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 1e,0,0,0,1,0-[0 bytes] st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): Open complete st0(bt0:5:0): stopen: dev=0xe00 (unit 0) result 0 st0(bt0:5:0): stclose: Closing device st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 1e,0,0,0,0,0-[0 bytes] st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): unmounting st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 1,0,0,0,0,0-[0 bytes] st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart Note that as the system closes the device there is a timeout. I believe this is the error that I've been seeing. Reading kern/1467 I found the suggestion that the timeout in st_prevent was too low. I increased it from 5000 to 300000 and ran the dump again. Success!. So I tried to do 2 dumps onto the same tape using /dev/nrst0. On the second dump I ran into problems. On the first test the scsi device seemed to stop responding to commands. The system ended up in a loop where dump was receiving an error, trying a new operation, receiving an erro, trying a new op... so the second test I aborted dump at the first sign of trouble. Here is the log: 09 016: 31 35 30 33 2f 74 63 70 20 20 20 20 23 44 61 74 032: 61 62 65 61 6d 0a 69 6d 74 63 2d 6d 63 73 09 31 048: 35 30 33 2f 75 64 70 20 20 20 20 23 44 61 74 61 ------------------------------ st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): ststrategy st0(bt0:5:0): 10240 bytes @ blk54420 st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: a,0,0,28,0,0-[10240 bytes] ------------------------------ 000: 20 20 23 56 41 54 20 64 65 66 61 75 6c 74 20 63 016: 6f 6e 74 72 6f 6c 0a 75 64 74 5f 6f 73 09 09 33 032: 39 30 30 2f 74 63 70 20 20 20 23 55 6e 69 64 61 048: 74 61 20 55 44 54 20 4f 53 0a 75 64 74 5f 6f 73 ------------------------------ st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): stclose: Closing device st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 10,0,0,0,1,0-[0 bytes] st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 1e,0,0,0,0,0-[0 bytes] st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 0,0,0,0,0,0-[0 bytes] st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 0,0,0,0,0,0-[0 bytes] st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 1e,0,0,0,1,0-[0 bytes] st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): Open complete st0(bt0:5:0): stopen: dev=0xe01 (unit 0) result 0 st0(bt0:5:0): stclose: Closing device st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 1e,0,0,0,0,0-[0 bytes] st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 0,0,0,0,0,0-[0 bytes] st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 0,0,0,0,0,0-[0 bytes] st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 1e,0,0,0,1,0-[0 bytes] st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): Open complete st0(bt0:5:0): stopen: dev=0xe01 (unit 0) result 0 st0(bt0:5:0): ststrategy st0(bt0:5:0): 10240 bytes @ blk0 st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: a,0,0,28,0,0-[10240 bytes] ------------------------------ 000: 01 00 00 00 2b 14 ce 34 00 00 00 00 01 00 00 00 016: 00 00 00 00 00 00 00 00 6c ea 00 00 d3 5c 33 4c 032: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 048: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ------------------------------ st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): ststrategy st0(bt0:5:0): 10240 bytes @ blk20 st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: a,0,0,28,0,0-[10240 bytes] ------------------------------ 000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 016: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 032: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 048: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ------------------------------ st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): ststrategy st0(bt0:5:0): 10240 bytes @ blk40 st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: a,0,0,28,0,0-[10240 bytes] ------------------------------ 000: 02 00 00 00 2b 14 ce 34 00 00 00 00 01 00 00 00 016: 14 00 00 00 08 00 00 00 6c ea 00 00 4b e7 44 7d 032: e8 41 02 00 00 00 00 00 00 02 00 00 00 00 00 00 048: 4c b1 cd 34 00 00 00 00 be 71 8b 34 00 00 00 00 ------------------------------ st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): ststrategy st0(bt0:5:0): 10240 bytes @ blk60 st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: a,0,0,28,0,0-[10240 bytes] ------------------------------ 000: 0d 00 00 00 0c 00 04 01 2e 00 00 00 02 00 00 00 016: 0c 00 04 02 2e 2e 00 00 0e 00 00 00 18 00 08 0f 032: 6c 6f 63 61 74 65 2e 64 61 74 61 62 61 73 65 00 048: 3d 00 00 00 14 00 08 0a 6d 6f 75 6e 74 64 62 74 ------------------------------ st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): ststrategy st0(bt0:5:0): 10240 bytes @ blk80 st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: a,0,0,28,0,0-[10240 bytes] ------------------------------ 000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 016: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 032: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 048: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ------------------------------ st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): ststrategy st0(bt0:5:0): 10240 bytes @ blk100 st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: a,0,0,28,0,0-[10240 bytes] ------------------------------ 000: 02 00 00 00 2b 14 ce 34 00 00 00 00 01 00 00 00 016: 32 00 00 00 1c 00 00 00 6c ea 00 00 15 63 7b 7d 032: ed 41 0a 00 00 00 00 00 00 02 00 00 00 00 00 00 048: 3d ee cd 34 00 00 00 00 8e b8 4c 34 00 00 00 00 ------------------------------ st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): ststrategy st0(bt0:5:0): 10240 bytes @ blk120 st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: a,0,0,28,0,0-[10240 bytes] ------------------------------ 000: 1f 00 00 00 0c 00 04 01 2e 00 00 00 1c 00 00 00 016: f4 01 04 02 2e 2e 00 00 da 02 00 00 14 00 08 0a 032: 71 66 49 41 41 30 30 33 31 34 00 f2 db 02 00 00 048: d4 01 08 0a 78 66 49 41 41 30 30 33 31 34 00 f2 ------------------------------ st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): ststrategy st0(bt0:5:0): 10240 bytes @ blk140 st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: a,0,0,28,0,0-[10240 bytes] ------------------------------ 000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 016: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 032: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 048: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ------------------------------ st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): ststrategy st0(bt0:5:0): 10240 bytes @ blk160 st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd ccb:f07a7600 op:2 cmdlen:10 senlen:32 datlen:24 hstat:0 tstat:0 flags:1 ccb:f0773e00 op:2 cmdlen:6 senlen:32 datlen:16 hstat:0 tstat:0 flags:1 bt0: Try to abort ccb:f07a7600 op:2 cmdlen:10 senlen:32 datlen:24 hstat:2a tstat:0 flags:1 ccb:f0773e00 op:2 cmdlen:6 senlen:32 datlen:16 hstat:0 tstat:0 flags:1 bt0: Try to abort [ it is somewhere in here that I interupt 'dump' ] st0(bt0:5:0): scsi_done st0(bt0:5:0): command: a,0,0,28,0,0-[10240 bytes] ------------------------------ 000: 02 00 00 00 2b 14 ce 34 00 00 00 00 01 00 00 00 016: 50 00 00 00 26 00 00 00 6c ea 00 00 01 42 83 7d 032: fd 41 02 00 00 00 00 00 00 02 00 00 00 00 00 00 048: 81 0e ce 34 00 00 00 00 8e b8 4c 34 00 00 00 00 ------------------------------ st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): stclose: Closing device st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 10,0,0,0,1,0-[0 bytes] code70 valid0 seg0 key4 ili0 eom0 fmark0 info: 0 0 0 0 followed by 10 extra bytes extra: 0 0 0 0 0 0 0 0 0 0 st0(bt0:5:0): calling private err_handler() st0(bt0:5:0): private err_handler() returned -1 st0(bt0:5:0): HARDWARE FAILURE asc:44,0 Internal target failure st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart st0(bt0:5:0): scsi_cmd st0(bt0:5:0): scsi_done st0(bt0:5:0): command: 1e,0,0,0,0,0-[0 bytes] code70 valid0 seg0 key4 ili0 eom0 fmark0 info: 0 0 0 0 followed by 10 extra bytes extra: 0 0 0 0 0 0 0 0 0 0 st0(bt0:5:0): calling private err_handler() st0(bt0:5:0): private err_handler() returned -1 st0(bt0:5:0): calling private start() st0(bt0:5:0): ststart So, I don't know. I don't know much about scsi and to my untrained eye it is not at all clear what is going on. I'll keep working on this but I sure could use some suggestions. Thanks, Tom Unger unger@raindrop.seaslug.org
Seems a similar problem is described for exabyte streamers: http://www.exabyte.com/suppserv/techsupp/8mm/misc/in0181.html (it's a page for using those streamers under linux, but I use freebsd, and the instructions seems quite compitable :-) /Peter
State Changed From-To: open->closed Both the tape and Buslogic MultiMaster driver are new works in 3.0R. Please retest these scenarios on this equipment and file new bug reports if your problems persist.