Hi! Starting from this thread: https://forums.freenas.org/index.php?threads/a-long-journey-to-solve-cam-status-command-timeout-issues.59549/ I had hard times troubleshooting this issue on FreeNAS with 4 x ST8000AS0002 Drives. It seems that drives increase latency over time and exceed the AHCI (hardcoded?) 31000 msec timeout. It will be great if FreeBSD could support these new SMR drives without disabling cache flushing on ZFS. Thanks, Fabio
Same problem with sparc64. I have just upgrade my Sun Ultra2 from Free-BSD 10.1 to 11.1 and I got some error messages and the running commands freeze. # dmesg Copyright (c) 1992-2017 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. FreeBSD 11.1-RELEASE #0 r321309: Fri Jul 21 08:46:54 UTC 2017 root@releng2.nyi.freebsd.org:/usr/obj/sparc64.sparc64/usr/src/sys/GENERIC sparc64 gcc version 4.2.1 20070831 patched [FreeBSD] real memory = 1073741824 (1024 MB) avail memory = 1030152192 (982 MB) cpu0: Sun Microsystems UltraSparc-II Processor (296.01 MHz CPU) cpu1: Sun Microsystems UltraSparc-II Processor (296.01 MHz CPU) FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs random: unblocking device. random: entropy device external interface kbd0 at kbdmux0 nexus0: <Open Firmware Nexus device> sbus0: <U2S UPA-SBus bridge> mem 0x1fe00000000-0x1fe00007fff irq 2036,2037,2038,2021,2026,2039 on nexus0 sbus0: clock 25.000 MHz sbus0: DVMA map: 0xfc000000 to 0xffffffff 8192 entries, streaming buffer sbus0: [GIANT-LOCKED] sbus0: [GIANT-LOCKED] initializing counter-timer Timecounter "sbus0" frequency 1000000 Hz quality 100 auxio0: <Sun Auxiliary I/O> mem 0x1900000 on sbus0 pcm0: <Sun Audiocs> mem 0xc000000-0xc0001ff irq 2020 on sbus0 pcm0: <CS4231A Codec Id. 10> sbus0: <flashprom> mem 0-0x7ffff,0x1380000-0x13fffff type unknown (no driver attached) sbus0: <SUNW,fdtwo> mem 0x1400000-0x1400007 irq 2025 type block (no driver attached) eeprom0: <EEPROM/clock> mem 0x1200000-0x1201fff on sbus0 eeprom0: model mk48t59 scc0: <Zilog Z8530 dual channel SCC> mem 0x1100000-0x1100003 irq 2024 on sbus0 uart0: <z8530, channel A> on scc0 uart0: stray vector interrupt 2033 console (38400,n,8,1) uart1: <z8530, channel B> on scc0 scc1: <Zilog Z8530 dual channel SCC> mem 0x1000000-0x1000003 irq 2024 on sbus0 uart2: <z8530, channel A> on scc1 uart2: keyboard (1200,n,8,1) uart2: keyboard not present uart3: <z8530, channel B> on scc1 sbus0: <sc> mem 0x1300000-0x1300007 type unknown (no driver attached) sbus0: <SUNW,pll> mem 0x1304000-0x1304002 type unknown (no driver attached) esp0: <Sun FAS366 Fast-Wide SCSI> mem 0x8800000-0x880000f,0x8810000-0x881003f irq 2016 on sbus0 esp0: FAS366/HME, 40 MHz, SCSI ID 7 hme0: <Sun HME 10/100 Ethernet> mem 0x8c00000-0x8c00107,0x8c02000-0x8c03fff,0x8c04000-0x8c05fff,0x8c06000-0x8c07fff,0x8c07000-0x8c0701f irq 2017 on sbus0 miibus0: <MII bus> on hme0 nsphy0: <DP83840 10/100 media interface> PHY 1 on miibus0 nsphy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto hme0: Ethernet address: xx:xx:xx:xx:xx:xx sbus0: <SUNW,bpp> mem 0xc800000-0xc80001b irq 2018 type unknown (no driver attached) sbus0: <nm212> mem 0-0x1ffff,0x20000-0x9ffff,0xa0000-0xdffff,0xe0000-0xe00ff irq 1996 type network (no driver attached) creator0: <Creator3D> mem 0x1fc00000000-0x1fc000003ff,0x1fc00400000-0x1fc005fffff,0x1fc00600000-0x1fc007fffff,0x1fc01000000-0x1fc013fffff,0x1fc01400000-0x1fc017fffff,0x1fc01800000-0x1fc01bfffff,0x1fc01c00000-0x1fc01ffffff,0x1fc02000000-0x1fc02ffffff,0x1fc03000000-0x1fc03ffffff,0x1fc04000000-0x1fc043fffff,0x1fc04400000-0x1fc047fffff,0x1fc04800000-0x1fc04bfffff,0x1fc04c00000-0x1fc04ffffff,0x1fc05000000-0x1fc05ffffff,0x1fc06000000-0x1fc07ffffff,0x1fc09000000-0x1fc097fffff,0x1fc09800000-0x1fc09ffffff,0x1fc0a000000-0x1fc0affffff,0x1fc0b000000-0x1fc0b7fffff,0x1fc0b800000-0x1fc0bffffff,0x1fc0c000000-0x1fc0c3fffff,0x1fc0c800000-0x1fc0cffffff,0x1fc0d000000-0x1fc0d7fffff,0x1fc0d800000-0x1fc0dffffff irq 1925 on nexus0 creator0: resolution 1152x900 cryptosoft0: <software crypto> on nexus0 syscons0: <System console> on nexus0 syscons0: Unknown <16 virtual consoles, flags=0x100> Timecounter "tick" frequency 296008859 Hz quality 10 Event timer "tick" frequency 296008859 Hz quality 1000 Timecounters tick every 1.000 msec usb_needs_explore_all: no devclass SMP: AP CPU #1 Launched! Trying to mount root from ufs:/dev/da0a [rw]... da0 at esp0 bus 0 scbus0 target 0 lun 0 da0: <IBM DNES30917SUN9.0G SAD0> Fixed Direct Access SCSI-3 device da0: Serial Number 27L734 da0: 20.000MB/s transfers (10.000MHz, offset 5, 16bit) da0: Command Queueing enabled da0: 8637MB (17689267 512 byte sectors) cd0 at esp0 bus 0 scbus0 target 6 lun 0 cd0: <TOSHIBA DVD-ROM SD-M1401 1009> Removable CD-ROM SCSI-2 device cd0: 10.000MB/s transfers (10.000MHz, offset 15) cd0: Attempt to query device size failed: NOT READY, Logical unit is in process of becoming ready hme0: link state changed to DOWN hme0: link state changed to UP (da0:esp0:0:0:0): esp0: timed out [ecb 0xc1681ce0 (flags 0x1, dleft 1000, stat 0)], <state 2, nexus 0xc1680888, phase(l 97, c 100, p 3), resid 1000, msg(q 0,o 40) > (da0:esp0:0:0:0): esp0: timed out [ecb 0xc1683138 (flags 0x1, dleft 1000, stat 0)], <state 1, nexus 0, phase(l 97, c 100, p 7), resid 1000, msg(q 0,o 40) > (da0:esp0:0:0:0): esp0: timed out [ecb 0xc1687770 (flags 0x1, dleft 1000, stat 0)], <state 1, nexus 0, phase(l 97, c 100, p 7), resid 1000, msg(q 0,o 40) > (da0:esp0:0:0:0): esp0: timed out [ecb 0xc1687ab8 (flags 0x1, dleft 1000, stat 0)], <state 1, nexus 0, phase(l 97, c 100, p 7), resid 1000, msg(q 0,o 40) > (da0:esp0:0:0:0): esp0: timed out [ecb 0xc16842f0 (flags 0x1, dleft 1000, stat 0)], <state 4, nexus 0, phase(l 97, c 7, p 7), resid 1000, msg(q 0,o 0) > (da0:esp0:0:0:0): esp0: timed out [ecb 0xc1688928 (flags 0x1, dleft 8000, stat 0)], <state 4, nexus 0, phase(l 97, c 7, p 7), resid 1000, msg(q 0,o 0) > (da0:esp0:0:0:0): esp0: timed out [ecb 0xc1682178 (flags 0x1, dleft 8000, stat 0)], <state 4, nexus 0, phase(l 97, c 7, p 7), resid 1000, msg(q 0,o 0) > (da0:esp0:0:0:0): esp0: timed out [ecb 0xc1680930 (flags 0x1, dleft 1000, stat 0)], <state 4, nexus 0, phase(l 97, c 7, p 7), resid 1000, msg(q 0,o 0) > (da0:esp0:0:0:0): esp0: timed out [ecb 0xc1682370 (flags 0x1, dleft 1000, stat 0)], <state 4, nexus 0, phase(l 97, c 7, p 7), resid 1000, msg(q 0,o 0) > (da0:esp0:0:0:0): esp0: timed out [ecb 0xc1686078 (flags 0x1, dleft 1000, stat 0)], <state 4, nexus 0, phase(l 97, c 7, p 7), resid 1000, msg(q 0,o 0) > (da0:esp0:0:0:0): esp0: timed out [ecb 0xc1687cb0 (flags 0x1, dleft 1000, stat 0)], <state 5, nexus 0xc168a4b8, phase(l 93, c 7, p 7), resid 1000, msg(q 0,o 0) > (da0:esp0:0:0:0): esp0: timed out [ecb 0xc16887d8 (flags 0x101, dleft 1000, stat 0)], <state 5, nexus 0xc168a4b8, phase(l 97, c 3, p 3), resid 1000, msg(q 0,o 0) > (da0:esp0:0:0:0): esp0: timed out [ecb 0xc1685b38 (flags 0x1, dleft 1000, stat 0)], <state 1, nexus 0, phase(l 97, c 100, p 3), resid 1000, msg(q 0,o 0) > (da0:esp0:0:0:0): esp0: timed out [ecb 0xc1680888 (flags 0x1, dleft 1000, stat 0)], <state 4, nexus 0, phase(l 97, c 7, p 3), resid 1000, msg(q 0,o 0) > (da0:esp0:0:0:0): esp0: timed out [ecb 0xc16885e0 (flags 0x1, dleft 1000, stat 0)], <state 4, nexus 0, phase(l 97, c 7, p 7), resid 1000, msg(q 0,o 0) > (da0:esp0:0:0:0): WRITE(10). CDB: 2a 00 00 9f ed a0 00 00 08 00 (da0:esp0:0:0:0): CAM status: Command timeout (da0:esp0:0:0:0): Retrying command
+1, Having the same problem on FreeBSD 12.0-p3 (latest upgrades). Hardware: QNAP with 4x 4TB WD Red drives. ada0: <WDC WD40EFRX-68N32N0 82.00A82> ACS-3 ATA SATA 3.x device Drives get detached only when doing a scrub. 11:39:53 kernel: ahcich3: Timeout on slot 9 port 0 11:39:53 kernel: ahcich3: is 00000000 cs 0001f800 ss 0001ff00 rs 0001ff00 tfd 40 serr 00000000 cmd 0004cb17 11:39:53 kernel: (ada3:ahcich3:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 d0 9a 9e 40 20 00 00 01 00 00 11:39:53 kernel: (ada3:ahcich3:0:0:0): CAM status: Command timeout 11:39:53 kernel: (ada3:ahcich3:0:0:0): Retrying command, 3 more tries remain 11:41:32 kernel: ahcich3: Timeout on slot 8 port 0 11:41:32 kernel: ahcich3: is 00000000 cs 0000c000 ss 0000fff0 rs 0000fff0 tfd 40 serr 00000000 cmd 0004ce17 11:41:32 kernel: (ada3:ahcich3:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 80 e6 f4 40 21 00 00 01 00 00 11:41:32 kernel: (ada3:ahcich3:0:0:0): CAM status: Command timeout 11:41:32 kernel: (ada3:ahcich3:0:0:0): Retrying command, 3 more tries remain 11:46:53 kernel: ahcich2: Timeout on slot 5 port 0 11:46:53 kernel: ahcich2: is 00000000 cs 00070000 ss 0007ffe0 rs 0007ffe0 tfd 40 serr 00000000 cmd 0004d017 11:46:53 kernel: (ada2:ahcich2:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 a8 bd 0f 40 34 00 00 01 00 00 11:46:53 kernel: (ada2:ahcich2:0:0:0): CAM status: Command timeout 11:46:53 kernel: (ada2:ahcich2:0:0:0): Retrying command, 3 more tries remain 11:49:21 kernel: ahcich2: Timeout on slot 31 port 0 11:49:21 kernel: ahcich2: is 00000000 cs 00020000 ss 8003ffff rs 8003ffff tfd 40 serr 00000000 cmd 0004d017 11:49:21 kernel: (ada2:ahcich2:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 28 7a 4a 40 35 00 00 01 00 00 11:49:21 kernel: (ada2:ahcich2:0:0:0): CAM status: Command timeout 11:49:21 kernel: (ada2:ahcich2:0:0:0): Retrying command, 3 more tries remain 11:49:21 kernel: ada2 at ahcich2 bus 0 scbus2 target 0 lun 0 11:49:21 kernel: ada2: < > detached 11:49:21 kernel: GEOM_ELI: g_eli_write_done() failed (error=6) ada2p1.eli[WRITE(offset=10917146624, length=4096)] 11:49:21 kernel: GEOM_ELI: g_eli_read_done() failed (error=6) ada2p1.eli[READ(offset=270336, length=8192)] 11:49:21 kernel: GEOM_ELI: g_eli_read_done() failed (error=6) ada2p1.eli[READ(offset=34358960128, length=8192)] 11:49:21 kernel: GEOM_ELI: g_eli_read_done() failed (error=6) ada2p1.eli[READ(offset=34359222272, length=8192)] 11:49:23 kernel: GEOM_ELI: g_eli_read_done() failed (error=6) ada2p1.eli[READ(offset=270336, length=8192)] 11:49:23 kernel: GEOM_ELI: g_eli_read_done() failed (error=6) ada2p1.eli[READ(offset=34358960128, length=8192)] 11:49:23 kernel: GEOM_ELI: g_eli_read_done() failed (error=6) ada2p1.eli[READ(offset=34359222272, length=8192)] 11:49:23 ZFS[61041]: vdev state changed, pool_guid= vdev_guid= 11:49:23 ZFS[64175]: vdev state changed, pool_guid= vdev_guid= 11:49:51 kernel: ahcich2: Timeout on slot 19 port 0 11:49:51 kernel: ahcich2: is 00000000 cs 00080000 ss 00000000 rs 00080000 tfd 50 serr 00000000 cmd 0004d217 11:49:51 kernel: (aprobe0:ahcich2:0:0:0): SETFEATURES SET TRANSFER MODE. ACB: ef 03 00 00 00 40 00 00 00 00 08 00 11:49:51 kernel: (aprobe0:ahcich2:0:0:0): CAM status: Command timeout 11:49:51 kernel: (aprobe0:ahcich2:0:0:0): Retrying command, 0 more tries remain 11:49:51 kernel: GEOM_ELI: g_eli_read_done() failed (error=6) gpt/storage2.eli[READ(offset=423407169536, length=1003520)] 11:49:51 kernel: GEOM_ELI: g_eli_read_done() failed (error=6) gpt/storage2.eli[READ(offset=423408177152, length=1011712)] 11:49:51 kernel: GEOM_ELI: g_eli_read_done() failed (error=6) gpt/storage2.eli[READ(offset=270336, length=8192)] 11:49:51 kernel: GEOM_ELI: g_eli_read_done() failed (error=6) gpt/storage2.eli[READ(offset=3966424588288, length=8192)] 11:49:51 kernel: GEOM_ELI: g_eli_read_done() failed (error=6) gpt/storage2.eli[READ(offset=3966424850432, length=8192)] 11:49:51 kernel: GEOM_ELI: g_eli_read_done() failed (error=6) gpt/storage2.eli[READ(offset=423409188864, length=1011712)] 11:49:51 kernel: GEOM_ELI: g_eli_read_done() failed (error=6) gpt/storage2.eli[READ(offset=423410204672, length=1011712)] 11:49:51 ZFS[72126]: vdev state changed, pool_guid= vdev_guid= 11:49:51 kernel: GEOM_ELI: Device ada2p1.eli destroyed. 11:49:51 kernel: GEOM_ELI: Detached ada2p1.eli on last close. 11:49:51 ZFS[79977]: vdev is removed, pool_guid= vdev_guid= 11:49:51 kernel: GEOM_ELI: Device gpt/storage2.eli destroyed. 11:49:51 kernel: GEOM_ELI: Detached gpt/storage2.eli on last close. 11:49:51 kernel: (ada2:ahcich2:0:0:0): Periph destroyed
If you are sure it is what you want to do, you can increase the timeout from 30 seconds: sysctl kern.cam.ada.default_timeout=60
Hello world, The same here with Panasonic Toughbook CF-MX4 and M2 SSD 256GB SAMSUNG drive :-( sysctl kern.cam.ada.default_timeout=100 helps a bit.. but still I cannot even install system on that machine.. Is this a M2 SATA Controller Driver problem or the M2 disk wear is already that bad I should replace a disk? Any hints appreciated :-)
Hello, Same problem :( I just upgraded from FreeNAS 11.3 to TrueNAS 12.0 STABLE. - I have a one SSD with the SO. And one RAIDZ with 3 discs of 12TB. - With FreeNAS I have never had problems, the all discs are new (2 months of lifetime). - As soon as I update to TrueNAS, I get the same errors... ... ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00 CAM status: command timeout Retrying command, 0 more tries remaind Timeout on slot 20 port 0 ... Some help? Regards
Hello guys, good night, the exact same problem happened to me. Motherboard Asus b85m-r r2.0 Processor: Core I7 (I think 10 generation) SSD 240GB WD Green (Installed TrueNAS) 5 HDD 1TB WD Blue (RAID I don't remember which one I put but I know it has a RAID) I do not know if this problem can still be reviewed, in advance thank you very much for the time of the thread. Greetings! :D