Bug 221704 - CAM status: Command timeout on 11.1
Summary: CAM status: Command timeout on 11.1
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 11.1-RELEASE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-bugs (Nobody)
Keywords: regression
Reported: 2017-08-21 22:57 UTC by Andrei
Modified: 2018-01-11 00:00 UTC

Description Andrei 2017-08-21 22:57:59 UTC
Hello,
After updating to 11.1, my system can't boot and reports errors like
(ada0:ata2:0:0:0): WRITE_DMA48. ACB: 35 00 50 29 10 40 6c  00 0c 00
(ada0:ata2:0:0:0): CAM status: Command timeout
(ada0:ata2:0:0:0): Retrying command
for all 6 SATA HDDs (a mirror of 2 raidz).
Hardware: Supermicro X8DTN+-F / 6x WD1502FYPS-02W3B0 / 2x E5649
The HDDs are connected to the SATA ports on the baseboard.
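
To check whether the timeouts really hit every disk, the console output can be tallied per device. The following is only a rough sketch: the helper name `count_cam_status` is hypothetical, and the line format is assumed from the three lines quoted above rather than from any documented log format.

```python
import re
from collections import Counter

# Matches lines such as "(ada0:ata2:0:0:0): CAM status: Command timeout".
# The layout is assumed from the excerpt in this report.
CAM_STATUS = re.compile(
    r"^\((?P<periph>[a-z]+\d+):(?P<sim>[a-z]+\d+):\d+:\d+:\d+\): "
    r"CAM status: (?P<status>.+?)\s*$"
)


def count_cam_status(lines):
    """Return a Counter keyed by (disk, controller channel, status text)."""
    counts = Counter()
    for line in lines:
        match = CAM_STATUS.match(line.strip())
        if match:
            key = (match["periph"], match["sim"], match["status"])
            counts[key] += 1
    return counts


sample = [
    "(ada0:ata2:0:0:0): WRITE_DMA48. ACB: 35 00 50 29 10 40 6c  00 0c 00",
    "(ada0:ata2:0:0:0): CAM status: Command timeout",
    "(ada0:ata2:0:0:0): Retrying command",
]
result = count_cam_status(sample)
for (disk, channel, status), n in sorted(result.items()):
    print(f"{disk} on {channel}: {status} x{n}")
```

Feeding it the full console log instead of `sample` would show whether the timeouts are spread evenly across ada0 to ada5 or concentrated on a few disks.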
If I add 
hint.ata.2.mode=PIO4
hint.ata.3.mode=PIO4
hint.ata.4.mode=PIO4
hint.ata.5.mode=PIO4
to device.hints, I'm able to boot, but I/O performance becomes really disappointing.
If I roll the system back to 11.0, everything works fine again.
What was done so far:
1) Moved the disks, with the same SATA cables, to another PC - boots fine
2) Tried a separate RAID controller on this baseboard - doesn't boot properly
3) Used a separate power supply for the disks - the same errors
4) Used another power supply for the whole system - no effect, same issue
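
The four hint lines follow one pattern, one per ATA channel (2 through 5 here). As a small illustration only, they could be generated like this; `pio4_hints` is a hypothetical helper, and the channel range is taken from the hints listed above.

```python
def pio4_hints(channels):
    """Build one 'hint.ata.N.mode=PIO4' line per ATA channel number."""
    return [f"hint.ata.{n}.mode=PIO4" for n in channels]


# Channels 2..5, matching the hints used in this report.
hints = pio4_hints(range(2, 6))
print("\n".join(hints))
```

Forcing PIO4 avoids DMA transfers altogether, which is consistent with the system booting but with much lower I/O throughput.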

Also, if I add something like "find / -name something" to /etc/rc.d/zfs after "zfs mount -a", it boots too (without modifying device.hints), but after some time igb0 (which carries torrent traffic) hangs and only a reboot helps. Maybe this is related, or maybe it is a separate bug.
I don't remember the exact dmesg message for the igb0 hang, but I will post an update when I hit it again.
Comment 1 Andrei 2017-08-22 08:22:26 UTC
I also noticed the following:
FreeBSD 11.1:
ada0 at ata2 bus 0 scbus0 target 0 lun 0
ada1 at ata2 bus 0 scbus0 target 1 lun 0
ada2 at ata3 bus 0 scbus1 target 0 lun 0
ada3 at ata3 bus 0 scbus1 target 1 lun 0
ada4 at ata4 bus 0 scbus2 target 0 lun 0
ada5 at ata5 bus 0 scbus3 target 0 lun 0

FreeBSD 11.0:
Comment 2 Andrei 2017-08-22 08:23:24 UTC
Please ignore the last comment 2017-08-21 22:57:59 UTC
It was posted accidentally.
Comment 3 Andrei 2017-09-20 23:04:05 UTC
I just noticed that I posted slightly wrong info.
The pool is a stripe of 2 raidz, not a mirror.
Comment 4 Andrei 2017-10-03 21:01:48 UTC
Small update:
I tried reverting all changes to the ata driver between 11.0 and 11.1. Unfortunately, this didn't help.
The issue is probably somewhere in CAM.
I have a couple of ideas for further debugging; I will try them and update this bug.
Comment 5 Andrei 2018-01-11 00:00:14 UTC
It seems the issue was with the HDDs: after replacing 2 HDDs and removing from the pool another 2 that were about to die, I no longer see these errors.
Still, it is strange that it worked fine on 11.0.