This is a repost of unresolved bug #240145.

***HARDWARE IS VERIFIED OK BY ZFS SCRUB ON CENTOS 8.4 WITH 0 ERRORS***

Hardware:
HPE dl180 g10
HPE SmartArray p816i
12x Seagate ST16000NM002G

All combinations of BSD/driver/firmware are affected, up to and including:
FreeBSD 13.0 Release/Stable
Microsemi smartpqi driver v4130 updated 8/5/2021
HPE SmartArray Firmware 3.53

The only error displayed/logged is of this form:
[167] [ERROR]::[178:655.0][0,64,0][CPU 15][pqi_map_request][540]:bus_dmamap_load_ccb failed = 36 count = 1044480
[167] [WARN]:[178:655.0][CPU 15][pqisrc_io_start][794]:In Progress on 64

This issue is 100% reproducible. It sometimes occurs within the first 1% of scrub progress, and the scrub never gets further than 8-9% before it occurs.
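For anyone triaging similar logs, here is a minimal sketch of pulling the error code and transfer size out of the error line quoted above. On FreeBSD, errno 36 is EINPROGRESS, which is consistent with the "In Progress" warning that follows it. The function name `parse_dma_error`, the regular expression, and the 4 KiB page size are assumptions made for illustration; none of this comes from the smartpqi driver itself, and it says nothing about the root cause.

```python
import re

# Matches the "[function][line]:bus_dmamap_load_ccb failed = N count = M" tail
# of the smartpqi error line quoted in the report.
LINE_RE = re.compile(
    r"\[(?P<func>\w+)\]\[(?P<line>\d+)\]:"
    r"bus_dmamap_load_ccb failed = (?P<err>\d+) count = (?P<count>\d+)"
)

# FreeBSD errno 36 is EINPROGRESS; only this one value is mapped here.
FREEBSD_ERRNO = {36: "EINPROGRESS"}

PAGE_SIZE = 4096  # assumption: 4 KiB pages


def parse_dma_error(line):
    """Return the parsed fields of a matching log line, or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"))
    count = int(m.group("count"))
    pages, remainder = divmod(count, PAGE_SIZE)
    return {
        "function": m.group("func"),
        "errno": err,
        "errno_name": FREEBSD_ERRNO.get(err, "unknown"),
        "bytes": count,
        "pages": pages,
        "remainder_bytes": remainder,
    }


sample = (
    "[167] [ERROR]::[178:655.0][0,64,0][CPU 15][pqi_map_request][540]:"
    "bus_dmamap_load_ccb failed = 36 count = 1044480"
)
info = parse_dma_error(sample)
print(info)
```

Under the 4 KiB page assumption, the 1044480-byte request in the sample works out to exactly 255 pages (1020 KiB).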
Created attachment 227287: log/messages. See last lines.
This is split off from https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240145. Please include papani.srikanth, emaste and imp in CCs so that Papani can help gather debug information or provide a newer version of the driver.
Is there a newer driver ready to integrate?
Is there a new version of the smartpqi driver ready to integrate into FreeBSD?
Hi, any updates on this?

I'm using three Adaptec 1100-4i HBAs, each connected to a separate SuperMicro BPN-SAS3-216EL1 backplane, for a total of 72 bays. My zpool consists of 67 SSDs in a simple "RAID0" configuration:

    zpool create atime=off mountpoint=none test da0 [..] da66

Each time, I can reliably lock up a random controller by creating enough load using:

    dd if=/dev/zero of=/mnt/test.dat bs=100M

and, after about five minutes, starting a parallel

    zpool scrub test

This produces the following kernel messages:

    [...heartbeat...] controller is offline
    [...take_ctrl-offline...] Controller FW is not running. Lockup code = 1403a

After reboot, the Adaptec HBA shows:

    1719-Slot 10 A controller failure event occurred prior to this power-up Previous lock up code=0001403A
    POST Messages Ended. Press any key to continue.

I even tried using only one Adaptec 1100 HBA with the three backplanes cascaded, but the controller locks up with this configuration as well.

TIA and BR, Nils
All - resolution can be found in thread for bug #240145.
New driver fixes this