Bug 243401 - ahci driver problems with Marvell 88SE9230 (Dell BOSS-S1)
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 11.3-RELEASE
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-bugs mailing list
Reported: 2020-01-16 21:32 UTC by Peter Eriksson
Modified: 2020-01-20 15:57 UTC

Description Peter Eriksson 2020-01-16 21:32:33 UTC
This feels more like a firmware problem than a driver problem, but since it apparently works in Windows and Linux and not in FreeBSD, I figured I'd report it here anyway...

(Probably not meaningful to try to report it to Dell since FreeBSD isn't officially supported by them)


Dell BOSS-S1 (Marvell 88SE9230 based) M.2 "RAID" cards running Dell's latest firmware (2.5.13.3022 A06 or 2.5.13.3020 A05) do something strange: once the kernel has loaded (from the drives on this card), it fails to detect the disks ("unconfigured" disks, non-RAID setup) and then mounting the root fs fails...

(We have two M.2 SSDs connected to that controller)


With firmware 2.5.13.3016 A04 it gives a couple of errors at kernel boot time, but does detect the disks and the system boots.

With firmware 2.5.13.3011 A03, 2.5.13.2009 A02 or 2.5.13.2008 A01 no errors are printed and the disks are found just fine.

(But there are bugs fixed in the later releases that would probably be nice to have. I have had an M.2 drive go "offline" on the 2008/A01 firmware, which is why I tried the later versions.)


A summary of the (Dell) firmware fixes and my test results:

2.5.13.3022 A06
  Fixes: None
  Enhancement: Added support for 15G platforms

2.5.13.3020 A05
  Status:
  - Does not work, gives errors:
    - 'ahcich16: stopping AHCI engine failed'
  - Detects a 'pass23', but no disk devices:
      pass23 at ahcich16 bus 0 scbus19 target 0 lun 0
      pass23: <Marvell Console 1.01> Removable Processor SCSI device
      pass23: Serial Number HKDP221516WL
      pass23: 150.000MB/s transfers (SATA 1.x, UDMA4, ATAPI 12bytes, PIO 8192bytes)
  Fixes:
  - Fixed an issue where system will hang during
    Boot when PERC is in HBA mode with BOSS-S1
  - When CLI is running, default temporary file
    directory & permission in Linux and ESXi Operating
    systems are changed as appropriate
  Enhancement: N/A

2.5.13.3016 A04
  Status:
  - Works and detects all disks, but gives errors about:
    - 'ahcich14: stopping AHCI engine failed'
    - 'ahcich15: stopping AHCI engine failed'
    - 'ahcich16: stopping AHCI engine failed'
  Fixes:
  - Fixed a behavior of BOSS-S1 firmware incorrectly marking M.2 drive offline/failed
  - Fixed a behavior where ESXi Host goes unresponsive
  - Fixed a behavior where BOSS-S1 Management path will not respond to Management commands
  - Fixed a behavior where BOSS-S1 boot partition becomes inaccessible
  - Fixed a behavior where ESXi host results in PSOD due to unexpected I/O timeout
  - Fixed a behavior where rebuild will not proceed during error handling condition
  Enhancement:
  - Enhanced/ Added MVCLI events for command timeout
  - Added SLES15 Support

2.5.13.3011 A03
  Status:
  - Works
  Fixes:
  - Fixed M.2 disk failure when medium error is present
  Enhancement:
  - Enhanced medium error handling

2.5.13.2009 A02
  Status:
  - Works
  Fixes:
  - Fixed Sideband functionality issue
  Enhancement:
  - Added support for Rollback of Controller Firmware through iDRAC/LC

2.5.13.2008 A01
  Status:
  - Works
  Initial release


Kernel boot output (the relevant parts) from a firmware 3016 A04 boot:

ahci2: <Marvell 88SE9230 AHCI SATA controller> port 0x8028-0x802f,0x8034-0x8037,0x8020-0x8027,0x8030-0x8033,0x8000-0x801f mem 0xb8800000-0xb88007ff at device 0.0 numa-domain 0 on pci9
ahci2: AHCI v1.20 with 3 6Gbps ports, Port Multiplier not supported
ahci2: quirks=0x900<NOBSYRES,ALTSIG>
ahcich14: <AHCI channel> at channel 0 on ahci2
ahcich15: <AHCI channel> at channel 1 on ahci2
ahcich16: <AHCI channel> at channel 2 on ahci2
...
ahcich16: stopping AHCI engine failed
ahcich16: stopping AHCI engine failed
...
ahcich16: stopping AHCI engine failed
ahcich15: stopping AHCI engine failed
ada0 at ahcich14 bus 0 scbus17 target 0 lun 0
ada0: <SSDSCKJB120G7R N201DL43> ACS-3 ATA SATA 3.x device
ada0: Serial Number PHDW817002Z4150A
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada0: Command Queueing enabled
ada0: 114473MB (234441648 512 byte sectors)
ada1 at ahcich15 bus 0 scbus18 target 0 lun 0
ada1: <SSDSCKJB120G7R N201DL43> ACS-3 ATA SATA 3.x device
ada1: Serial Number PHDW817002WC150A
ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada1: Command Queueing enabled
ada1: 114473MB (234441648 512 byte sectors)
pass25 at ahcich16 bus 0 scbus19 target 0 lun 0
pass25: <Marvell Console 1.01> Removable Processor SCSI device
pass25: Serial Number HKDP221516WL
pass25: 150.000MB/s transfers (SATA 1.x, UDMA4, ATAPI 12bytes, PIO 8192bytes)


On 3022 the ada0 and ada1 devices never get detected, and it only complains about not being able to stop ahcich16, nothing about 14 & 15.
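
For comparing boots across firmware versions, the per-channel picture (how many "stopping AHCI engine failed" messages, and which devices attached) can be pulled out of the boot log with a small script. This is only a rough sketch: the sample text is copied from the lines shown in the 3016 A04 output above (the parts elided with "..." are not included, so the counts only cover the visible lines), and the names `summarize`, `STOP_FAILED` and `ATTACHED` are made up for this example.

```python
import re
from collections import defaultdict

# Lines copied from the 3016 A04 boot output shown above.
SAMPLE = """\
ahcich16: stopping AHCI engine failed
ahcich16: stopping AHCI engine failed
ahcich16: stopping AHCI engine failed
ahcich15: stopping AHCI engine failed
ada0 at ahcich14 bus 0 scbus17 target 0 lun 0
ada1 at ahcich15 bus 0 scbus18 target 0 lun 0
pass25 at ahcich16 bus 0 scbus19 target 0 lun 0
"""

# "ahcichN: stopping AHCI engine failed"
STOP_FAILED = re.compile(r"^(ahcich\d+): stopping AHCI engine failed")
# "<device> at ahcichN bus ..."
ATTACHED = re.compile(r"^(\w+) at (ahcich\d+) bus \d+")


def summarize(dmesg_text):
    """Return {channel: {"stop_failures": int, "devices": [names]}}."""
    channels = defaultdict(lambda: {"stop_failures": 0, "devices": []})
    for line in dmesg_text.splitlines():
        m = STOP_FAILED.match(line)
        if m:
            channels[m.group(1)]["stop_failures"] += 1
            continue
        m = ATTACHED.match(line)
        if m:
            channels[m.group(2)]["devices"].append(m.group(1))
    return dict(channels)


if __name__ == "__main__":
    for chan, info in sorted(summarize(SAMPLE).items()):
        devices = ",".join(info["devices"]) or "-"
        print(chan, "stop failures:", info["stop_failures"], "devices:", devices)
```

On a running system the same function could be fed the contents of /var/run/dmesg.boot instead of the sample text.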
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2020-01-17 13:00:28 UTC
Cc: the committers most involved with AHCI, to ask for a quick review.