Bug 243401 - [patch] ahci driver problems with Marvell 88SE9230 (Dell BOSS-S1)
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.2-RELEASE
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-bugs (Nobody)
URL:
Keywords: patch
Depends on:
Blocks:
 
Reported: 2020-01-16 21:32 UTC by Peter Eriksson
Modified: 2021-01-12 18:06 UTC

See Also:


Attachments
Patch for AHCI driver to make Dell BOSS-S1 detect unconfigured disks (5.01 KB, patch)
2020-12-21 23:41 UTC, Peter Eriksson
dmesg.boot (54.99 KB, text/plain)
2021-01-12 18:02 UTC, Peter Eriksson
Version 2 of patch (with debugging printfs) (17.60 KB, patch)
2021-01-12 18:06 UTC, Peter Eriksson

Description Peter Eriksson 2020-01-16 21:32:33 UTC
This feels more like a firmware problem than a driver problem, but since it apparently works in Windows and Linux and not in FreeBSD, I figured I'd report it here anyway...

(Probably not meaningful to try to report it to Dell, since they don't officially support FreeBSD.)


Dell BOSS-S1 (Marvell 88SE9230 based) M.2 "RAID" cards running Dell's latest firmware (2.5.13.3022 A06 or 2.5.13.3020 A05) do something strange: once the kernel has loaded (from the drives on this card), it fails to detect the disks ("unconfigured" disks, non-RAID setup) and mounting the root fs then fails...

(We have two M.2 SSDs connected to that controller)


With firmware 2.5.13.3016 A04 it gives a couple of errors at kernel boot time, but does detect the disks and the system boots.

With firmware 2.5.13.3011 A03, 2.5.13.2009 A02 or 2.5.13.2008 A01 no errors are printed and the disks are found just fine.

(But there are bugs fixed in the later releases that would probably be nice to have. I have had an M.2 drive go "offline" on me with the 2008/A01 firmware, which is why I tried the later versions.)


A summary of the (Dell) firmware fixes and my test results:

2.5.13.3022 A06
  Fixes: None
  Enhancement: Added support for 15G platforms

2.5.13.3020 A05
  Status:
  - Does not work, gives errors:
    - 'ahcich16: stopping AHCI engine failed'
  - Detects a 'pass23', but no disk devices:
      pass23 at ahcich16 bus 0 scbus19 target 0 lun 0
      pass23: <Marvell Console 1.01> Removable Processor SCSI device
      pass23: Serial Number HKDP221516WL
      pass23: 150.000MB/s transfers (SATA 1.x, UDMA4, ATAPI 12bytes, PIO 8192bytes)
  Fixes:
  - Fixed an issue where system will hang during
    Boot when PERC is in HBA mode with BOSS-S1
  - When CLI is running, default temporary file
    directory & permission in Linux and ESXi Operating
    systems are changed as appropriate
  Enhancement: N/A

2.5.13.3016 A04
  Status:
  - Works and detects all disks, but gives errors about:
    - 'ahcich14: stopping AHCI engine failed'
    - 'ahcich15: stopping AHCI engine failed'
    - 'ahcich16: stopping AHCI engine failed'
  Fixes:
  - Fixed a behavior of BOSS-S1 firmware incorrectly marking M.2 drive offline/failed
  - Fixed a behavior where ESXi Host goes unresponsive
  - Fixed a behavior where BOSS-S1 Management path will not respond to Management commands
  - Fixed a behavior where BOSS-S1 boot partition becomes inaccessible
  - Fixed a behavior where ESXi host results in PSOD due to unexpected I/O timeout
  - Fixed a behavior where rebuild will not be proceed during error handling condition
  Enhancement:
  - Enhanced/ Added MVCLI events for command timeout
  - Added SLES15 Support

2.5.13.3011 A03
  Status:
  - Works
  Fixes:
  - Fixed M.2 disk failure when medium error is present
  Enhancement:
  - Enhanced medium error handling

2.5.13.2009 A02
  Status:
  - Works
  Fixes:
  - Fixed Sideband functionality issue
  Enhancement:
  - Added support for Rollback of Controller Firmware through iDRAC/LC

2.5.13.2008 A01
  Status:
  - Works
  Initial release


Kernel boot output (the relevant parts) from a firmware 3016 A04 boot:

ahci2: <Marvell 88SE9230 AHCI SATA controller> port 0x8028-0x802f,0x8034-0x8037,0x8020-0x8027,0x8030-0x8033,0x8000-0x801f mem 0xb8800000-0xb88007ff at device 0.0 numa-domain 0 on pci9
ahci2: AHCI v1.20 with 3 6Gbps ports, Port Multiplier not supported
ahci2: quirks=0x900<NOBSYRES,ALTSIG>
ahcich14: <AHCI channel> at channel 0 on ahci2
ahcich15: <AHCI channel> at channel 1 on ahci2
ahcich16: <AHCI channel> at channel 2 on ahci2
...
ahcich16: stopping AHCI engine failed
ahcich16: stopping AHCI engine failed
...
ahcich16: stopping AHCI engine failed
ahcich15: stopping AHCI engine failed
ada0 at ahcich14 bus 0 scbus17 target 0 lun 0
ada0: <SSDSCKJB120G7R N201DL43> ACS-3 ATA SATA 3.x device
ada0: Serial Number PHDW817002Z4150A
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada0: Command Queueing enabled
ada0: 114473MB (234441648 512 byte sectors)
ada1 at ahcich15 bus 0 scbus18 target 0 lun 0
ada1: <SSDSCKJB120G7R N201DL43> ACS-3 ATA SATA 3.x device
ada1: Serial Number PHDW817002WC150A
ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada1: Command Queueing enabled
ada1: 114473MB (234441648 512 byte sectors)
pass25 at ahcich16 bus 0 scbus19 target 0 lun 0
pass25: <Marvell Console 1.01> Removable Processor SCSI device
pass25: Serial Number HKDP221516WL
pass25: 150.000MB/s transfers (SATA 1.x, UDMA4, ATAPI 12bytes, PIO 8192bytes)


On 3022 the ada0 and ada1 devices never get detected, and it only complains about not being able to stop ahcich16, nothing about 14 & 15.
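
For readers not used to the `quirks=0x900<NOBSYRES,ALTSIG>` line in the boot output above: the kernel prints the raw quirk bitmask followed by the names of the bits that are set. The following is only a minimal sketch of that decoding. The bit values 0x100 and 0x800 are assumptions inferred from the 0x900 in the log (they are not copied from the driver headers), and `decode_quirks` is a hypothetical helper, not a driver function.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed bit values, inferred from "quirks=0x900<NOBSYRES,ALTSIG>". */
#define Q_NOBSYRES 0x00000100u
#define Q_ALTSIG   0x00000800u

struct quirk_name {
	uint32_t    bit;
	const char *name;
};

static const struct quirk_name quirk_names[] = {
	{ Q_NOBSYRES, "NOBSYRES" },
	{ Q_ALTSIG,   "ALTSIG"   },
};

/*
 * Hypothetical helper: writes the comma-separated names of the known
 * bits set in 'mask' into 'buf' and returns the bits it could not name.
 */
static uint32_t
decode_quirks(uint32_t mask, char *buf, size_t len)
{
	size_t i;

	if (len == 0)
		return (mask);
	buf[0] = '\0';
	for (i = 0; i < sizeof(quirk_names) / sizeof(quirk_names[0]); i++) {
		if ((mask & quirk_names[i].bit) == 0)
			continue;
		if (buf[0] != '\0')
			strncat(buf, ",", len - strlen(buf) - 1);
		strncat(buf, quirk_names[i].name, len - strlen(buf) - 1);
		mask &= ~quirk_names[i].bit;	/* this bit is accounted for */
	}
	return (mask);
}
```

Calling `decode_quirks(0x900, buf, sizeof(buf))` with a 64-byte buffer leaves `NOBSYRES,ALTSIG` in `buf` and returns 0, meaning no unnamed bits were left over.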
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2020-01-17 13:00:28 UTC
Cc'ing the committers most involved with AHCI to ask for a quick review.
Comment 2 F Sidoli 2020-05-28 07:29:04 UTC
Hi Peter,

FWIW, I'm also having this same issue on a PE R740xd2 server with the BOSS-S1. 

In my particular case, I have the BOSS (in JBOD) and an H730P PERC (in HBA mode) installed. I wanted to replace the latter with an HBA330, as I can't encrypt the disks at the disk level (they're SEDs) without the PERC locking them out and not passing them through to the OS on power cycle.

Anyway, with the HBA installed the system partially boots and then fails, but it does just fine with the RAID card put back in. 

If I try to do a fresh install with the HBA in then I can't see the BOSS cards at all, so can't install to them. With the RAID card in there's no problem. 

If I update the BOSS firmware to the latest version, the system doesn't boot at all, regardless of which card I have in.

Quite what this all means I don't know. I'm not sure if it's a driver issue in FreeNAS or if Dell are breaking things. I suspect a little bit of A and a little bit of B.

That being said, I have been told by Dell that one of their engineers has managed to get a test system of theirs set up running the latest BOSS firmware and FreeNAS 11.3. Seems to me they have the BOSS in a RAID 1 and the legacy BIOS option set. 

Once verified I will repost here.
Comment 3 Peter Eriksson 2020-12-17 18:44:06 UTC
Just a quick note that this issue is still present in FreeBSD 12.2 and with the latest Dell BOSS-S1 firmware (A07 / 2.5.13.3024)

With a configured RAID volume it works - but not with "raw" disks - they just don't show up.

This makes it impossible to use the BOSS disks for ZFS-mirrored boot/root pools (since the card only supports a single RAID volume, which can be RAID 0 or RAID 1).

- Peter
Comment 4 Peter Eriksson 2020-12-21 23:41:04 UTC
Created attachment 220793 [details]
Patch for AHCI driver to make Dell BOSS-S1 detect unconfigured disks

Please find enclosed a patch that makes (at least on my systems) FreeBSD 12.2 detect unconfigured disks on a Dell BOSS-S1 card running the latest Dell firmware (v7).

The patch basically increases the time limit for the loop when initializing/probing the card for devices. It seems with firmware v5 and later the card takes a lot longer to detect disks after a reset.

The patch also adds a "debug.ahci_verbose" flag and adds some more verbose prints so one can "follow" what happens at probe time. 

With firmware v4 (and an older version of the patch without modified timeouts) the probing looks like this:

ahcich14: AHCI reset...
ahcich14: SATA status changed 00000133
ahcich14: SATA connect time=0us status=00000133
ahcich14: AHCI reset: device found
ahcich14: AHCI reset: device ready after 0ms
ahcich15: AHCI reset...
ahcich15: SATA status changed 00000133
ahcich15: SATA connect time=0us status=00000133
ahcich15: AHCI reset: device found
ahcich15: AHCI reset: device ready after 0ms
ahcich16: AHCI reset...
ahcich16: SATA status changed 00000113
ahcich16: SATA connect time=0us status=00000113
ahcich16: AHCI reset: device found
ahcich16: AHCI reset: device ready after 0ms

With the latest firmware and this patch in use:

ahci2: <Marvell 88SE9230 AHCI SATA controller> port 0x7028-0x702f,0x7034-0x7037,0x7020-0x7027,0x7030-0x7033,0x7000-0x701f mem 0xab200000-0xab2007ff at device 0.0 numa-domain 0 on pci6
ahci2: AHCI v1.20 with 3 6Gbps ports, Port Multiplier not supported
ahci2: quirks=0x200900<NOBSYRES,ALTSIG,MRVL_SR_DEL>
ahci2: Caps: 64bit NCQ 6Gbps PMD 32cmd 3ports
ahci2: Caps2:

ahcich14: <AHCI channel> at channel 0 on ahci2
ahcich14: Caps: CPD
ahcich15: <AHCI channel> at channel 1 on ahci2
ahcich15: Caps: CPD
ahcich16: <AHCI channel> at channel 2 on ahci2
ahcich16: Caps: CPD

ahcich14: AHCI reset...
ahcich14: SATA status changed 00000000
ahcich14: SATA status changed 00000001
ahcich14: SATA status changed 00000133
ahcich14: SATA connect timeout time=212300us status=00000133
ahcich14: AHCI reset: device not found

ahcich15: AHCI reset...
ahcich15: SATA status changed 00000000
ahcich15: SATA status changed 00000001
ahcich15: SATA status changed 00000133
ahcich15: SATA connect timeout time=212000us status=00000133
ahcich15: AHCI reset: device not found

ahcich16: AHCI reset...
ahcich16: SATA status changed 00000000
ahcich16: SATA status changed 00000113
ahcich16: SATA connect time=100us status=00000113
ahcich16: AHCI reset: device found
ahcich16: AHCI reset: device ready after 0ms
ahcich16: stopping AHCI engine failed

pass2 at ahcich16 bus 0 scbus18 target 0 lun 0
pass2: <Marvell Console 1.01> Removable Processor SCSI device
pass2: Serial Number HKDP221516WL
pass2: 150.000MB/s transfers (SATA 1.x, UDMA4, ATAPI 12bytes, PIO 8192bytes)

ada0 at ahcich14 bus 0 scbus16 target 0 lun 0
ada0: <MTFDDAV480TDS D3DJ004> ACS-4 ATA SATA 3.x device
ada0: Serial Number 202729652D1E
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 457862MB (937703088 512 byte sectors)

ada1 at ahcich15 bus 0 scbus17 target 0 lun 0
ada1: <MTFDDAV480TDS D3DJ004> ACS-4 ATA SATA 3.x device
ada1: Serial Number 202729652D52
ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 457862MB (937703088 512 byte sectors)

pass4 at ahcich16 bus 0 scbus18 target 0 lun 0
pass4: <Marvell Console 1.01> Removable Processor SCSI device
pass4: Serial Number HKDP221516WL
pass4: 150.000MB/s transfers (SATA 1.x, UDMA4, ATAPI 12bytes, PIO 8192bytes)

(It still claims no device found, but they do show up anyway, so the patch probably needs some more fine-tuning, but at least one can access the disks now...)

Note the: "time=212300us"
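
As a reading aid for the status values in these traces: the SATA SStatus register packs three 4-bit fields, DET in bits 3:0 (0 = no device, 1 = device detected but PHY not yet up, 3 = device detected and PHY communication established), SPD in bits 7:4 (negotiated generation) and IPM in bits 11:8 (1 = active). A minimal sketch of splitting a value into those fields follows; the struct and function names are made up for illustration and are not taken from the driver.

```c
#include <stdint.h>

/* Illustrative container for the three SStatus fields. */
struct sstatus_fields {
	unsigned det;	/* bits 3:0, device detection */
	unsigned spd;	/* bits 7:4, negotiated speed (1 = Gen1, 3 = Gen3) */
	unsigned ipm;	/* bits 11:8, interface power management state */
};

/* Hypothetical helper: split a raw SStatus value into its fields. */
static struct sstatus_fields
decode_sstatus(uint32_t v)
{
	struct sstatus_fields f;

	f.det = v & 0xf;
	f.spd = (v >> 4) & 0xf;
	f.ipm = (v >> 8) & 0xf;
	return (f);
}
```

Read this way, 0x00000133 on ahcich14/15 is DET=3, SPD=3, IPM=1 (an active Gen3 link, matching the "SATA 3.x" lines for ada0/ada1), 0x00000113 on ahcich16 is the same at Gen1 (matching "SATA 1.x" for the Marvell Console device), and 0x00000001 means a device has been detected but the link is not up yet.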
Comment 5 Alexander Motin freebsd_committer 2020-12-30 17:40:34 UTC
Peter, your patch is inconsistent in its timeout values.  You are using 10000 on line 2614, but 5000 on lines 2634 and 2638.  That discrepancy could cause random weird effects, though it does not explain what I see in the provided messages: I cannot match them to the patch provided.  I guess you were running something different.

Also, synchronously waiting half a second for every empty port in a system is not great. JFYI, IIRC VMware emulates AHCI with 31 ports.  Have you tried measuring the time between the "SATA status changed 00000000" and "SATA status changed 00000001" messages?  If it happens quickly, that would allow leaving the timeout on line 2634 unchanged, so we would not wait for devices on completely empty ports.
Comment 6 Peter Eriksson 2021-01-01 22:06:00 UTC
Yes, I've since changed my patch a bit so that it only raises the timeout to 5000 (from 1000) if:
  1) the quirk AHCI_Q_SLOWDEV (a new one) is set - it is only set for the Marvell 88SE9230
  2) the first status change (0x00000000 -> 0x00000001) has already occurred

Now the trace looks something like this (some more debugging prints added):

ahcich14: AHCI reset...
ahcich14: AHCI engine: stopping
ahcich14: stopping AHCI engine: ci: -1 -> 0 at time 10us
ahcich14: stopping AHCI engine: sact: -1 -> 0 at time 10us
ahcich14: stopping AHCI engine: ccs: -1 -> 0 at time 10us
ahcich14: stopping AHCI engine: cr: -1 -> 0 at time 10us
ahcich14: AHCI engine stopped at time 10us
ahcich14: SATA changed status 0x00000000 -> 0x00000001 at time=100us
ahcich14: SATA changed status 0x00000001 -> 0x00000133 at time=212500us
ahcich14: SATA connect status 0x00000133 at time=212500us
ahcich14: AHCI reset: device found
ahcich14: AHCI reset: device ready after 0ms
ahcich14: AHCI engine(fbs=1): starting

ahcich15: AHCI reset...
ahcich15: AHCI engine: stopping
ahcich15: stopping AHCI engine: ci: -1 -> 0 at time 10us
ahcich15: stopping AHCI engine: sact: -1 -> 0 at time 10us
ahcich15: stopping AHCI engine: ccs: -1 -> 0 at time 10us
ahcich15: stopping AHCI engine: cr: -1 -> 0 at time 10us
ahcich15: AHCI engine stopped at time 10us
ahcich15: SATA changed status 0x00000000 -> 0x00000001 at time=100us
ahcich15: SATA changed status 0x00000001 -> 0x00000133 at time=221400us
ahcich15: SATA connect status 0x00000133 at time=221400us
ahcich15: AHCI reset: device found
ahcich15: AHCI reset: device ready after 0ms
ahcich15: AHCI engine(fbs=1): starting

ahcich16: AHCI reset...
ahcich16: AHCI engine: stopping
ahcich16: stopping AHCI engine: ci: -1 -> 0 at time 10us
ahcich16: stopping AHCI engine: sact: -1 -> 0 at time 10us
ahcich16: stopping AHCI engine: ccs: -1 -> 0 at time 10us
ahcich16: stopping AHCI engine: cr: -1 -> 0 at time 10us
ahcich16: AHCI engine stopped at time 10us
ahcich16: SATA changed status 0x00000000 -> 0x00000113 at time=100us
ahcich16: SATA connect status 0x00000113 at time=100us
ahcich16: AHCI reset: device found
ahcich16: AHCI reset: device ready after 0ms
ahcich16: AHCI engine(fbs=1): starting

Btw,
I've been testing some different variants of settings for this controller. For example, I removed the quirk (ALTSIG) to see if that would make any difference, but it doesn't seem to matter whether it's set or not. Does anyone know where that quirk comes from?

I'll upload a cleaned up version of an improved patch soon.
Comment 7 Peter Eriksson 2021-01-01 22:14:07 UTC
Btw, I noticed one other little thing while reading the source code for ahci.c:

ahci_start(ch, fbs) is called with fbs=1 everywhere in the code except one spot in ahci_execute_transaction(), around line 1650 or so, where it handles an ATA_A_RESET and sets fbs to 0 (zero). If I'm reading the code correctly, this would leave "FIS-based switching" disabled from then on, if that path is ever taken?

Just for testing I changed that to 1 - but I don't see much of a difference in behaviour. Granted I haven't tested _that_ much. Probably unrelated to this issue anyway.
Comment 8 Alexander Motin freebsd_committer 2021-01-04 17:22:33 UTC
I still see the patch from December 21.  Where is the updated version?

It is good that DEV_PRESENT is reported fast enough.  It means the timeout need not increase when no device is present, so I am thinking about just increasing the timeout slightly instead of adding the quirk.

ALTSIG quirk was required by some early Marvell controllers or firmware versions. I haven't retested it on recent ones, it may no longer be required.

I don't remember why I disabled FBS there; it's been a while. My guess is that it avoids other commands trying to execute during the soft reset.  If you look lower, you'll see the "Kick controller into sane state and enable FBS." comment, where ahci_start(ch, 1) should be called on line 2121 after the soft reset completes.

PS: Why do you need AHCI-specific verbosity tunable?  Why not just enable it globally?
Comment 9 Peter Eriksson 2021-01-12 18:02:20 UTC
Sorry, I have been busy with another problem (an HP server with a lot of disks panicking due to bugs in other parts of the kernel - sigh). I'll get back with a better patch soon.

Anyway, the way I changed it was basically to change the loop:

1. DELAY(10) instead of DELAY(100) - it takes about 30us for status to change from 0 -> 1, so no need to wait the full 100us :-)

2. Only change the "full timeout" from 100ms to 500ms after the ATA_SS_DET_MASK bits have changed from ATA_SS_DET_NO_DEVICE.
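
The two changes above can be sketched as a small user-space simulation. This is not the patch itself: the `fake_port` model, the function names and the constants are hypothetical, and only the 10us step and the 100ms/500ms limits are taken from the list above. The point it illustrates is that an empty port still gives up after the short timeout, while a port that has left the "no device" state gets the longer one.

```c
#include <stdbool.h>
#include <stdint.h>

#define DET_MASK        0x0000000fu
#define DET_NO_DEVICE   0x0u
#define DET_PHY_ONLINE  0x3u

#define POLL_STEP_US    10u      /* stands in for DELAY(10) */
#define BASE_TIMEOUT_US 100000u  /* 100ms while DET stays at "no device" */
#define LONG_TIMEOUT_US 500000u  /* 500ms once DET has left "no device" */

/* Hypothetical model of a port: times (us after reset) of status changes. */
struct fake_port {
	uint32_t present_at_us;  /* status 0x000 -> 0x001; UINT32_MAX = never */
	uint32_t online_at_us;   /* status 0x001 -> 0x133; UINT32_MAX = never */
};

/* Simulated SStatus read for the model above. */
static uint32_t
fake_read_sstatus(const struct fake_port *p, uint32_t now_us)
{
	if (now_us >= p->online_at_us)
		return (0x00000133u);
	if (now_us >= p->present_at_us)
		return (0x00000001u);
	return (0x00000000u);
}

/* Returns true if the PHY came online; *elapsed_us is the time spent polling. */
static bool
sata_connect_sim(const struct fake_port *p, uint32_t *elapsed_us)
{
	uint32_t now = 0, limit = BASE_TIMEOUT_US;

	for (;;) {
		uint32_t det = fake_read_sstatus(p, now) & DET_MASK;

		if (det == DET_PHY_ONLINE) {
			*elapsed_us = now;
			return (true);
		}
		if (det != DET_NO_DEVICE)
			limit = LONG_TIMEOUT_US;  /* something is there: wait longer */
		if (now >= limit) {
			*elapsed_us = now;
			return (false);
		}
		now += POLL_STEP_US;
	}
}
```

With a model port of `{ 30, 211790 }` (the timings seen on ahcich14 below) the loop reports success after 211790us, whereas an empty port (`{ UINT32_MAX, UINT32_MAX }`) is abandoned after 100000us, so empty ports do not pay for the longer wait.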


My current version of the patch contains a lot of debugging printouts, so it's probably not really good for production use (it makes it easier to watch what's happening though :-)

A sample of the dmesg output:


First an unused ahci channel/controller without devices:

ahcich13: AHCI engine: stopping
ahcich13: stopping AHCI engine: cr: 1 -> 0 at time 10 us
ahcich13: AHCI engine stopped at time 10 us
ahcich13: ahci_sata_phy_reset: Start
ahcich13: ahci_sata_connect: Start
ahcich13: SATA connect timeout status 0x00000000 at time=10000us
ahcich13: ahci_sata_connect: Done (0)
ahcich13: ahci_sata_phy_reset: Done (0)
ahcich13: AHCI reset: device not found
ahcich13: ahci_reset: Done (ahci_sata_phy_reset failed)


First port on the BOSS:

ahcich14: ahciaction: Calling ahci_reset (XPT_RESET_BUS)
ahcich14: ahci_reset: Start
ahcich14: AHCI reset...
ahcich14: AHCI engine: stopping
ahcich14: stopping AHCI engine: cr: 1 -> 0 at time 10 us
ahcich14: AHCI engine stopped at time 10 us
ahcich14: ahci_sata_phy_reset: Start
ahcich14: ahci_sata_connect: Start
ahcich14: SATA changed status 0x00000000 -> 0x00000001 at time=30us
ahcich14: SATA changed status 0x00000001 -> 0x00000133 at time=211790us
ahcich14: SATA connect status 0x00000133 at time=211790us
ahcich14: ahci_sata_connect: Done (1)
ahcich14: ahci_sata_phy_reset: Done (1)
ahcich14: AHCI reset: device found
ahcich14: AHCI reset: device ready after 0ms
ahcich14: AHCI engine(fbs=1): starting
ahcich14: ahci_start: Done
ahcich14: ahci_reset: Done


Then it resets the second port/channel (and starts doing stuff on the first port at the same time):

ahcich15: ahciaction: Calling ahci_reset (XPT_RESET_BUS)
ahcich15: ahci_reset: Start
ahcich15: AHCI reset...
ahcich15: AHCI engine: stopping
ahcich15: stopping AHCI engine: cr: 1 -> 0 at time 10 us
ahcich15: AHCI engine stopped at time 10 us
ahcich15: ahci_sata_phy_reset: Start
ahcich14: ahci_execute_transaction: Kicking controller into sane state
ahcich14: AHCI engine: stopping
ahcich14: stopping AHCI engine: cr: 1 -> 0 at time 10 us
ahcich14: AHCI engine stopped at time 10 us
ahcich14: ahci_clo: Start
ahcich14: ahci_clo: Done
ahcich14: AHCI engine(fbs=0): starting
ahcich14: ahci_start: Done
ahcich15: ahci_sata_connect: Start
ahcich15: SATA changed status 0x00000000 -> 0x00000001 at time=30us
ahcich14: ahci_end_transaction: Reinit port (eslots=00000004)
ahcich14: AHCI engine: stopping
ahcich15: SATA changed status 0x00000001 -> 0x00000133 at time=220650us
ahcich15: SATA connect status 0x00000133 at time=220650us
ahcich15: ahci_sata_connect: Done (1)
ahcich15: ahci_sata_phy_reset: Done (1)
ahcich15: AHCI reset: device found
ahcich15: AHCI reset: device ready after 0ms
ahcich15: AHCI engine(fbs=1): starting
ahcich15: ahci_start: Done
ahcich15: ahci_reset: Done


And then the third (ses?) port - there are only two ports on this controller:

ahcich16: ahciaction: Calling ahci_reset (XPT_RESET_BUS)
ahcich16: ahci_reset: Start
ahcich16: AHCI reset...
ahcich16: AHCI engine: stopping
ahcich16: stopping AHCI engine: cr: 1 -> 0 at time 10 us
ahcich16: AHCI engine stopped at time 10 us
ahcich16: ahci_sata_phy_reset: Start
ahcich16: ahci_sata_connect: Start
ahcich16: SATA changed status 0x00000000 -> 0x00000113 at time=70us
ahcich16: SATA connect status 0x00000113 at time=70us
ahcich16: ahci_sata_connect: Done (1)
ahcich16: ahci_sata_phy_reset: Done (1)
ahcich16: AHCI reset: device found
ahcich16: AHCI reset: device ready after 0ms
ahcich16: AHCI engine(fbs=1): starting
ahcich16: ahci_start: Done
ahcich16: ahci_reset: Done


But then things are a bit strange - notice the 1s timeouts (I increased the max timeout to 1s in this test boot):

Root mount waiting for: CAM usbus0
uhub0: 26 ports with 26 removable, self powered
ahcich14: stopping AHCI engine: timeout at 1000000 us (cr=1, ccs=0, ci=0, sact=0)
ahcich14: ahci_clo: Start
ahcich14: ahci_clo: Done
ahcich14: AHCI engine(fbs=1): starting
ahcich14: ahci_start: Done
ahcich15: ahci_execute_transaction: Kicking controller into sane state
ahcich15: AHCI engine: stopping
ugen0.2: <Kingston DataTraveler 2.0> at usbus0
umass0 numa-domain 0 on uhub0
umass0: <Kingston DataTraveler 2.0, class 0/0, rev 2.00/1.00, addr 1> on usbus0
umass0:  SCSI over Bulk-Only; quirks = 0xc000
umass0:20:0: Attached to scbus20
Root mount waiting for: CAM usbus0
ahcich15: stopping AHCI engine: timeout at 1000000 us (cr=1, ccs=0, ci=0, sact=0)
ahcich15: ahci_clo: Start
ahcich15: ahci_clo: Done
ahcich15: AHCI engine(fbs=0): starting
ahcich15: ahci_start: Done
ahcich15: ahci_end_transaction: Reinit port (eslots=00000004)
ahcich15: AHCI engine: stopping
ugen0.3: <vendor 0x1604 product 0x10c0> at usbus0
uhub1 numa-domain 0 on uhub0
uhub1: <vendor 0x1604 product 0x10c0, class 9/0, rev 2.00/0.00, addr 2> on usbus0
Root mount waiting for: CAM usbus0
ahcich15: stopping AHCI engine: timeout at 1000000 us (cr=1, ccs=0, ci=0, sact=0)
ahcich15: ahci_clo: Start
ahcich15: ahci_clo: Done
ahcich15: AHCI engine(fbs=1): starting
ahcich15: ahci_start: Done
ahcich15: ahci_execute_transaction: Kicking controller into sane state
ahcich15: AHCI engine: stopping
Root mount waiting for: CAM usbus0
ahcich15: stopping AHCI engine: timeout at 1000000 us (cr=1, ccs=0, ci=0, sact=0)
ahcich15: ahci_clo: Start
ahcich15: ahci_clo: Done
ahcich15: AHCI engine(fbs=0): starting
ahcich15: ahci_start: Done
uhub1: 4 ports with 4 removable, self powered
Root mount waiting for: CAM usbus0
ugen0.4: <vendor 0x1604 product 0x10c0> at usbus0
uhub2 numa-domain 0 on uhub1
uhub2: <vendor 0x1604 product 0x10c0, class 9/0, rev 2.00/0.00, addr 3> on usbus0
Root mount waiting for: CAM usbus0
ahcich15: ahci_end_transaction: Reinit port (eslots=00000010)
ahcich15: AHCI engine: stopping
Root mount waiting for: CAM usbus0
ahcich15: stopping AHCI engine: timeout at 1000000 us (cr=1, ccs=0, ci=0, sact=0)
ahcich15: ahci_clo: Start
ahcich15: ahci_clo: Done
ahcich15: AHCI engine(fbs=1): starting
ahcich15: ahci_start: Done
ahcich16: ahci_execute_transaction: Kicking controller into sane state
ahcich16: AHCI engine: stopping
ahcich16: stopping AHCI engine: cr: 1 -> 0 at time 10 us
ahcich16: AHCI engine stopped at time 10 us
ahcich16: ahci_clo: Start
ahcich16: ahci_clo: Done
ahcich16: AHCI engine(fbs=0): starting
ahcich16: ahci_start: Done
ahcich16: ahci_end_transaction: Reinit port (eslots=00000004)
ahcich16: AHCI engine: stopping
uhub2: 4 ports with 4 removable, self powered
Root mount waiting for: CAM usbus0
ugen0.5: <vendor 0x1604 product 0x10c0> at usbus0
uhub3 numa-domain 0 on uhub1
uhub3: <vendor 0x1604 product 0x10c0, class 9/0, rev 2.00/0.00, addr 4> on usbus0
ahcich16: stopping AHCI engine: timeout at 1000000 us (cr=1, ccs=0, ci=0, sact=0)
ahcich16: ahci_clo: Start
ahcich16: ahci_clo: Done
ahcich16: AHCI engine(fbs=1): starting
ahcich16: ahci_start: Done
ahcich16: ahci_execute_transaction: Kicking controller into sane state
ahcich16: AHCI engine: stopping
Root mount waiting for: CAM usbus0
ahcich16: stopping AHCI engine: timeout at 1000000 us (cr=1, ccs=0, ci=0, sact=0)
ahcich16: ahci_clo: Start
ahcich16: ahci_clo: Done
ahcich16: AHCI engine(fbs=0): starting
ahcich16: ahci_start: Done
Root mount waiting for: CAM usbus0
uhub3: 4 ports with 4 removable, self powered
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
ahcich16: ahci_end_transaction: Reinit port (eslots=00000010)
ahcich16: AHCI engine: stopping
Root mount waiting for: CAM
ahcich16: stopping AHCI engine: timeout at 1000000 us (cr=1, ccs=0, ci=0, sact=0)
ahcich16: ahci_clo: Start
ahcich16: ahci_clo: Done
ahcich16: AHCI engine(fbs=1): starting
ahcich16: ahci_start: Done


However, eventually things seem to work anyway. I've attached the full dmesg.boot file
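
For context on the "stopping AHCI engine: timeout at 1000000 us (cr=1, ...)" lines above: stopping a port's command engine means clearing the ST bit in the port command register (PxCMD) and then polling until the controller clears the CR ("command list running") bit. A rough simulation of that wait follows. The ST (bit 0) and CR (bit 15) positions follow the AHCI specification; the `fake_engine` model, the function name and the 10us step are assumptions for illustration and are not the driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

#define P_CMD_ST 0x00000001u  /* PxCMD.ST: start */
#define P_CMD_CR 0x00008000u  /* PxCMD.CR: command list running */

/* Hypothetical model of one port's command engine. */
struct fake_engine {
	uint32_t cmd;           /* simulated PxCMD register */
	uint32_t cr_clears_us;  /* when CR drops after ST is cleared; UINT32_MAX = never */
};

/* Returns true if CR cleared before timeout_us; *elapsed_us is the wait time. */
static bool
engine_stop_sim(struct fake_engine *e, uint32_t timeout_us, uint32_t *elapsed_us)
{
	uint32_t now = 0;

	e->cmd &= ~P_CMD_ST;  /* request the stop */
	for (;;) {
		if (now >= e->cr_clears_us)
			e->cmd &= ~P_CMD_CR;  /* simulated hardware reaction */
		if ((e->cmd & P_CMD_CR) == 0) {
			*elapsed_us = now;
			return (true);
		}
		if (now >= timeout_us) {
			*elapsed_us = now;
			return (false);  /* the "timeout ... (cr=1, ...)" case */
		}
		now += 10;  /* stands in for DELAY(10) */
	}
}
```

An engine whose CR bit drops after 10us corresponds to the "cr: 1 -> 0 at time 10 us" lines, while one whose CR bit never drops runs into the full timeout with CR still set, which is what the 1000000 us messages show.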
Comment 10 Peter Eriksson 2021-01-12 18:02:56 UTC
Created attachment 221500 [details]
dmesg.boot
Comment 11 Peter Eriksson 2021-01-12 18:06:55 UTC
Created attachment 221502 [details]
Version 2 of patch (with debugging printfs)

A new version of the patch (still with a lot of debugging printf's). I'll see if I can find time to create a "cleaner" version with just the DELAY(10) and "long timeout, but only after state 0 -> 1" fixes.