Bug 270089

Summary: mpr: panic in mpr_complete_command during zpool import
Product: Base System
Reporter: Dan Kotowski <dan.kotowski>
Component: kern
Assignee: freebsd-fs (Nobody) <fs>
Status: Open ---
Severity: Affects Only Me
CC: asomers, grahamperrin, imp, marklmi26-fbsd
Priority: ---
Keywords: crash
Version: CURRENT
Hardware: arm64
OS: Any

Description Dan Kotowski 2023-03-10 16:07:20 UTC
# zpool import tank
panic: command not inqueue, state = 0

cpuid = 12
time = 946693714
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x13c
panic() at panic+0x44
mpr_complete_command() at mpr_complete_command+0x12c
mpr_intr_locked() at mpr_intr_locked+0x7c
mpr_intr_msi() at mpr_intr_msi+0x58
ithread_loop() at ithread_loop+0x2a0
fork_exit() at fork_exit+0x74
fork_trampoline() at fork_trampoline+0x14
KDB: enter: panic
[ thread pid 12 tid 100154 ]
Stopped at      kdb_enter+0x44: undefined       f900027f
Comment 1 Dan Kotowski 2023-03-10 16:07:30 UTC
# mprutil show all 
Adapter:
mpr0 Adapter:
       Board Name: SAS9311-8i
   Board Assembly: H3-25461-02H
        Chip Name: LSISAS3008
    Chip Revision: ALL
    BIOS Revision: 8.37.00.00
Firmware Revision: 16.00.10.00
  Integrated RAID: no
         SATA NCQ: ENABLED
 PCIe Width/Speed: x8 (8.0 GB/sec)
        IOC Speed: Full
      Temperature: 81 C

PhyNum  CtlrHandle  DevHandle  Disabled  Speed   Min    Max    Device
0       0001        0009       N         6.0     3.0    12     SAS Initiator 
1       0002        000a       N         6.0     3.0    12     SAS Initiator 
2       0003        000b       N         6.0     3.0    12     SAS Initiator 
3       0004        000c       N         6.0     3.0    12     SAS Initiator 
4                              N                 3.0    12     SAS Initiator 
5                              N                 3.0    12     SAS Initiator 
6       0005        000d       N         6.0     3.0    12     SAS Initiator 
7                              N                 3.0    12     SAS Initiator 

Devices:
B____T    SAS Address      Handle  Parent    Device        Speed Enc  Slot  Wdt
00   03   4433221100000000 0009    0001      SATA Target   6.0   0001 03    1
00   02   4433221101000000 000a    0002      SATA Target   6.0   0001 02    1
00   00   4433221102000000 000b    0003      SATA Target   6.0   0001 00    1
00   01   4433221103000000 000c    0004      SATA Target   6.0   0001 01    1
00   04   4433221106000000 000d    0005      SATA Target   6.0   0001 04    1

Enclosures:
Slots      Logical ID     SEPHandle  EncHandle    Type
  08    500605b00993f2f0               0001     Direct Attached SGPIO

Expanders:
NumPhys   SAS Address     DevHandle   Parent  EncHandle  SAS Level
Comment 2 Dan Kotowski 2023-03-10 16:08:35 UTC
# zpool import
   pool: tank
     id: 4890533244228042504
  state: ONLINE
status: Some supported features are not enabled on the pool.
        (Note that they may be intentionally disabled if the
        'compatibility' property is set.)
 action: The pool can be imported using its name or numeric identifier, though
        some features will not be available without an explicit 'zpool upgrade'.
 config:

        tank                            ONLINE
          raidz1-0                      ONLINE
            diskid/DISK-S6EPNE0T600261  ONLINE
            diskid/DISK-S6EPNE0T600275  ONLINE
            diskid/DISK-S6EPNE0TA03058  ONLINE
            diskid/DISK-S6EPNE0TA03099  ONLINE
            diskid/DISK-S6EPNE0TA03107  ONLINE
# zpool import tank
panic: command not inqueue, state = 0

cpuid = 12
time = 946694892
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x13c
panic() at panic+0x44
mpr_complete_command() at mpr_complete_command+0x12c
mpr_intr_locked() at mpr_intr_locked+0x7c
mpr_intr_msi() at mpr_intr_msi+0x58
ithread_loop() at ithread_loop+0x2a0
fork_exit() at fork_exit+0x74
fork_trampoline() at fork_trampoline+0x14
KDB: enter: panic
[ thread pid 12 tid 100154 ]
Stopped at      kdb_enter+0x44: undefined       f900027f
db> reset
Uptime: 16m2s
mpr0: Sending StopUnit: path (xpt0:mpr0:0:0:ffffffff):  handle 11
mpr0: Incrementing SSU count
mpr0: Sending StopUnit: path (xpt0:mpr0:0:1:ffffffff):  handle 12
mpr0: Incrementing SSU count
mpr0: Sending StopUnit: path (xpt0:mpr0:0:2:ffffffff):  handle 10
mpr0: Incrementing SSU count
mpr0: Sending StopUnit: path (xpt0:mpr0:0:3:ffffffff):  handle 9
mpr0: Incrementing SSU count
mpr0: Sending StopUnit: path (xpt0:mpr0:0:4:ffffffff):  handle 13
mpr0: Incrementing SSU count
panic: command not inqueue, state = 0

cpuid = 12
time = 946694892
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x13c
panic() at panic+0x44
mpr_complete_command() at mpr_complete_command+0x12c
mpr_intr_locked() at mpr_intr_locked+0x7c
xpt_sim_poll() at xpt_sim_poll+0x54
mprsas_ir_shutdown() at mprsas_ir_shutdown+0x458
kern_reboot() at kern_reboot+0x6c4
db_reset() at db_reset+0xd0
db_command() at db_command+0x2d8
db_command_loop() at db_command_loop+0x54
db_trap() at db_trap+0xf8
kdb_trap() at kdb_trap+0x28c
handle_el1h_sync() at handle_el1h_sync+0x10
--- exception, esr 0
(null)() at 0
KDB: enter: panic
[ thread pid 12 tid 100154 ]
Stopped at      kdb_enter+0x44: undefined       ff900027f900027f
db> reboot
Uptime: 16m2s
mpr0: Sending StopUnit: path (xpt0:mpr0:0:0:ffffffff):  handle 11
mpr0: Incrementing SSU count
mpr0: Sending StopUnit: path (xpt0:mpr0:0:1:ffffffff):  handle 12
mpr0: Incrementing SSU count
mpr0: Sending StopUnit: path (xpt0:mpr0:0:2:ffffffff):  handle 10
mpr0: Incrementing SSU count
mpr0: Sending StopUnit: path (xpt0:mpr0:0:3:ffffffff):  handle 9
mpr0: Incrementing SSU count
mpr0: Sending StopUnit: path (xpt0:mpr0:0:4:ffffffff):  handle 13
mpr0: Incrementing SSU count
mpr0: Decrementing SSU count.
mpr0: Decrementing SSU count.
mpr0: Decrementing SSU count.
mpr0: Decrementing SSU count.
mpr0: Completing stop unit for (xpt0:mpr0:0:3:ffffffff): 
mpr0: Completing stop unit for (xpt0:mpr0:0:1:ffffffff): 
mpr0: Completing stop unit for (xpt0:mpr0:0:4:ffffffff): 
mpr0: Completing stop unit for (xpt0:mpr0:0:2:ffffffff): 
mpr0: Decrementing SSU count.
mpr0: Completing
Comment 3 Dan Kotowski 2023-03-10 16:09:38 UTC
# uname -mv
FreeBSD 14.0-CURRENT main-n261327-c237c10a2346 GENERIC arm64
Comment 4 Dan Kotowski 2023-03-10 16:35:46 UTC
# mprutil show adapters
mpr0: mpr_user_pass_thru: user reply buffer (64) smaller than returned buffer (68)
Device Name           Chip Name        Board Name        Firmware
/dev/mpr0             LSISAS3008       SAS9311-8i        10000a00

# mprutil show adapter
mpr0 Adapter:
mpr0: mpr_user_pass_thru: user reply buffer (64) smaller than returned buffer (68)
       Board Name: SAS9311-8i
   Board Assembly: H3-25461-02H
        Chip Name: LSISAS3008
    Chip Revision: ALL
    BIOS Revision: 8.37.00.00
Firmware Revision: 16.00.10.00
  Integrated RAID: no
         SATA NCQ: ENABLED
 PCIe Width/Speed: x8 (8.0 GB/sec)
        IOC Speed: Full
      Temperature: 79 C

PhyNum  CtlrHandle  DevHandle  Disabled  Speed   Min    Max    Device
0       0001        0009       N         6.0     3.0    12     SAS Initiator 
1       0002        000a       N         6.0     3.0    12     SAS Initiator 
2       0003        000b       N         6.0     3.0    12     SAS Initiator 
3       0004        000c       N         6.0     3.0    12     SAS Initiator 
4                              N                 3.0    12     SAS Initiator 
5                              N                 3.0    12     SAS Initiator 
6       0005        000d       N         6.0     3.0    12     SAS Initiator 
7                              N                 3.0    12     SAS Initiator 

# mprutil show iocfacts
mpr0: mpr_user_pass_thru: user reply buffer (64) smaller than returned buffer (68)
 s5
           MsgLength: 17
            Function: 0x3
       HeaderVersion: 50,00
           IOCNumber: 0
            MsgFlags: 0x0
               VP_ID: 0
               VF_ID: 0
       IOCExceptions: 0
           IOCStatus: 0
          IOCLogInfo: 0x0
       MaxChainDepth: 128
             WhoInit: 0x4
       NumberOfPorts: 1
      MaxMSIxVectors: 96
       RequestCredit: 9856
           ProductID: 0x2221
     IOCCapabilities: 0x7a85c <ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,MSIXIndex,HostDisc,FastPath,RDPQArray>
           FWVersion: 16.00.10.00
 IOCRequestFrameSize: 32
 MaxChainSegmentSize: 8
       MaxInitiators: 32
          MaxTargets: 1024
     MaxSasExpanders: 42
       MaxEnclosures: 43
       ProtocolFlags: 0x3 <ScsiTarget,ScsiInitiator>
  HighPriorityCredit: 104
MaxRepDescPostQDepth: 65504
      ReplyFrameSize: 32
          MaxVolumes: 0
        MaxDevHandle: 1106
MaxPersistentEntries: 128
        MinDevHandle: 9
 CurrentHostPageSize: 0
Comment 5 Dan Kotowski 2023-03-10 16:40:19 UTC
# pciconf -l -BbcevV mpr0
mpr0@pci4:1:0:0:        class=0x010700 rev=0x02 hdr=0x00 vendor=0x1000 device=0x0097 subvendor=0x1000 subdevice=0x30e0
    vendor     = 'Broadcom / LSI'
    device     = 'SAS3008 PCI-Express Fusion-MPT SAS-3'
    class      = mass storage
    subclass   = SAS
    bar   [10] = type I/O Port, range 32, base r, size 256, disabled
    bar   [14] = type Memory, range 64, base rx40040000, size 65536, enabled
    bar   [1c] = type Memory, range 64, base rx40000000, size 262144, enabled
    cap 01[50] = powerspec 3  supports D0 D1 D2 D3  current D0
    cap 10[68] = PCI-Express 2 endpoint max data 128(4096) FLR RO NS
                 max read 512
                 link x8(x8) speed 8.0(8.0)
    cap 05[a8] = MSI supports 1 message, 64 bit, vector masks 
    cap 11[c0] = MSI-X supports 96 messages, enabled
                 Table in map 0x14[0xe000], PBA in map 0x14[0xf000]
    ecap 0001[100] = AER 2 0 fatal 0 non-fatal 0 corrected
    ecap 0019[1e0] = PCIe Sec 1 lane errors 0
    ecap 0004[1c0] = Power Budgeting 1
    ecap 0016[190] = DPA 1
    ecap 000e[148] = ARI 1
Comment 6 Dan Kotowski 2023-03-11 18:17:25 UTC
Seems to exist for mps as well

# zpool import tank
panic: command not inqueue, state = 0

cpuid = 12
time = 946689084
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x13c
panic() at panic+0x44
mps_complete_command() at mps_complete_command+0x130
mps_intr_locked() at mps_intr_locked+0xc4
mps_intr_msi() at mps_intr_msi+0x58
ithread_loop() at ithread_loop+0x2a0
fork_exit() at fork_exit+0x74
fork_trampoline() at fork_trampoline+0x14
KDB: enter: panic
[ thread pid 12 tid 100154 ]
Stopped at      kdb_enter+0x44: undefined       f900027f
Comment 7 Dan Kotowski 2023-03-12 13:31:22 UTC
I have since brought the 9300-8i controller up to 16.00.12.00 but the problem persists.

https://www.truenas.com/community/resources/lsi-9300-xx-firmware-update.145/

Interestingly, I can reproduce the issue with the 9300-8i on an amd64 system also running CURRENT, but the SAS9211-8i on 20.00.04.00 only fails on arm64 (works on amd64).
Comment 8 Dan Kotowski 2023-03-14 16:17:44 UTC
The following seems to get emitted to messages upon almost any command:

kernel: [1135] mpr0: mpr_user_pass_thru: user reply buffer (64) smaller than returned buffer (68)
Comment 9 Warner Losh freebsd_committer freebsd_triage 2023-03-14 16:41:31 UTC
(In reply to Dan Kotowski from comment #8)

You can 100% ignore that message.
It's saying there's more information that could be copied to userland based on the iocfacts, so userland should have given it a bigger buffer.

However, nothing in userland that we have, or that I'm aware of, uses those extra 4 bytes to make decisions or report anything.

The message is eliminated in future versions of FreeBSD.
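
For reference, the 64 versus 68 in that message lines up with the `MsgLength: 17` shown in the `mprutil show iocfacts` output in comment 4, because MsgLength counts 32-bit words: 17 x 4 = 68 bytes. The following is a minimal sketch of that arithmetic and of a truncating copy-out. The helper names `reply_bytes` and `copy_reply` are hypothetical and this is not the driver's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* MsgLength in the reply header counts 32-bit words, not bytes. */
size_t
reply_bytes(unsigned msg_length_words)
{
	return ((size_t)msg_length_words * 4);
}

/*
 * Hypothetical helper (not the driver's code): copy as much of the
 * reply as fits in the caller's buffer and report whether any bytes
 * had to be dropped.
 */
size_t
copy_reply(void *user_buf, size_t user_len, const void *reply,
    size_t reply_len, int *truncated)
{
	size_t n = user_len < reply_len ? user_len : reply_len;

	memcpy(user_buf, reply, n);
	*truncated = (reply_len > user_len);
	return (n);
}
```

With a 64-byte user buffer and a 17-word reply, this copies 64 bytes and flags the copy as truncated, leaving the 4 bytes mentioned above uncopied.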
Comment 10 Alan Somers freebsd_committer freebsd_triage 2023-03-14 17:14:32 UTC
(In reply to Warner Losh from comment #9)
> The message is eliminated in future versions of FreeBSD.

Actually, not yet.  It's still waiting for review. https://reviews.freebsd.org/D38739 , if you're interested.
Comment 11 Dan Kotowski 2023-03-14 18:47:34 UTC
Is it possible this is related to a PCI bus coherency issue?

The platform is a SolidRun Honeycomb LX2K, which is built around the Cortex-A72 but with a full PCIe bus. However, testing on Linux found an issue with PCI coherency.

Test: https://gist.github.com/jnettlet/80f8d09d01c0dc0ffc0122f36ed78de6

glibc patch: https://gist.github.com/jnettlet/f6f8b49bb7c731255c46f541f875f436

I checked our arm64 memcpy.S and, sure enough, we use the same store ordering that Linux used before Jon patched it out.
Comment 12 Dan Kotowski 2023-03-14 18:50:33 UTC
diff --git a/sys/arm64/arm64/memcpy.S b/sys/arm64/arm64/memcpy.S
index d5fbfa64e0fa..d65910a0a0c8 100644
--- a/sys/arm64/arm64/memcpy.S
+++ b/sys/arm64/arm64/memcpy.S
@@ -132,12 +132,12 @@ L(copy128):
        stp     G_l, G_h, [dstend, -64]
        stp     H_l, H_h, [dstend, -48]
 L(copy96):
+       stp     C_l, C_h, [dstend, -32]
+       stp     D_l, D_h, [dstend, -16]
        stp     A_l, A_h, [dstin]
        stp     B_l, B_h, [dstin, 16]
        stp     E_l, E_h, [dstin, 32]
        stp     F_l, F_h, [dstin, 48]
-       stp     C_l, C_h, [dstend, -32]
-       stp     D_l, D_h, [dstend, -16]
        ret

        .p2align 4
@@ -232,10 +232,10 @@ L(copy64_from_start):
        stp     C_l, C_h, [dstend, -48]
        ldp     C_l, C_h, [src]
        stp     D_l, D_h, [dstend, -64]
-       stp     G_l, G_h, [dstin, 48]
-       stp     A_l, A_h, [dstin, 32]
-       stp     B_l, B_h, [dstin, 16]
        stp     C_l, C_h, [dstin]
+       stp     B_l, B_h, [dstin, 16]
+       stp     A_l, A_h, [dstin, 32]
+       stp     G_l, G_h, [dstin, 48]
        ret
 EEND(memmove)
 END(memcpy)
Comment 13 Mark Millard 2023-03-14 19:21:19 UTC
(In reply to Dan Kotowski from comment #12)

Are you able to test whether the patch sidesteps the issue for you?
Comment 14 Dan Kotowski 2023-03-14 20:57:44 UTC
Tested, no it does not sidestep :(
Comment 15 Warner Losh freebsd_committer freebsd_triage 2023-03-14 21:37:51 UTC
This driver works well on amd64... If we're failing on arm64, that likely means that we've missed some busdma thing to ensure things are coherent in that environment... That's why the controller is likely finding completed items not in the queue... Or there's some race between inserting it into the queue, setting its state and reading it back out of the pending list when we're trying to complete... Or we're seeing 'stale' data for the completion records, though I suspect that's a bit less likely...

I don't have an arm64 machine that I can add mpr or mps cards to. The fact that you see it on both likely means the same systemic error was made in both places (mpr is a copy of mps that's been augmented for mpr's new features).
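
To make the "state machine" reasoning concrete, here is a simplified model of the check that produces this panic. The enum names and values are hypothetical, chosen so that 0 means "free", which would match the `state = 0` in the panic string only if the driver uses the same convention. It is a sketch of the invariant, not the mpr/mps code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical states; the driver's real names and values may differ. */
enum cmd_state { CMD_FREE = 0, CMD_BUSY, CMD_INQUEUE };

struct cmd {
	enum cmd_state state;
};

/* Submit: the command must be marked in-queue before a reply can arrive. */
void
cmd_submit(struct cmd *cm)
{
	cm->state = CMD_INQUEUE;
}

/*
 * Complete: returns false in the situation where the kernel would
 * panic with "command not inqueue, state = N".
 */
bool
cmd_complete(struct cmd *cm)
{
	if (cm->state != CMD_INQUEUE)
		return (false);
	cm->state = CMD_FREE;
	return (true);
}
```

In this model a completion fails the check if it is delivered twice for the same command, or if it names a command that was never submitted, for example because the completion record read from memory was stale.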
Comment 16 Dan Kotowski 2023-03-16 22:58:36 UTC
Is it possible that it has to do with the msleep timeouts?

I noticed that after almost every panic, passing "reset" to kdb will NOT reset the system, but mpr0 will dump a bit more to the console and then take me to another kdb prompt. E.g., in comment #2 we can see mpr0 still emitting more to the console after both a "db> reset" and a subsequent "db> reboot".

Even more telling: when I enable full debugging (0x07ff), the zpool import actually works! And I am able to interact with non-zfs drives a little bit as well. All of this, to me, points to bad timeouts somewhere.
Comment 17 Dan Kotowski 2023-04-11 13:30:08 UTC
A suggestion from Jon Nettleton at SolidRun is to disable PCI ASPM - this seems to be a known issue elsewhere?
Comment 18 Dan Kotowski 2024-04-08 17:30:20 UTC
As it turns out, my drives do not support NCQ TRIM. This was fixed for ada by review D43961 but not for da.

From base b7dce5b "scsi_da: add 4K quirks for Samsung SSD 860 and 870":
```
diff --git a/sys/cam/scsi/scsi_da.c b/sys/cam/scsi/scsi_da.c
index d578e4ccb712..9b3d706d6168 100644
--- a/sys/cam/scsi/scsi_da.c
+++ b/sys/cam/scsi/scsi_da.c
@@ -1397,6 +1397,22 @@ static struct da_quirk_entry da_quirk_table[] =
 	},
 	{
 		/*
+		 * Samsung 860 SSDs
+		 * 4k optimised & trim only works in 4k requests + 4k aligned
+		 */
+		{ T_DIRECT, SIP_MEDIA_FIXED, "ATA", "Samsung SSD 860*", "*" },
+		/*quirks*/DA_Q_4K
+	},
+	{
+		/*
+		 * Samsung 870 SSDs
+		 * 4k optimised & trim only works in 4k requests + 4k aligned
+		 */
+		{ T_DIRECT, SIP_MEDIA_FIXED, "ATA", "Samsung SSD 870*", "*" },
+		/*quirks*/DA_Q_4K
+	},
+	{
+		/*
 		 * Samsung 843T Series SSDs (MZ7WD*)
 		 * Samsung PM851 Series SSDs (MZ7TE*)
 		 * Samsung PM853T Series SSDs (MZ7GE*)
```

From base c01af41 "ata_da: add quirk to disable NCQ TRIM for Samsung 860/870 SSDs":
```
diff --git a/sys/cam/ata/ata_da.c b/sys/cam/ata/ata_da.c
index f5d3aeca9329..d4a591943307 100644
--- a/sys/cam/ata/ata_da.c
+++ b/sys/cam/ata/ata_da.c
@@ -729,6 +729,22 @@ static struct ada_quirk_entry ada_quirk_table[] =
 	},
 	{
 		/*
+		 * Samsung 860 SSDs
+		 * 4k optimised, NCQ TRIM broken (normal TRIM fine)
+		 */
+		{ T_DIRECT, SIP_MEDIA_FIXED, "*", "Samsung SSD 860*", "*" },
+		/*quirks*/ADA_Q_4K | ADA_Q_NCQ_TRIM_BROKEN
+	},
+	{
+		/*
+		 * Samsung 870 SSDs
+		 * 4k optimised, NCQ TRIM broken (normal TRIM fine)
+		 */
+		{ T_DIRECT, SIP_MEDIA_FIXED, "*", "Samsung SSD 870*", "*" },
+		/*quirks*/ADA_Q_4K | ADA_Q_NCQ_TRIM_BROKEN
+	},
+	{
+		/*
 		 * Samsung SM863 Series SSDs (MZ7KM*)
 		 * 4k optimised, NCQ believed to be working
 		 */
```

At least some of my affected drives are in the Samsung SSD 860 family.

I do not see anything like NCQ_TRIM_BROKEN in scsi_da.c, and we would probably need to implement it there.
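
As an illustration of how quirk entries like the ones in the diffs above select a drive by product string, here is a minimal user-space sketch. It uses POSIX `fnmatch(3)` as a stand-in for the kernel's own pattern matcher, and the flag values are made up. Only the pattern strings and the flag names come from the diffs.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Flag values are hypothetical; the names mirror the ata_da.c diff. */
#define Q_NONE			0x00
#define Q_4K			0x01
#define Q_NCQ_TRIM_BROKEN	0x02

struct quirk_entry {
	const char *product;	/* shell-style pattern */
	int quirks;
};

static const struct quirk_entry quirk_table[] = {
	{ "Samsung SSD 860*", Q_4K | Q_NCQ_TRIM_BROKEN },
	{ "Samsung SSD 870*", Q_4K | Q_NCQ_TRIM_BROKEN },
};

/* Return the quirks of the first matching entry, or Q_NONE. */
int
quirk_lookup(const char *product)
{
	size_t n = sizeof(quirk_table) / sizeof(quirk_table[0]);

	for (size_t i = 0; i < n; i++)
		if (fnmatch(quirk_table[i].product, product, 0) == 0)
			return (quirk_table[i].quirks);
	return (Q_NONE);
}
```

A product string such as "Samsung SSD 860 EVO 1TB" picks up both flags here, while "Samsung SSD 850 PRO" matches nothing.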
Comment 19 Warner Losh freebsd_committer freebsd_triage 2024-04-08 18:22:18 UTC
A few problems with your ncq trim theory.

(1) scsi_da doesn't implement NCQ TRIM at all. Until mpi3mr was imported, there was nothing in the tree that could create the necessary ATA command with the extra registers apart from ahci (which uses ata_da).
(2) mpr couldn't possibly send NCQ TRIMs even if da were generating them, and
(3) import is a read-intensive operation, so it is not doing trims and will do minimal writes.

As an aside: We likely should just assume the 4k quirk always. That would eliminate 90% of the quirks we have if we also stop doing READ6/WRITE6 commands entirely (they are a compat hack for SASI, and READ10 was in SCSI1, though not universally working; SCSI1 drives are not relevant today, certainly not the ultra-low capacity ones that were quirky at the time). But that's a different issue, not the main one here.

I fixed a lot of 'state machine' bugs, which this panic is, and Scott Long fixed even more before I did. Those changes should have been pushed upstream several years prior to the uname date in this bug report. Since this is on an ARM server, there may be something subtle there due to arm's weaker memory model than amd64 that's causing this. My testing of mpr on aarch64 has been light since we don't use it at $WORK and the aarch64 chassis that I have don't have slots for hard drives... So I've just done bench testing to see that I can see the disk and do some I/O, but not much beyond that. And of late it's not feasible to redo that bench testing due to changes in the amount of junk I have on my bench.

Out of curiosity: is this a zpool import from a pool that was created on another system? Or was it working fine and then this started happening after some upgrade?

mpr and mps both share a common history, including the state tracking code, so it's not super surprising that this is being hit on both.
Comment 20 Dan Kotowski 2024-04-10 08:58:24 UTC
> Since this is on an ARM server, there may be something subtle there due to arm's weaker memory model than amd64 that's causing this.

I don't know if it's related or not, but PCIe GPUs under Linux DRM can experience weird artifacting and tearing as a result of some sort of memory issue. The fix from the firmware developer was to reorder operations in glibc memcpy.

> Subject: [PATCH] Aarch64: Make memcpy more compatible with device memory
> 
> For normal non-cacheable memory ACE supports 4x128 bit r/w WRAP
> transfers or 1x128 bit r/w INCR transfers.  By re-ordering the
> stp's in memcpy / memmove we can accommodate this better without
> impacting the existing code.
> 
> This fixes an issue seen on multiple Cortex-A72 SOCs when writing
> directly to a PCIe memmapped frame-buffer, which resulted in
> corruption.
https://gist.github.com/jnettlet/f6f8b49bb7c731255c46f541f875f436

Test for framebuffer memcpy bugs:
https://gist.github.com/jnettlet/80f8d09d01c0dc0ffc0122f36ed78de6

Unfortunately I lack the knowledge to build a test util for FreeBSD, but I do wonder if the coherency issue on the bus is impacting my case as well?
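
For what it's worth, the core of such a test util can be fairly small. The sketch below is not derived from the gists above. It copies a known pattern at every offset and length in a small range and counts mismatches on read-back. The function name `memcpy_check` and the ranges are arbitrary choices. On ordinary RAM it should report zero; to exercise the suspected path, `dst` would have to point at mapped device memory, which is not shown and would need platform-specific setup. It also exercises the userland memcpy, not the kernel's memcpy.S.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LEN	256
#define MAX_OFF	16

/*
 * Copy every length in [1, MAX_LEN] to every offset in [0, MAX_OFF)
 * of dst and read it back.  Returns the number of (offset, length)
 * pairs whose destination did not match the source.  dst must hold
 * at least MAX_LEN + MAX_OFF bytes.
 */
size_t
memcpy_check(unsigned char *dst)
{
	unsigned char src[MAX_LEN];
	size_t bad = 0;

	/* 7 is coprime to 256, so all 256 pattern bytes are distinct. */
	for (size_t i = 0; i < MAX_LEN; i++)
		src[i] = (unsigned char)(i * 7 + 1);

	for (size_t off = 0; off < MAX_OFF; off++) {
		for (size_t len = 1; len <= MAX_LEN; len++) {
			memset(dst, 0, MAX_LEN + MAX_OFF);
			memcpy(dst + off, src, len);
			if (memcmp(dst + off, src, len) != 0)
				bad++;
		}
	}
	return (bad);
}
```

A non-zero return against a device mapping, with zero against normal memory on the same machine, would point at the bus rather than at the copy routine's logic.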

Another user in the vendor's Discord channel recently stated that they've been experiencing issues with using 2x NVMe drives off of a SuperMicro bifurcated PCIe-to-NVMe adapter. And I've since been able to replicate the issues using a Linux mdadm mirror as well.

I have not seen panics when testing single drives, only when using pools of 2 or more.

Perhaps it only presents when there are multiple commands issued in parallel?

> is this a zpool import from a pool that was created on another system?

I have been able to reproduce with zpools from known-working systems and when trying to create new ones as well.
Comment 21 Dan Kotowski 2024-10-10 14:28:55 UTC
An interesting update from SolidRun:

https://developer.arm.com/documentation/ddi0517/f/functional-description/constraints-and-limitations-of-use/axi3-and-axi4-support

> The MMU-500 supports the AXI3 and AXI4 protocols when the sysbardisable_<tbuname> input signal is tied HIGH. In such cases, the following AXI3 features are not supported:
> 
> Write data interleaving
> 
> Write data and write address ordering must be the same, otherwise data corruption can occur.

NXP pulls this high in their PBI code. Could write-interleaving enablement be a sysctl?
Comment 22 Dan Kotowski 2024-10-29 06:54:12 UTC
Another recent note from SolidRun's chief systems architect:

> The way that the Cortex-A72 cores do gathering is an undefined behaviour in the PCIe spec, and since AXI / ACE / CHI don't know about the gathering memory property (it is part of the Arm core design) it causes inconsistencies.