Bug 277211

Summary: panic: Unhandled external data abort - handle_el1h_sync - --- exception, esr 0x96000410 - wait_fw_init - mlx5_load_one
Product: Base System Reporter: Dave Cottlehuber <dch>
Component: kernAssignee: John Baldwin <jhb>
Status: Closed FIXED    
Severity: Affects Only Me CC: glebius, jhb, tuexen
Priority: --- Keywords: crash
Version: 15.0-CURRENT   
Hardware: arm64   
OS: Any   
Attachments:
Description Flags
dmesg + panic as of 58df49801d9d
none
verbose dmesg with range_descr debugging
none
dmesg after D44132
none
fix.patch none

Description Dave Cottlehuber freebsd_committer freebsd_triage 2024-02-21 13:44:57 UTC
Created attachment 248657 [details]
dmesg + panic as of 58df49801d9d

panic from 58df49801d9d & jhb's "acpi: Defer reserving resources for ACPI devices" patch.
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2024-02-21 14:16:03 UTC
^Triage: notify jhb of possible problem with his change.
Comment 2 Michael Tuexen freebsd_committer freebsd_triage 2024-02-21 14:39:16 UTC
I also see it on my 32 core Ampere system:

CPU  0: APM eMAG 8180 r3p2 affinity:  0  0
                   Cache Type = <64 byte D-cacheline,64 byte I-cacheline,PIPT ICache,64 byte ERG,64 byte CWG>
 Instruction Set Attributes 0 = <CRC32,SHA2,SHA1,AES+PMULL>
 Instruction Set Attributes 1 = <>
 Instruction Set Attributes 2 = <>
Trying to mount root from ufs:/dev/ada0p2 [rw]...
         Processor Features 0 = <GIC,AdvSIMD,FP,EL3,EL2,EL1 32,EL0 32>
         Processor Features 1 = <>
      Memory Model Features 0 = <TGran4,TGran64,TGran16,SNSMem,BigEnd,16bit ASID,4TB PA>
      Memory Model Features 1 = <8bit VMID>
      Memory Model Features 2 = <32bit CCIDX,48bit VA>
             Debug Features 0 = <DoubleLock,2 CTX BKPTs,4 Watchpoints,6 Breakpoints,PMUv3,Debugv8>
             Debug Features 1 = <>
         Auxiliary Features 0 = <>
         Auxiliary Features 1 = <>
AArch32 Instruction Set Attributes 5 = <CRC32,SHA2,SHA1,AES+VMULL,SEVL>
AArch32 Media and VFP Features 0 = <FPRound,FPSqrt,FPDivide,DP VFPv3+v4,SP VFPv3+v4,AdvSIMD>
AArch32 Media and VFP Features 1 = <SIMDFMAC,FPHP DP Conv,SIMDHP SP Conv,SIMDSP,SIMDInt,SIMDLS,FPDNaN,FPFtZ>
CPU  1: APM eMAG 8180 r3p2 affinity:  0  1
CPU  2: APM eMAG 8180 r3p2 affinity:  1  0
CPU  3: APM eMAG 8180 r3p2 affinity:  1  1
CPU  4: APM eMAG 8180 r3p2 affinity:  2  0
CPU  5: APM eMAG 8180 r3p2 affinity:  2  1
CPU  6: APM eMAG 8180 r3p2 affinity:  3  0
CPU  7: APM eMAG 8180 r3p2 affinity:  3  1
CPU  8: APM eMAG 8180 r3p2 affinity:  4  0
CPU  9: APM eMAG 8180 r3p2 affinity:  4  1
CPU 10: APM eMAG 8180 r3p2 affinity:  5  0
CPU 11: APM eMAG 8180 r3p2 affinity:  5  1
CPU 12: APM eMAG 8180 r3p2 affinity:  6  0
CPU 13: APM eMAG 8180 r3p2 affinity:  6  1
CPU 14: APM eMAG 8180 r3p2 affinity:  7  0
CPU 15: APM eMAG 8180 r3p2 affinity:  7  1
CPU 16: APM eMAG 8180 r3p2 affinity:  8  0
CPU 17: APM eMAG 8180 r3p2 affinity:  8  1
CPU 18: APM eMAG 8180 r3p2 affinity:  9  0
CPU 19: APM eMAG 8180 r3p2 affinity:  9  1
CPU 20: APM eMAG 8180 r3p2 affinity: 10  0
CPU 21: APM eMAG 8180 r3p2 affinity: 10  1
CPU 22: APM eMAG 8180 r3p2 affinity: 11  0
CPU 23: APM eMAG 8180 r3p2 affinity: 11  1
CPU 24: APM eMAG 8180 r3p2 affinity: 12  0
CPU 25: APM eMAG 8180 r3p2 affinity: 12  1
CPU 26: APM eMAG 8180 r3p2 affinity: 13  0
CPU 27: APM eMAG 8180 r3p2 affinity: 13  1
CPU 28: APM eMAG 8180 r3p2 affinity: 14  0
CPU 29: APM eMAG 8180 r3p2 affinity: 14  1
CPU 30: APM eMAG 8180 r3p2 affinity: 15  0
CPU 31: APM eMAG 8180 r3p2 affinity: 15  1
Comment 3 Michael Tuexen freebsd_committer freebsd_triage 2024-02-22 16:31:26 UTC
A version, which is running fine is:
FreeBSD ampere32.nplab.de 15.0-CURRENT FreeBSD 15.0-CURRENT #76 main-n268036-d682c1eaa598-dirty: Sat Feb  3 23:32:31 CET 2024     root@ampere32.nplab.de:/usr/obj/usr/home/tuexen/freebsd-src/arm64.aarch64/sys/TCP arm64
Not sure which change introduced the problem. It there for about a week or so.
Comment 4 John Baldwin freebsd_committer freebsd_triage 2024-02-22 17:51:20 UTC
From what I could tell when I looked at this before for Dave, this appears to be an issue specific to linuxkpi used by the mlx5 driver.  In particular, it uses bus_translate_resource in its wrapper for pci_resource_start and I had asked Dave to boot with an additional patch to try to trace what is going on there to see if it is getting the wrong answer.  The mlx5 driver assumes that pci_resource_start gives a valid physical address it can pass to ioremap to create a memory mapping of the BAR.

The patch with the debugging trace is here: https://reviews.freebsd.org/P632
Comment 5 John Baldwin freebsd_committer freebsd_triage 2024-02-22 18:10:56 UTC
Ah, looks like the dmesg from Dave does actually include this patch as it has this line of output:

mlx5_core0: translate 0x14082000000 -> 0x24082000000

That looks correct, but unfortunately, we only display the ranges in bootverbose for FDT, not ACPI.  The patch below fixes the pcib driver to always log the ranges which would be useful to confirm the window:

diff --git a/sys/dev/pci/pci_host_generic.c b/sys/dev/pci/pci_host_generic.c
index 386b8411d29a..46b84ff3004b 100644
--- a/sys/dev/pci/pci_host_generic.c
+++ b/sys/dev/pci/pci_host_generic.c
@@ -83,6 +83,7 @@ pci_host_generic_core_attach(device_t dev)
 	uint64_t phys_base;
 	uint64_t pci_base;
 	uint64_t size;
+	const char *range_descr;
 	char buf[64];
 	int domain, error;
 	int flags, rid, tuple, type;
@@ -179,6 +180,7 @@ pci_host_generic_core_attach(device_t dev)
 		switch (FLAG_TYPE(sc->ranges[tuple].flags)) {
 		case FLAG_TYPE_PMEM:
 			sc->has_pmem = true;
+			range_descr = "prefetch";
 			flags = RF_PREFETCHABLE;
 			type = SYS_RES_MEMORY;
 			error = rman_manage_region(&sc->pmem_rman,
@@ -186,12 +188,14 @@ pci_host_generic_core_attach(device_t dev)
 			break;
 		case FLAG_TYPE_MEM:
 			flags = 0;
+			range_descr = "memory";
 			type = SYS_RES_MEMORY;
 			error = rman_manage_region(&sc->mem_rman,
 			   pci_base, pci_base + size - 1);
 			break;
 		case FLAG_TYPE_IO:
 			flags = 0;
+			range_descr = "I/O port";
 			type = SYS_RES_IOPORT;
 			error = rman_manage_region(&sc->io_rman,
 			   pci_base, pci_base + size - 1);
@@ -219,6 +223,10 @@ pci_host_generic_core_attach(device_t dev)
 			error = ENXIO;
 			goto err_rman_manage;
 		}
+		if (bootverbose)
+			device_printf(dev,
+			    "PCI addr: 0x%jx, CPU addr: 0x%jx, Size: 0x%jx, Type: %s\n",
+			    pci_base, phys_base, size, range_type);
 	}
 
 	return (0);

That said, it seems like the translation is correct given the prefetch window used for the pcib1 bridge between pcib0 and the mlx5 device:

pcib1: <PCI-PCI bridge> at device 0.0 on pci0
pcib1:   domain            0
pcib1:   secondary bus     1
pcib1:   subordinate bus   1
pcib1:   memory decode     0x30000000-0x301fffff
pcib1:   prefetched decode 0x14080000000-0x14083ffffff

And this allocation of mlx5's BAR:

        map[10]: type Prefetchable Memory, range 64, base rx14082000000, size 25, enabled
pcib1: allocated prefetch range (0x14082000000-0x14083ffffff) for rid 10 of pci0:1:0:0


It is odd for a register bar to be in a prefetch BAR.  It might be good to see a verbose dmesg from before to see how the bridge and and mlx5 BAR were configured before.
Comment 6 Dave Cottlehuber freebsd_committer freebsd_triage 2024-02-27 23:04:14 UTC
Created attachment 248803 [details]
verbose dmesg with range_descr debugging

thanks, attached dmesg after patch.
Comment 7 Dave Cottlehuber freebsd_committer freebsd_triage 2024-02-28 21:28:35 UTC
Created attachment 248817 [details]
dmesg after D44132

dmesg & panic after rebasing and applying D44132
Comment 8 John Baldwin freebsd_committer freebsd_triage 2024-03-04 20:56:43 UTC
Created attachment 248936 [details]
fix.patch

Please try this patch.  Looking at the dmesg, the address was translated incorrectly.  It matches this range which requires no translation:

pcib0: PCI addr: 0x10100000000, CPU addr: 0x10100000000, Size: 0x7f00000000, Type: memory

(PCI addr == CPU addr), but it was matching on the wrong range and translating the address as if it belonged to the first range:

pcib0: PCI addr: 0x0, CPU addr: 0x10010000000, Size: 0x10000, Type: I/O port

The code I changed in commit d79b6b8ec267 expected the end to be >= start, and the end value of 0 in the old code violated this assumption.
Comment 9 Michael Tuexen freebsd_committer freebsd_triage 2024-03-04 21:49:42 UTC
I tested that patch and can confirm that it fixes the issue on my ampere system.

Thanks for the patch!

Best regards
Michael
Comment 10 commit-hook freebsd_committer freebsd_triage 2024-03-05 04:54:15 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=332dbd3a2f08a887014a425d2532af93503588ce

commit 332dbd3a2f08a887014a425d2532af93503588ce
Author:     John Baldwin <jhb@FreeBSD.org>
AuthorDate: 2024-03-05 04:52:54 +0000
Commit:     John Baldwin <jhb@FreeBSD.org>
CommitDate: 2024-03-05 04:52:54 +0000

    pci_host_generic: Pass a valid end address in generic_pcie_translate_resource

    The generic_pcie_containing_range helper added in commit d79b6b8ec267
    assumed that the passed in (start, end) range used to locate the
    containing mapping range was a valid address range (with end >=
    start).  The previous version of
    generic_pcie_translate_resource_common only used the start address to
    locate a mapping range, so the end address of 0 did not matter, but an
    end of 0 now causes the first range to match and an incorrect
    translation for resources using a later range.

    PR:             277211
    Reported by:    dch, tuexen
    Reviewed by:    tuexen
    Fixes:          d79b6b8ec267 pci_host_generic: Don't rewrite resource start address for translation
    Differential Revision:  https://reviews.freebsd.org/D44205

 sys/dev/pci/pci_host_generic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Comment 11 Dave Cottlehuber freebsd_committer freebsd_triage 2024-03-08 06:45:21 UTC
works on my machine :-) thanks!!