Bug 260131 - RPI [CM4 /io-ref.-board] panics on pcie with NVMe connected
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: arm
Version: CURRENT
Hardware: arm64 Any
Importance: --- Affects Some People
Assignee: freebsd-arm (Nobody)
URL:
Keywords: crash
Depends on:
Blocks:
 
Reported: 2021-11-30 04:51 UTC by Klaus Küchemann
Modified: 2024-04-21 23:14 UTC

See Also:


Attachments
linux kernel syslog pcie (8.56 KB, text/plain)
2021-12-11 03:23 UTC, Klaus Küchemann

Description Klaus Küchemann 2021-11-30 04:51:19 UTC
Well, the story of "how to boot off of NVMe on the RPI CM4 I/O board" is far too long for this place, but I booted successfully from NVMe, and that depends heavily on the RPi foundation's closed-source proprietary software and on fine-tuned configs (this dependence will never end, although some devs like dreaming of it ;-).
The good news is that only the PCIe driver's magic numbers have to be adjusted to fix this bug.
The 'best' debug information I could get so far is:

--------------------------
pcib0: <BCM2838-compatible PCI-express controller> mem 0x7d500000-0x7d50930f irq 80,81 on simplebus2
pcib0: parsing FDT for ECAM0:
pcib0: 	PCI addr: 0xc0000000, CPU addr: 0x600000000, Size: 0x40000000
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: Bus is not cache-coherent
pcib0: hardware identifies as revision 0x304.
pcib0: note: reported link speed is 5.0 GT/s.
pci0: <OFW PCI bus> on pcib0
pci0: domain=0, physical bus=0
found->	vendor=0x14e4, dev=0x2711, revid=0x00
	domain=0, bus=0, slot=0, func=0
	class=06-04-00, hdrtype=0x01, mfdev=0
	cmdreg=0x0000, statreg=0x0010, cachelnsz=0 (dwords)
	lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)
	intpin=a, irq=0
	powerspec 3  supports D0 D3  current D0
	secbus=0, subbus=0
pcib1: <PCI-PCI bridge> irq 91 at device 0.0 on pci0
pcib1: Lazy allocation of 1 bus at 1
pcib0: rman_reserve_resource: start=0xc0000000, end=0xc00fffff, count=0x100000
pcib0: Failed to translate resource 0-fffff type 3 for pcib1
pcib1: failed to allocate initial prefetch window: 0-0xfffff
pcib1:   domain            0
pcib1:   secondary bus     1
pcib1:   subordinate bus   1
pcib1:   memory decode     0xc0000000-0xc00fffff
pci1: <OFW PCI bus> on pcib1
pcib1: allocated bus range (1-1) for rid 0 of pci1
pci1: domain=0, physical bus=1
  x0: ffffa00000f23368
  x1:                8
  x2: ffff000000846817 (cam_status_table + d8cf)
  x3:              14a
  x4: ffffa001ffd95e00
  x5:              13c
  x6: ffff0000007ece68 (bcm_pcib_read_config + 0)
  x7:                0
  x8: ffff000000dd01f8 (thread0_st + 158)
  x9: ffff000000aded70 (lock_class_mtx_sleep + 0)
 x10:                1
 x11: ffff000000e819c0 (w_locklistdata + 43f78)
 x12:                1
 x13: ffff000000e819f4 (w_locklistdata + 43fac)
 x14:            10000
 x15:                1
 x16:                8
 x17: ffff00000103923c (initstack + 323c)
 x18: ffff000000f11a80 (pcpu0 + 0)
 x19: ffffa00000f23380
 x20:                0
 x21: ffff000000e819c0 (w_locklistdata + 43f78)
 x22: ffffa00000f23000
 x23:                0
 x24:                0
 x25:                1
 x26:             dead
 x27: ffffa00000e6d350
 x28: ffff000040466cd8 (ucom_cons_softc + 3f156718)
 x29: ffff000001039380 (initstack + 3380)
  sp: ffff000000b37160
  lr: ffff00000044d3e0 (__mtx_unlock_flags + 58)
 elr: ffff0000004dfb48 (witness_unlock + f8)
spsr:         600000c5
 far:                0
 esr:         bf000002
panic: Unhandled System Error
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x174
panic() at panic+0x44
do_serror() at do_serror+0x40
handle_serror() at handle_serror+0x94
--- system error, esr 0xbf000002
witness_unlock() at witness_unlock+0xf8
__mtx_unlock_flags() at __mtx_unlock_flags+0x54
bcm_pcib_read_config() at bcm_pcib_read_config+0x160
pci_read_device() at pci_read_device+0x84
pci_add_children() at pci_add_children+0x44
pci_attach() at pci_attach+0xe0
device_attach() at device_attach+0x400
device_probe_and_attach() at device_probe_and_attach+0x7c
bus_generic_attach() at bus_generic_attach+0x18
device_attach() at device_attach+0x400
device_probe_and_attach() at device_probe_and_attach+0x7c
bus_generic_attach() at bus_generic_attach+0x18
pci_attach() at pci_attach+0xe8
device_attach() at device_attach+0x400
device_probe_and_attach() at device_probe_and_attach+0x7c
bus_generic_attach() at bus_generic_attach+0x18
bcm_pcib_attach() at bcm_pcib_attach+0x87c
device_attach() at device_attach+0x400
device_probe_and_attach() at device_probe_and_attach+0x7c
bus_generic_new_pass() at bus_generic_new_pass+0xfc
bus_generic_new_pass() at bus_generic_new_pass+0xac
bus_generic_new_pass() at bus_generic_new_pass+0xac
bus_generic_new_pass() at bus_generic_new_pass+0xac
bus_set_pass() at bus_set_pass+0x4c
mi_startup() at mi_startup+0x12c
virtdone() at virtdone+0x78
KDB: enter: panic
[ thread pid 0 tid 100000 ]
Stopped at      kdb_enter+0x44: undefined       f901811f
db> bt
Tracing pid 0 tid 100000 td 0xffff000000dd00a0
db_trace_self() at db_trace_self
db_stack_trace() at db_stack_trace+0x11c
db_command() at db_command+0x368
db_command_loop() at db_command_loop+0x54
db_trap() at db_trap+0xf8
kdb_trap() at kdb_trap+0x1cc
handle_el1h_sync() at handle_el1h_sync+0x78
--- exception, esr 0xf2000000
kdb_enter() at kdb_enter+0x44
vpanic() at vpanic+0x1b0
panic() at panic+0x44
do_serror() at do_serror+0x40
handle_serror() at handle_serror+0x94
--- system error, esr 0xbf000002
witness_unlock() at witness_unlock+0xf8
__mtx_unlock_flags() at __mtx_unlock_flags+0x54
bcm_pcib_read_config() at bcm_pcib_read_config+0x160
pci_read_device() at pci_read_device+0x84
pci_add_children() at pci_add_children+0x44
pci_attach() at pci_attach+0xe0
device_attach() at device_attach+0x400
device_probe_and_attach() at device_probe_and_attach+0x7c
bus_generic_attach() at bus_generic_attach+0x18
device_attach() at device_attach+0x400
device_probe_and_attach() at device_probe_and_attach+0x7c
bus_generic_attach() at bus_generic_attach+0x18
pci_attach() at pci_attach+0xe8
device_attach() at device_attach+0x400
device_probe_and_attach() at device_probe_and_attach+0x7c
bus_generic_attach() at bus_generic_attach+0x18
bcm_pcib_attach() at bcm_pcib_attach+0x87c
device_attach() at device_attach+0x400
device_probe_and_attach() at device_probe_and_attach+0x7c
bus_generic_new_pass() at bus_generic_new_pass+0xfc
bus_generic_new_pass() at bus_generic_new_pass+0xac
bus_generic_new_pass() at bus_generic_new_pass+0xac
bus_generic_new_pass() at bus_generic_new_pass+0xac
bus_set_pass() at bus_set_pass+0x4c
mi_startup() at mi_startup+0x12c
virtdone() at virtdone+0x78
------------

This is with the nvme module loaded.
Excluding Rob's PCIe driver from /...files.arm64 fixes the boot panic, so that booting from eMMC/uSD works; the panic can also be avoided by adding devmatch_enable="NO" to /etc/rc.conf.
AFAIK the VL805 should no longer be involved in PCIe (except for PCIe USB cards), since the I/O board has its own USB controller.
Since bcm2838_xhci.c contains code to not try to load the VL805 firmware if the chip is not soldered to PCIe, I didn't yet try to exclude bcm2838_xhci.c from the kernel.
Tests were made on a CM4108032/CM4IO board, which I received last week.
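
For readers decoding the register dumps above, the esr values can be split into their architectural fields. The following is a minimal sketch, not FreeBSD kernel code: the macro and function names are made up for illustration, while the field positions and the two exception class values follow the Armv8-A ESR_ELx layout.

```c
#include <stdint.h>

/* ESR_ELx field layout (Armv8-A). Names below are illustrative only. */
#define ESR_EC_SHIFT   26            /* exception class: bits [31:26] */
#define ESR_EC_MASK    0x3fu
#define ESR_ISS_MASK   0x01ffffffu   /* instruction-specific syndrome: bits [24:0] */

#define ESR_EC_SERROR  0x2fu         /* SError interrupt */
#define ESR_EC_BRK64   0x3cu         /* BRK instruction in AArch64 state */

/* Extract the exception class from an ESR value. */
uint32_t esr_ec(uint32_t esr)
{
    return (esr >> ESR_EC_SHIFT) & ESR_EC_MASK;
}

/* Extract the instruction-specific syndrome from an ESR value. */
uint32_t esr_iss(uint32_t esr)
{
    return esr & ESR_ISS_MASK;
}

/*
 * For an SError, ISS bit 24 (IDS) set means the remaining syndrome
 * bits are implementation defined rather than architecturally encoded.
 */
int esr_serror_ids(uint32_t esr)
{
    return (int)((esr >> 24) & 1u);
}
```

Applied to the dump, 0xbf000002 decodes to exception class 0x2f (SError interrupt) with an implementation-defined syndrome, and 0xf2000000 decodes to class 0x3c (the BRK used to enter the debugger). Since an SError is asynchronous, the elr value does not necessarily point at the access that triggered it.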
Comment 1 Klaus Küchemann 2021-11-30 05:09:15 UTC
I forgot to say:
In one test I also reverted to the first version of bcm2838_pci.c, which didn't contain the DMA constraints that were added later because of a hardware bug on the RPi4B, a bug which should no longer exist on the CM4 (as I read somewhere).
It didn't fix the issue.
Comment 2 Klaus Küchemann 2021-11-30 06:07:12 UTC
Well, it seems to have nothing to do with the dtb/dts, since I didn't find any abnormalities in the ranges there.
So it seems that it again comes down to single-stepping through the Linux kernel to get the values out of the closed-source EEPROM and firmware.
If you RPi4 driver experts do not own a CM4 or do not have time:
I would be willing to do that (JTAG is set up), but
I would need detailed instructions (which would cost your time), since I'm not one of the experts in that area.
O.K., enough CM4 for now :-)
Comment 3 Klaus Küchemann 2021-12-02 04:49:43 UTC
https://github.com/raspberrypi/linux/issues/4117

https://forums.raspberrypi.com/viewtopic.php?t=302402
<<...some funny mapping of 32bit address ranges with addresses over 4GB (which is where the PCIe window is by default).>>
Comment 4 Efe 2021-12-04 20:07:13 UTC
I have been trying to fix this for a few weeks already, but with no luck so far. I have to admit that you tried way more than I did :-)

Nevertheless, I can confirm the aforementioned kernel panics trying to boot from eMMC/mSD. 

If you make any progress or have any suggestions, I am at your disposal for testing. Maybe it's worth mentioning that I own various PCIe cards including a PCIe USB 3.0 card with VL805 chip.
Comment 5 Klaus Küchemann 2021-12-05 02:24:12 UTC
(In reply to Efe from comment #4)
<<I own various PCIe cards including a PCIe USB 3.0 card with VL805 chip>>
-
That's interesting for future tests,
but AFAIK you would need to flash the PCIe card's VL805 ROM, with e.g. a tool called 'flashrom',
with the RPi VL805 firmware, which you can grab
from their firmware git. .....

But step by step (unfortunately there are many steps, too many for this bugzilla),
so only the most important for now:
the current bcm2838_pci.c "thinks" that we are on the SoC of an RPi4B,
but we aren't:
--
RPi4, SoC version BCM2711C0 (this is the CM4!), 8GB RAM:
[0x0 0x200000000] phys/CPU address space
[0x200000000 0x400000000] PCIe bus address space
[0xc0000000 0x100000000] legacy peripheral address space

--
RPi4, SoC version BCM2711C0 (CM4) with 4GB(!) RAM:
[0x0 0x100000000] phys/CPU address space
[0x100000000 0x200000000] PCIe bus address space
[0xc0000000 0x100000000] legacy peripheral address space
--
RPi4, SoC version BCM2711B08 (this is the RPi4B!), 8GB RAM (bus can only access the lower 3GB RAM):
[0x0 0x200000000] phys/CPU address space
[0x00000000 0xC0000000] PCIe bus address space
[0xc0000000 0x100000000] legacy peripheral address space
--

You will find the ...0xC0000000] PCIe bus address space of the RPi4B in the first hardcoded version of bcm2838_pci.c, which was later changed to lower the space even further, to 1GB.

On the "Tux side of life" they solved this problem with a "dev_phys_to_bus() prefetch" patch series to avoid hardcoded values... AFAIK.

My current estimate is that every other attempt, from EEPROM to start4.elf/fixup4.dat to DTB/DTS to config.txt to u-boot(!), will fail until we know how to handle the DMA window.
(I didn't test manipulating the DTS.)

So it seems that we're on our own to understand the code extensively and find a solution.
Or another realistic attempt: perhaps you can book a flight from Berlin to London with your CM4 and ask Dr. Crow.... for a tea time at his house :-)
Perhaps it will only be a one-liner hack for him, but when I look into the Tux code I don't think so...
By the way: mine is the CM4108032 (8GB RAM), and given the bus space values it would be interesting to know whether yours is e.g. one with 4GB of RAM.

Regards
K.
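
The constraint described above, that on the RPi4B the bus can only reach the lower 3GB of RAM (later lowered to 1GB in the driver), boils down to a bounds check on every DMA buffer. A minimal sketch of such a check follows. `dma_reachable` and `GiB` are hypothetical names for illustration, not taken from bcm2838_pci.c, and the sketch assumes the reachable window starts at physical address 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert a count of gibibytes to bytes. */
#define GiB(n) ((uint64_t)(n) << 30)

/*
 * Hypothetical helper: return true if the buffer [paddr, paddr + len)
 * lies entirely below `limit`, i.e. inside a DMA window that is assumed
 * to cover physical addresses 0 .. limit - 1.
 */
bool dma_reachable(uint64_t paddr, uint64_t len, uint64_t limit)
{
    if (len == 0 || paddr >= limit)
        return false;
    /* Written as a subtraction so that paddr + len cannot overflow. */
    return len <= limit - paddr;
}
```

With a 3GB limit, a page at 0x80000000 passes the check; with the 1GB limit the same page fails and would have to be bounced. If the CM4 really has no such restriction, a hardcoded limit of this kind would be the wrong constraint for it, which is the suspicion voiced in this comment.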
Comment 6 Efe 2021-12-05 18:46:53 UTC
(In reply to Klaus Küchemann from comment #5)

Indeed, testing the VL805 card would be out of scope for now. We should rather focus on the essentials and figure out how to handle the DMA window, which seems to be trickier than I expected :-)

<<I didn't yet try to exclude bcm2838_xhci.c from kernel.>>

I actually tried it, but to no avail. Same kernel panics.

<<Or another realistic attempt : perhaps you can book a flight from Berlin to London with your CM4 and ask Dr. Crow.... for a tea time in his house :-) >>

Actually, I had a trip planned to London for next week. However, due to some unforeseen circumstances I had to postpone it – at least for now. Would have been an interesting adventure though :-D

<<on the "Tux side of live" they solved this problem by a "dev_phys_to_bus() prefetch"  patch series to prevent using hardcoded values... afaik. >>

I have also come across the mailing list. Looks like they have put a lot of effort in getting it to work. Felt a little bit alienated with some parts (I am no expert), but it was an interesting read to see what they have done. 

To your question about which device I own – it's the CM4004000 (4GB RAM, no eMMC, no WIFI & BT) + CM4IO Board. Everything I tried so far was done on that device.

BR,
Efe
Comment 7 Efe 2021-12-10 23:27:39 UTC
I have been going through the implementation over the past week and tried out a few things, but with no luck so far. Modifying DTBs didn't help much either – maybe you will have better luck than I do :-)

Have you by any chance had a look at the OpenBSD implementation? There have been recent pushes regarding the PCIe implementation. It was confirmed in the mailing lists that PCIe on CM4 works in -current (I haven't tried it out though). 

I am wondering if trying to port it would be a wise option as I started running out of ideas :D
Comment 8 Klaus Küchemann 2021-12-11 00:07:09 UTC
(In reply to Efe from comment #7)
<<...Modifying DTBs...>>
Well, there are changes in bcm2711.dtsi; I think these are the hardware changes we have to respect in bcm2838_pci.c (the old constraints are invalid for the CM4).

<<Have you by any chance had a look at the OpenBSD implementation?>>
OpenBSD's PCIe driver was made for rpi4-uefi.dev, which isn't supported by FreeBSD.
I don't know whether OpenBSD meanwhile supports FDT on the RPi4 (NetBSD does, but I have no clue whether they support PCIe via FDT).

I can't say how long it will take until we have a patch available for this issue..
Comment 9 Klaus Küchemann 2021-12-11 03:23:15 UTC
Created attachment 230036
linux kernel syslog pcie

Well, I grabbed the Linux syslog (tux_pcie_syslog.txt) from this same device
for Rob (no clue if he's reading here);
perhaps he or another magic number specialist can give a first "translation" of what exactly happens in comparison to the fbsd backtrace ...
Comment 10 Klaus Küchemann 2021-12-11 03:57:15 UTC
@Efe: reading /drivers/pci/controller/pcie-brcmstb.c from Linux, I have an idea why it fails on my 8GB board with FreeBSD..
but I don't know why it also fails on yours with 4GB of RAM..
I don't know if we could get any useful info from the following, but:
perhaps you could try the first version of Rob's driver, which didn't have the 1GB window in it, and print a dmesg (with boot -v from the loader prompt) and a backtrace
if it panics.
... although I don't remember exactly what I did with config.txt/u-boot/dtb etc. to get the dmesg (I had forced the panic by hotplugging the NVMe to PCIe while the driver was already loaded)...
If you don't want to lose time working on this issue because nobody knows if it helps, we should wait for Rob to enlighten us :-)  ...
Comment 11 Robert Crowston 2021-12-11 17:33:51 UTC
The stack trace indicates two separate errors to me.

I don’t have the code in front of me, so this is just some guesswork. 

1/ The prefetch window allocation fails — to my recollection the Pi4 didn’t have prefetch, so perhaps we never took this path before, or perhaps it’s not even a problem that the allocation fails. There is a flag or option somewhere to turn it off, I think.

2/ Witness is exploding when we do a mutex unlock inside the config_read. Perhaps somehow the mutex isn’t initialized yet? Does it still explode if you disable witness? What if you comment out the mutex lock/unlock? (The mutex is required because we cannot retrieve the config from the pci controller in a single operation, and if multiple threads enter this function at once, the behaviour will be undefined. However this should be a relatively rare race condition and not encountered during the single threaded boot.)
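
The reason given above for the mutex, that a config read takes more than one operation on the controller, can be illustrated with a toy model. This is a userspace sketch under stated assumptions, not the driver: `struct fake_pcib`, `fake_read_config` and the array standing in for hardware are all hypothetical, and a pthread mutex stands in for the kernel's mtx.

```c
#include <pthread.h>
#include <stdint.h>

#define FAKE_TARGETS 4    /* number of simulated devices */
#define FAKE_REGS    64   /* config dwords per simulated device */

/* Toy model of a controller with an index register and a data window. */
struct fake_pcib {
    pthread_mutex_t config_mtx;                   /* serializes the two-step access */
    uint32_t index_reg;                           /* selects which device the window shows */
    uint32_t cfg_space[FAKE_TARGETS][FAKE_REGS];  /* stand-in for the hardware */
};

/* Read one config dword: select the target, then read through the window. */
uint32_t fake_read_config(struct fake_pcib *sc, uint32_t target, uint32_t reg)
{
    uint32_t data;

    if (target >= FAKE_TARGETS || reg >= FAKE_REGS)
        return ~0u;                               /* same "all ones" convention as the driver */

    pthread_mutex_lock(&sc->config_mtx);
    sc->index_reg = target;                       /* step 1: program the index */
    data = sc->cfg_space[sc->index_reg][reg];     /* step 2: read the data window */
    pthread_mutex_unlock(&sc->config_mtx);

    return data;
}
```

Without the lock, a second caller could overwrite the index between the two steps and the first caller would read another device's register. During single-threaded boot that race cannot occur, which is why commenting the lock out is a reasonable experiment here.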
Comment 12 Klaus Küchemann 2021-12-12 05:32:29 UTC
(In reply to Robert Crowston from comment #11)

Hi Rob, 'hope you're fine...
------------------------------------------------------------------------
--- a/sys/arm/broadcom/bcm2835/bcm2838_pci.c
+++ b/sys/arm/broadcom/bcm2835/bcm2838_pci.c
@@ -327,7 +327,7 @@ bcm_pcib_read_config(device_t dev, u_int bus, u_int slot, u_int func, u_int reg,
        if (!bcm_pcib_is_valid_quad(sc, bus, slot, func, reg))
                return (~0U);
 
-       mtx_lock(&sc->config_mtx);
+       //mtx_lock(&sc->config_mtx);
        offset = bcm_get_offset_and_prepare_config(sc, bus, slot, func, reg);
 
        t = sc->base.base.bst;
@@ -348,7 +348,7 @@ bcm_pcib_read_config(device_t dev, u_int bus, u_int slot, u_int func, u_int reg,
                break;
        }
 
-       mtx_unlock(&sc->config_mtx);
+       //mtx_unlock(&sc->config_mtx);
        return (data);
 }
 
@@ -365,7 +365,7 @@ bcm_pcib_write_config(device_t dev, u_int bus, u_int slot,
        if (!bcm_pcib_is_valid_quad(sc, bus, slot, func, reg))
                return;
 
-       mtx_lock(&sc->config_mtx);
+       //mtx_lock(&sc->config_mtx);
        offset = bcm_get_offset_and_prepare_config(sc, bus, slot, func, reg);
 
        t = sc->base.base.bst;
@@ -385,7 +385,7 @@ bcm_pcib_write_config(device_t dev, u_int bus, u_int slot,
                break;
        }
 
-       mtx_unlock(&sc->config_mtx);
+       //mtx_unlock(&sc->config_mtx);
 }
 
 static void
@@ -440,7 +440,7 @@ bcm_pcib_alloc_msi(device_t dev, device_t child, int count, int maxcount,
        int first_int, i;
 
        sc = device_get_softc(dev);
-       mtx_lock(&sc->msi_mtx);
+       //mtx_lock(&sc->msi_mtx);
 
        /* Find a continguous region of free message-signalled interrupts. */
        for (first_int = 0; first_int + count < NUM_MSI; ) {
@@ -454,7 +454,7 @@ bcm_pcib_alloc_msi(device_t dev, device_t child, int count, int maxcount,
        }
 
        /* No appropriate region available. */
-       mtx_unlock(&sc->msi_mtx);
+       //mtx_unlock(&sc->msi_mtx);
        device_printf(dev, "warning: failed to allocate %d MSI messages.\n",
            count);
        return (ENXIO);
@@ -496,14 +496,14 @@ bcm_pcib_release_msi(device_t dev, device_t child, int count,
        int i;
 
        sc = device_get_softc(dev);
-       mtx_lock(&sc->msi_mtx);
+       //mtx_lock(&sc->msi_mtx);
 
        for (i = 0; i < count; i++) {
                msi_isrc = (struct bcm_pcib_irqsrc *) isrc[i];
                msi_isrc->allocated = false;
        }
 
-       mtx_unlock(&sc->msi_mtx);
+       //mtx_unlock(&sc->msi_mtx);
        return (0);
 }
-------------------------------------------------------------------------
 
PXE boot, PCIe/NVMe plugged in, cold boot:

FreeBSD  14.0-CURRENT FreeBSD 14.0-CURRENT #4 main-n249506-60e5f699211a: Sun Dec 12 05:25:06 CET 2021     root@xxx.de:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC-NODEBUG  arm64
...

pcib0: <BCM2838-compatible PCI-express controller> mem 0x7d500000-0x7d50930f irq 70,71 on simplebus2
pcib0: parsing FDT for ECAM0:
pcib0: 	PCI addr: 0xf8000000, CPU addr: 0x600000000, Size: 0x4000000
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: 	PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: Bus is not cache-coherent
pcib0: hardware identifies as revision 0x304.
pcib0: note: reported link speed is 5.0 GT/s.
pci0: <OFW PCI bus> on pcib0
pci0: domain=0, physical bus=0
found->	vendor=0x14e4, dev=0x2711, revid=0x00
	domain=0, bus=0, slot=0, func=0
	class=06-04-00, hdrtype=0x01, mfdev=0
	cmdreg=0x0000, statreg=0x0010, cachelnsz=0 (dwords)
	lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)
	intpin=a, irq=0
	powerspec 3  supports D0 D3  current D0
	secbus=1, subbus=1
pcib1: <PCI-PCI bridge> irq 81 at device 0.0 on pci0
pcib0: rman_reserve_resource: start=0xf8000000, end=0xf80fffff, count=0x100000
pcib1:   domain            0
pcib1:   secondary bus     1
pcib1:   subordinate bus   1
pcib1:   memory decode     0xf8000000-0xf80fffff
pci1: <PCI bus> on pcib1
pcib1: allocated bus range (1-1) for rid 0 of pci1
pci1: domain=0, physical bus=1
  x0:             dead
  x1: ffffa00000de44c8
  x2:             8000
  x3:           100000
  x4:                0
  x5:                2
  x6: ffff0000008963d4 (bcm_pcib_read_config + 0)
  x7:                0
  x8: ffffa00000de4000
  x9:               99
 x10: ffff000000b0fe58 (pcib_read_config_desc + 0)
 x11:                0
 x12:                0
 x13: ff0100000000011f
 x14:                0
 x15:                a
 x16: ffff00000100932d (initstack + 332d)
 x17:                0
 x18: ffff000000ee2080 (pcpu0 + 0)
 x19:                0
 x20:                0
 x21:                1
 x22:                0
 x23: ffffa0001a83b800
 x24: ffffa0001a83b700
 x25:             dead
 x26: ffff000000b0fe58 (pcib_read_config_desc + 0)
 x27: ffffa0001a82a010
 x28: ffff000000c6f990 (__set_sysinit_set_sym_M_NICVF_init_sys_init + 0)
 x29: ffff000001009530 (initstack + 3530)
  sp: ffff000000ba8138
  lr: ffff000000232238 (pci_read_device + 88)
 elr: ffff000000232270 (pci_read_device + c0)
spsr:               c5
 far:                0
 esr:         bf000002
panic: Unhandled System Error
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x174
panic() at panic+0x44
do_serror() at do_serror+0x40
handle_serror() at handle_serror+0x94
--- system error, esr 0xbf000002
pci_read_device() at pci_read_device+0xc0
pci_add_children() at pci_add_children+0x44
pci_attach() at pci_attach+0xd8
device_attach() at device_attach+0x3f8
bus_generic_attach() at bus_generic_attach+0x4c
device_attach() at device_attach+0x3f8
bus_generic_attach() at bus_generic_attach+0x4c
pci_attach() at pci_attach+0xe0
device_attach() at device_attach+0x3f8
bus_generic_attach() at bus_generic_attach+0x4c
device_attach() at device_attach+0x3f8
bus_generic_new_pass() at bus_generic_new_pass+0x120
bus_generic_new_pass() at bus_generic_new_pass+0xb0
bus_generic_new_pass() at bus_generic_new_pass+0xb0
bus_generic_new_pass() at bus_generic_new_pass+0xb0
root_bus_configure() at root_bus_configure+0x40
mi_startup() at mi_startup+0x21c
virtdone() at virtdone+0x78
KDB: enter: panic
[ thread pid 0 tid 100000 ]
Stopped at      kdb_enter+0x44: undefined       f901411f
----------------------
It didn't look much better before (without commenting out the mutex calls):
...

pcib0: <BCM2838-compatible PCI-express controller> mem 0x7d500000-0x7d50930f irq 70,71 on simplebus2
pcib0: hardware identifies as revision 0x304.
pci0: <OFW PCI bus> on pcib0
pcib1: <PCI-PCI bridge> irq 81 at device 0.0 on pci0
pci1: <PCI bus> on pcib1
  x0:             dead
  x1: ffff000000e41020 (thread0_st + 0)
  x2:             8000
  x3:           100000
  x4:                0
  x5:                2
  x6: ffff0000008963d4 (bcm_pcib_read_config + 0)
  x7:                0
  x8: ffffa00000f56788
  x9:                0
 x10: ffffa00000f56788
 x11:                1
 x12:                0
 x13: ff00000000ff0100
 x14:                0
 x15:                0
 x16:                0
 x17:                0
 x18: ffff000000ee2080 (pcpu0 + 0)
 x19: ffffa00000f56400
 x20:                2
 x21:             8000
 x22:                0
 x23:                0
 x24:                1
 x25: ffff000000be6000 (vop_spare1_desc + 10)
 x26: ffff000000be6000 (vop_spare1_desc + 10)
 x27: ffffa0001a82a010
 x28: ffff000000c6fd10 (__set_sysinit_set_sym_M_NICVF_init_sys_init + 0)
 x29: ffff0000010094e0 (initstack + 34e0)
  sp: ffff000000ba84b8
  lr: ffff000000896554 (bcm_pcib_read_config + 180)
 elr: ffff0000008965a0 (bcm_pcib_read_config + 1cc)
spsr:         600000c5
 far:                0
 esr:         bf000002
panic: Unhandled System Error
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x174
panic() at panic+0x44
do_serror() at do_serror+0x40
handle_serror() at handle_serror+0x94
--- system error, esr 0xbf000002
bcm_pcib_read_config() at bcm_pcib_read_config+0x1cc
pci_read_device() at pci_read_device+0x84
pci_add_children() at pci_add_children+0x44
pci_attach() at pci_attach+0xd8
device_attach() at device_attach+0x3f8
bus_generic_attach() at bus_generic_attach+0x4c
device_attach() at device_attach+0x3f8
bus_generic_attach() at bus_generic_attach+0x4c
pci_attach() at pci_attach+0xe0
device_attach() at device_attach+0x3f8
bus_generic_attach() at bus_generic_attach+0x4c
device_attach() at device_attach+0x3f8
bus_generic_new_pass() at bus_generic_new_pass+0x120
bus_generic_new_pass() at bus_generic_new_pass+0xb0
bus_generic_new_pass() at bus_generic_new_pass+0xb0
bus_generic_new_pass() at bus_generic_new_pass+0xb0
root_bus_configure() at root_bus_configure+0x40
mi_startup() at mi_startup+0x21c
virtdone() at virtdone+0x78
KDB: enter: panic
[ thread pid 0 tid 100000 ]
Stopped at      kdb_enter+0x44: undefined       f901411f
-----

Well, Rob,
I have an idea stolen from Efe :-) :
(Efe from comment #7)
<<Have you by any chance had a look at the OpenBSD implementation?>>

https://github.com/openbsd/src/commits/a8a0f5312628f812dc675fdd91fd18a6ca91ae77/sys/dev/fdt/bcm2711_pcie.c

... it seems OpenBSD supports FDT now, and they have done some things with the (new CM4???) ranges...

--- the dts (not yet in the fbsd tree): ---
https://github.com/raspberrypi/linux/blob/rpi-5.10.y/arch/arm/boot/dts/bcm2711-rpi-cm4.dts
https://github.com/raspberrypi/linux/blob/rpi-5.10.y/arch/arm/boot/dts/bcm2711.dtsi
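
Since the ranges keep coming up: the "parsing FDT for ECAM0" lines in the boot logs each describe one window that maps PCI bus addresses to CPU addresses. The following is a minimal sketch of how such a window translates a resource. `struct pci_range` and `pci_to_cpu` are hypothetical names for illustration, not the FreeBSD implementation; the numbers in the usage note are the ones printed in the logs.

```c
#include <stdbool.h>
#include <stdint.h>

/* One "ranges" entry: a window of PCI bus addresses and its CPU-side base. */
struct pci_range {
    uint64_t pci_base;   /* first PCI bus address in the window */
    uint64_t cpu_base;   /* CPU address that pci_base maps to */
    uint64_t size;       /* window length in bytes */
};

/*
 * Translate the PCI resource [start, end] (inclusive) through one range.
 * Returns false if the resource is not fully contained in the window.
 */
bool pci_to_cpu(const struct pci_range *r, uint64_t start, uint64_t end,
    uint64_t *cpu_start)
{
    if (r->size == 0 || start > end)
        return false;
    if (start < r->pci_base)
        return false;
    if (end - r->pci_base >= r->size)
        return false;

    *cpu_start = start - r->pci_base + r->cpu_base;
    return true;
}
```

With the window from the first log (PCI 0xc0000000, CPU 0x600000000, size 0x40000000), the resource 0xc0000000-0xc00fffff maps to CPU address 0x600000000, while 0x0-0xfffff lies outside the window and cannot be translated. That would be consistent with the "Failed to translate resource 0-fffff" message for the prefetch window, though the sketch says nothing about the later SError.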
Comment 13 Klaus Küchemann 2021-12-12 08:57:57 UTC
(In reply to Klaus Küchemann from comment #9)
<<...a look at the OpenBSD implementation...>>

O.K., I took a look, and OpenBSD's driver also ends up in a panic
when PCIe/NVMe is plugged in at cold boot with u-boot (no UEFI involved):
---
panic: do_el0_error
Stopped at      panic+0x160:    cmp     w21, #0x0
    TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
* 45193  29174      0           0          0    0  init
 510782  72391      0     0x14000      0x200    1  zerothread
 420471      1      0         0x2          0    2  init
db_enter() at panic+0x15c
panic() at do_el0_error+0x10
imxpwm_match() at handle_el0_error+0x74
handle_el0_error() at 0x5c9cadea8
address 0x7ffffd56f8 is invalid
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports.  Insufficient info makes it difficult to find and fix bugs.
ddb{0}> 
----

so this is again the chance to be the first BSD with a functional 
fdt-pcie-driver (this time for the cm4) :-)
Comment 14 Efe 2021-12-12 13:30:05 UTC
@Klaus: v1 of Rob's implementation didn't help either. Still getting kernel panics and the information there is far less useful than the ones I am getting in the current version.

Regarding the missing cm4 dts file. Are those actually used or just there for reference? To my knowledge DTBs are loaded by the rpi bootloader and then handed to the kernel via das u-boot.
I mean, I can see that in ../std.broadcom we have:

# DTBs
makeoptions	MODULES_EXTRA+="dtb/rpi"

But I don't see any makeoption that has:

makeoptions    FDT_DTS_FILE=XYZ.dts

We can of course try to give it a shot. Maybe load the dtb in the loader prompt and then boot? (see https://wiki.freebsd.org/FlattenedDeviceTree)


@Rob: Glad to hear from you Rob! I hope you're doing fine :-)

I just initialized mutex with "NO_WITNESS" keeping mutex locks/unlocks. Didn't solve the problem. Then proceeded like Klaus and commented out locks/unlocks (with the default mutex init). Same issues :/
Comment 15 Klaus Küchemann 2021-12-12 18:02:59 UTC
(In reply to Efe from comment #14)

<<Regarding the missing cm4 dts file. Are those actually used or just there for reference?>>
At runtime they are only used for a compatibility check, I think.
When you see the boot-log message "no dts Blabla", it's just a warning for the maintainers so they can tell you: unsupported (not from upstream) ;-)
The first stage that loads the dtb is the EEPROM; you can check that
with "uart_2ndstage=1", and later with fdt print from the loader. I think we both currently use the one from Linux, not the latest from rpi/org, because the latest from rpi/git doesn't boot with fbsd...

<<v1 of Rob's implementation didn't help either. Still getting kernel panics and the information there is far less useful than the ones I am getting in the current version.>>
v1 of Rob's driver has an older module name in the last line, so it won't load
if you copy-pasted the old last line. It has to be the newer one:
DRIVER_MODULE(bcm_pcib, simplebus, bcm_pcib_driver, bcm_pcib_devclass, 0, 0); 
Meanwhile I also tested v1, to no avail; it prints the exact same panic as v2.

I thought the issue could come from the nvme driver, but since you have tested other PCIe cards I guess it's really the ranges/DMA offset/or whatever that is invalid for the CM4.
I also made my own u-boot build with the nvme (scan) command compiled in, and even u-boot can't assign the BARs correctly.
That lets me assume that JTAG is the last resort...???
Comment 16 Efe 2021-12-13 22:24:38 UTC
(In reply to Klaus Küchemann from comment #15)

In fact, I tried three different pcie cards (nvme adapter with nvme drive, usb card and network card). Had no luck with either of them. Guess it goes down to <<ranges/dma offset/ or whatever is invalid for the cm4>> as you said.

I more or less have an idea of how Rob's implementation works and also went through major parts of the code for a sanity check (kudos for the boolean simplifications). There are a few things in the Tux code that I didn't see in this code, so I can't really tell whether those things are required or can be left out. Nevertheless, I will try to adjust the code now, maybe add a few things, and see if it helps or not. If it doesn't, I guess we really should consider JTAG.

I am also curious about how to get this?

#define REG_MSI_EOI				0x4060
Comment 17 Efe 2021-12-16 14:50:31 UTC
Here is an update of what I tried so far and some thoughts/observations/summary:

[1] Cold booting with a PCIe card attached results in an immediate kernel panic. By immediate I mean straight after the DTB is loaded. I didn't have any luck debugging/backtracing that issue. It doesn't seem like it's related to bcm2838 pcie. Maybe Klaus has an explanation for that?

[2] From what I understand, Klaus' logs where obtained by hot plugging his nvme adapter. One can achieve similar logs (at least to see some magic numbers) without any attached device by simply skipping the "waiting for controller" part. Doesn't help much, but at least something :)

[3] With a PCIe card attached, U-Boot was giving me some messages about failed autoconfig bar X (confirmed by Klaus). I compiled the most recent RC U-Boot and tried to use the dtb, bootcode etc. from stable (#a60a479). Cold booting with PCIe card attached doesn't work (immediate panic as before). Without any device attached, the booting proceeds, but due to changes in the device tree, the current SDHCI driver isn't compatible with recent DTs (Klaus also confirmed incompatibility) and initializing the latter results in a kernel panic. Unfortunately, this is happening so early that there is no logs about PCIe. The new cm4 dtb has "/scb/pcie@7d500000/pci@1/usb@1\n" among other changes. Removing the latter in the loader doesn't seem to change anything either.

[4] I tried to obtain values from brcmstb pcie tux and u-boot pcie and hardcoded them into Rob's implementation (some values for my device were slightly different). It didn't change anything.

Note that I am using the latest stable firmware for my device (from November 2021). All in all, whatever I tried didn't help and a magic numbers expert definitely knows more about it. For now, I will just hope that someone with domain knowledge will tackle this issue, but knowing that upstream changes in DTs keep breaking other parts of the code, I am not sure if anyone is willing to go through so much trouble.

Nevertheless, if you guys come up with something and need some testing, I am available :-)

BR,
Efe
Comment 18 Klaus Küchemann 2021-12-17 09:44:28 UTC
(In reply to Efe from comment #17)
thanks for testing... well, there's a lot to be said about the environment (dtb/u-boot/eeprom/firmware etc.) but this time it seems certain that the pcie-driver will need a new memory computation. Everything else after a patch...
Comment 19 Graham Perrin 2023-10-15 13:24:07 UTC
NB <https://www.freebsd.org/releases/14.0R/> the release note about NVMe.
Comment 20 HP van Braam 2024-04-14 15:32:22 UTC
I think I have a potential fix for this issue, at least it looks very similar to a problem I've fixed. Note that the following needs to be cleaned up, and magic numbers need to be named, but it'd be useful if you could try to apply this (tested on 14-RELEASE and 15-CURRENT) and see if nvme now just works (tm).

diff --git a/sys/arm/broadcom/bcm2835/bcm2838_pci.c b/sys/arm/broadcom/bcm2835/bcm2838_pci.c
index 2dfd6744127a..e2ecfb861697 100644
--- a/sys/arm/broadcom/bcm2835/bcm2838_pci.c
+++ b/sys/arm/broadcom/bcm2835/bcm2838_pci.c
@@ -74,6 +74,7 @@
 #define REG_CPU_WINDOW_LOW			0x4070
 #define REG_CPU_WINDOW_START_HIGH		0x4080
 #define REG_CPU_WINDOW_END_HIGH			0x4084
+#define REG_PCIE_CAP                            0x00ac
 
 #define REG_MSI_ADDR_LOW			0x4044
 #define REG_MSI_ADDR_HIGH			0x4048
@@ -730,6 +731,26 @@ bcm_pcib_attach(device_t dev)
 	if (error != 0)
 		return (error);
 
+	DELAY(100);
+
+        uint32_t tmp = bcm_pcib_read_reg(sc, 0x4204);
+        tmp |= 0x2;
+        tmp |= 0x00200000;
+        bcm_pcib_set_reg(sc, 0x4204, tmp);
+
+	DELAY(100);
+
+	// Set PCIe generation to 2, any higher and the controller fails
+	uint16_t lnkctl2 = bcm_pcib_read_reg(sc,REG_PCIE_CAP + PCIER_LINK_CTL2);
+	uint32_t lnkcap = bcm_pcib_read_reg(sc, REG_PCIE_CAP + PCIER_LINK_CAP);
+
+	lnkcap = (lnkcap & ~0x0000000f) | 2;
+	bcm_pcib_set_reg(sc, REG_PCIE_CAP + PCIER_LINK_CAP, lnkcap);
+	lnkctl2 = (lnkctl2 & ~0xf) | 2;
+	bcm_pcib_set_reg(sc, REG_PCIE_CAP + PCIER_LINK_CTL2, lnkctl2);
+
+	DELAY(100);
+
 	/* Done. */
 	device_add_child(dev, "pci", -1);
 	return (bus_generic_attach(dev));
Comment 21 HP van Braam 2024-04-16 19:01:59 UTC
I made a better version of the patch. It turns out I made several wrong assumptions; this is probably the correct way to do it.

All this really does now is enable L1SS; it turns out that by default the controller already limits itself to gen2.

diff --git a/sys/arm/broadcom/bcm2835/bcm2838_pci.c b/sys/arm/broadcom/bcm2835/bcm2838_pci.c
index 2dfd6744127a..8429e85f97ac 100644
--- a/sys/arm/broadcom/bcm2835/bcm2838_pci.c
+++ b/sys/arm/broadcom/bcm2835/bcm2838_pci.c
@@ -61,7 +61,7 @@
 #define REG_BRIDGE_CTRL				0x9210
 #define BRIDGE_DISABLE_FLAG	0x1
 #define BRIDGE_RESET_FLAG	0x2
-#define REG_BRIDGE_SERDES_MODE			0x4204
+#define REG_PCIE_HARD_DEBUG			0x4204
 #define REG_DMA_CONFIG				0x4008
 #define REG_DMA_WINDOW_LOW			0x4034
 #define REG_DMA_WINDOW_HIGH			0x4038
@@ -74,6 +74,7 @@
 #define REG_CPU_WINDOW_LOW			0x4070
 #define REG_CPU_WINDOW_START_HIGH		0x4080
 #define REG_CPU_WINDOW_END_HIGH			0x4084
+#define REG_PCIE_CAP                            0x00ac
 
 #define REG_MSI_ADDR_LOW			0x4044
 #define REG_MSI_ADDR_HIGH			0x4048
@@ -87,6 +88,9 @@
 #define REG_EP_CONFIG_CHOICE			0x9000
 #define REG_EP_CONFIG_DATA			0x8000
 
+#define L1SS_ENABLE                             0x00200000
+#define CLKREQ_ENABLE                           0x2
+
 /*
  * The system memory controller can address up to 16 GiB of physical memory
  * (although at time of writing the largest memory size available for purchase
@@ -191,7 +195,7 @@ bcm_pcib_reset_controller(struct bcm_pcib_softc *sc)
 
 	DELAY(100);
 
-	bcm_pcib_set_reg(sc, REG_BRIDGE_SERDES_MODE, 0);
+	bcm_pcib_set_reg(sc, REG_PCIE_HARD_DEBUG, 0);
 
 	DELAY(100);
 }
@@ -648,6 +652,7 @@ bcm_pcib_attach(device_t dev)
 
 	mtx_init(&sc->config_mtx, "bcm_pcib: config_mtx", NULL, MTX_DEF);
 
+
 	bcm_pcib_reset_controller(sc);
 
 	hardware_rev = bcm_pcib_read_reg(sc, REG_CONTROLLER_HW_REV) & 0xffff;
@@ -720,7 +725,14 @@ bcm_pcib_attach(device_t dev)
 	bcm_pcib_set_reg(sc, PCI_ID_VAL3,
 	    PCIC_BRIDGE << CLASS_SHIFT | PCIS_BRIDGE_PCI << SUBCLASS_SHIFT);
 
-	bcm_pcib_set_reg(sc, REG_BRIDGE_SERDES_MODE, 0x2);
+        uint32_t tmp = bcm_pcib_read_reg(sc, REG_PCIE_HARD_DEBUG);
+	device_printf(sc->dev, "SERDES_MODE: 0x%08X\n", tmp);
+        tmp |= CLKREQ_ENABLE;
+	bcm_pcib_set_reg(sc, REG_PCIE_HARD_DEBUG, tmp);
+	DELAY(100);
+
+        tmp |= L1SS_ENABLE;
+	bcm_pcib_set_reg(sc, REG_PCIE_HARD_DEBUG, tmp);
 	DELAY(100);
 
 	bcm_pcib_relocate_bridge_window(dev);
@@ -730,6 +742,8 @@ bcm_pcib_attach(device_t dev)
 	if (error != 0)
 		return (error);
 
+	DELAY(100);
+
 	/* Done. */
 	device_add_child(dev, "pci", -1);
 	return (bus_generic_attach(dev));
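
For anyone following along, the effect of the REG_PCIE_HARD_DEBUG hunk can be modelled in userland. This is only a sketch under assumptions: the register is a plain value rather than MMIO, old_write() and new_write() are hypothetical names that do not exist in the driver, and the bit values are simply the ones from the patch above.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values copied from the patch above. */
#define CLKREQ_ENABLE	0x2u
#define L1SS_ENABLE	0x00200000u

/* The two enable bits are independent fields of the same register. */
static_assert((CLKREQ_ENABLE & L1SS_ENABLE) == 0,
    "enable bits must not overlap");

/* Old behaviour (hypothetical model): the register was overwritten with 0x2. */
static uint32_t
old_write(uint32_t reg)
{
	(void)reg;		/* previous contents are discarded */
	return (CLKREQ_ENABLE);
}

/* New behaviour (hypothetical model): read-modify-write, OR-ing in both bits. */
static uint32_t
new_write(uint32_t reg)
{
	reg |= CLKREQ_ENABLE;	/* first write in the patch */
	reg |= L1SS_ENABLE;	/* second write, after DELAY(100) */
	return (reg);
}
```

Since the reset path writes 0 to the register first, new_write(0) ends up as 0x00200002, whereas the old code left it at 0x2.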
Comment 22 Klaus Küchemann 2024-04-16 20:19:22 UTC
(In reply to HP van Braam from comment #21)

Looks good, thank you.
I haven't worked with FreeBSD for the last few years and almost forgot there's a
CM4 lying around in the dust here ;-)
You can submit this for review (reviews.freebsd.org)
or, if you prefer, as a GitHub PR.
I would then be willing to set up and test again later.

Regards
K.
Comment 23 HP van Braam 2024-04-16 21:28:53 UTC
I'll make a GH PR. I've done some more cleaning up, and the patch now actually respects the OF property brcm,enable-l1ss.
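
Roughly, respecting the property could look like the following userland sketch. It is not the code from the PR: hard_debug_value() and has_l1ss_prop are hypothetical names, the boolean stands in for a device-tree lookup of brcm,enable-l1ss on the controller node, and I am assuming the property only gates the L1SS bit while CLKREQ stays enabled as in comment #21.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values as named in the patch from comment #21. */
#define CLKREQ_ENABLE	0x2u
#define L1SS_ENABLE	0x00200000u

static_assert((CLKREQ_ENABLE & L1SS_ENABLE) == 0,
    "enable bits must not overlap");

/*
 * Compute the value to write to the HARD_DEBUG register.
 * has_l1ss_prop is an assumption: it represents "the FDT node carries
 * brcm,enable-l1ss"; how the driver looks that up is not shown here.
 */
static uint32_t
hard_debug_value(uint32_t reg, bool has_l1ss_prop)
{
	reg |= CLKREQ_ENABLE;		/* always set, as in comment #21 */
	if (has_l1ss_prop)
		reg |= L1SS_ENABLE;	/* only when the DT asks for it */
	return (reg);
}
```

With a DTB that lacks the property this yields 0x2 from a zeroed register, i.e. the old behaviour; with the property present it yields 0x00200002.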

If this solves your problem, and it seems to solve my problem, I'm pretty sure it works for other things too!
Comment 24 HP van Braam 2024-04-16 23:22:41 UTC
(In reply to Klaus Küchemann from comment #22)

I made a GH pull request here: https://github.com/freebsd/freebsd-src/pull/1179
Comment 25 Klaus Küchemann 2024-04-18 07:59:17 UTC
(In reply to HP van Braam from comment #24)
thx, 👍
I think this bug report can be closed as fixed after merging #1179;
I've additionally added #1182.
Comment 26 commit-hook freebsd_committer freebsd_triage 2024-04-21 22:50:55 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=10e0c34bf842885b4bd78adbbdbd7fb00f133cb5

commit 10e0c34bf842885b4bd78adbbdbd7fb00f133cb5
Author:     HP van Braam <hp@tmm.cx>
AuthorDate: 2024-04-16 23:01:20 +0000
Commit:     Ed Maste <emaste@FreeBSD.org>
CommitDate: 2024-04-21 22:34:05 +0000

    Enable L1SS handling on RPI4 pcib

    Thanks to @kevans91 for pointing me in the right direction. FreeBSD had
    the same bug as Linux (see
    https://bugzilla.kernel.org/show_bug.cgi?id=217276) where the ultimate
    solution was to honor the brcm,enable-l1ss FDT property.

    In current versions of the dtb files this property has been added by
    default.

    Without this on many, many pcie addin cards the pcib will Serror when
    trying to assert the clreq# pin on the pcie bus. Many cards do not have
    these hooked up.

    PR:             260131, 277638, 277605
    Reviewed-by:    emaste
    Signed-off-by: HP van Braam <hp@tmm.cx>
    Pull-request: https://github.com/freebsd/freebsd-src/pull/1179

 sys/arm/broadcom/bcm2835/bcm2838_pci.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)