Created attachment 203420 [details]
emag.multiuser.dmesg

Sooo, now that Packet has Ampere eMAG instances (c2.large.arm), of course someone had to try FreeBSD and of course it's me… :D

tl;dr I managed to boot to multiuser with some hacks, but PCIe is busted, needs support for more ACPI stuff. Verbose boot log is attached, I'll attach ACPI tables and stuff too.

---

0. Installation

I used an Ubuntu 18.04 instance, rerooted to a ramdisk (using the method I described in https://community.online.net/t/freebsd-on-arm64/6678 ), resized the Linux partition, added a new one, loop mounted a memstick image, dd'd it onto the new partition, copied loader_lua.efi to the EFI partition, added a GRUB entry to chainload that:

    menuentry 'FreeBSD' {
        load_video
        insmod part_gpt
        insmod chain
        set root='hd0,gpt1'
        chainloader /EFI/BSD/loader_lua.efi
    }

and used https://github.com/mkatiyar/fuse-ufs2 to modify the UFS partition from Linux. (As long as you don't copy files from the UFS partition *to itself*, it works fine lol. If you do that, it gets stuck in a 100% cpu loop)

1. Console

https://reviews.freebsd.org/D19507 is needed for any UART output now that one part from there (not using the hardcoded regshift) has landed. Now we need to hardcode it again but only for PL011.

But that's not all. For some reason, I'm not seeing userspace output (/dev/console) even though the ACPI node for the console was picked up:

uart0: <PrimeCell UART (PL011)> iomem 0x12600000-0x12600fff irq 1 on acpi0
uart0: console (115200,n,8,1)
uart0: fast interrupt
uart0: PPS capture mode: DCD

2. Weird early memory access crashes

EFI runtime support (specifically, enumerating efirtc) crashed in efi_call() at efi_get_time+0x50. I disabled `options EFIRT`.

Then ACPI crashed in AcpiExSystemMemorySpaceHandler when reading:

exfield-0369 ExReadDataFromField : FieldRead [TO]: Obj 0xfffffd0010b41980, Type 11, Buf 0xfffffd0010b62b10, ByteLen 8
exfield-0372 ExReadDataFromField : FieldRead [FROM]: BitLen 1, BitOff 6, ByteOff 0
exfldio-0395 ExAccessRegion : [READ] Region [SystemMemory:0], Width 4, ByteBase 0, Offset 0 at 000000001F10C004

I patched DSDT, removing OperationRegion CLKE from Device AHBC. The only thing that used this was Method _INI for Device I2C4, so I removed the body of that method as well. Who cares about i2c on a server :) That allowed booting to proceed.

3. PCIe is screwed up

There's this interesting message for all PCI bridges:

pcib0: bus end mismatch! expected 255 found 31.

And some more interesting messages (for the last couple pcib's also with "I/O port window" and "bar .. failed to allocate"):

pcib0: rman_reserve_resource: start=0x30000000, end=0x301fffff, count=0x200000
pcib0: pci_host_generic_core_alloc_resource FAIL: type=3, rid=32, start=0000000030000000, end=00000000301fffff, count=0000000000200000, flags=0
pcib1: failed to allocate initial memory window: 0x30000000-0x301fffff
pcib0: rman_reserve_resource: start=0x14080000000, end=0x14084ffffff, count=0x5000000

PCIe cards actually don't work when these messages are present:

mlx5_core0: <mlx5_core> mem 0x14082000000-0x14083ffffff at device 0.0 on pci1
mlx5_core0: ERR: Failed mapping initialization segment, aborting

Looking at Ampere's page https://github.com/AmpereComputing/ampere-centos-kernel/wiki/Ampere-CentOS-Kernel-wiki it seems like Linux needed to support ACPI _DMA objects and IORT named components:

https://github.com/torvalds/linux/commit/4f0450af530e62b0217522cab4803b5a65dccc46
https://github.com/torvalds/linux/commit/c04ac679c6b86e4e36fbb675c6c061b4091f5810
https://github.com/torvalds/linux/commit/7ad4263980826e8b02e121af22f4f4c9103fe86d
https://github.com/torvalds/linux/commit/10d8ab2c15b9ef2f46c35e7c36781399d6f2cc82
Created attachment 203421 [details] emag.acpi.tar.gz
Created attachment 203422 [details] emag.hack.dsdt.patch
Oops, forgot the PR reference. Serial quirk committed as r346228. https://svnweb.freebsd.org/changeset/base/346228
(In reply to Greg V from comment #0) > For some reason, I'm not seeing userspace output (/dev/console) even though the ACPI node for the console was picked up Your split-out review D19896 is for a /dev/console issue on Amazon EC2 UARTs, might we have a similar issue here?
New mail from Ampere engineers (they don't seem to want to sign up for bugzilla, sadly), new very helpful info about PCIe:

The _DMA objects are for the SMMU, they would make "virtualization work properly" (I assume that means PCI passthrough). Since bhyvearm64 is not finished / not upstreamed, no rush for that I guess.

Apparently the real problem with just using PCIe is that we're not adding the address base from the "AddressTranslation - TRA" field, so e.g. for

pcib1: failed to allocate initial memory window: 0x30000000-0x301fffff

we should actually be accessing:

_TRA+0x3000_0000 = 0x100_3000_0000

From a quick grep, I think acpi_pcib_producer_handler is where we handle this:

    min = res->Data.Address64.Address.Minimum;
    max = res->Data.Address64.Address.Maximum;

So I guess it should be something like

    min = res->Data.Address64.Address.Minimum + res->Data.Address64.Address.TranslationOffset;
    max = res->Data.Address64.Address.Maximum + res->Data.Address64.Address.TranslationOffset;

(for all widths)

(In reply to Ed Maste from comment #4)
> Your split-out review D19896 is for a /dev/console issue on Amazon EC2 UARTs, might we have a similar issue here?

Nah, that one is about connecting the SPCR device with the PCI device (the Amazon UART has different memory addresses in SPCR and PCI). The PL011 on the eMAG is not PCI, it's described in ACPI and it *is* picked up as the console, as I posted:

uart0: <PrimeCell UART (PL011)> iomem 0x12600000-0x12600fff irq 1 on acpi0
uart0: console (115200,n,8,1)
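The translation arithmetic described above can be sketched in plain C. This is only an illustration under stated assumptions, not the kernel change: `struct addr_window` and `bus_to_cpu()` are hypothetical names standing in for the minimum, maximum and translation offset of an ACPI address descriptor, and the offset value 0x10000000000 is inferred from the 0x30000000 to 0x10030000000 example rather than read from the tables.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of an ACPI address descriptor. */
struct addr_window {
	uint64_t minimum;            /* bus-side start of the window */
	uint64_t maximum;            /* bus-side end of the window (inclusive) */
	uint64_t translation_offset; /* _TRA: added to get the CPU address */
};

/* Convert a bus-side address inside the window to a CPU physical address. */
uint64_t
bus_to_cpu(const struct addr_window *w, uint64_t bus_addr)
{
	/* The address must lie inside the window being translated. */
	assert(bus_addr >= w->minimum && bus_addr <= w->maximum);
	return (bus_addr + w->translation_offset);
}
```

With a window of 0x30000000-0xefffffff and an offset of 0x10000000000, the failing request for 0x30000000-0x301fffff would land at 0x10030000000-0x100301fffff on the CPU side.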
(In reply to Greg V from comment #5) I work for Ampere and did create a Bugzilla account - trying to learn the ropes :-). Next time will post info here vs. email. We are working on testing this.
Continuing the investigation:

Reading ACPI TranslationOffset was added in review D17791 by jchandra@. It is not applied in enough places, however. The call that gets the non-translated address is pci_host_generic_core_alloc_resource(dev=pcib0, child=pcib1):

pcib0: rman_reserve_resource: start=0x30000000, end=0x301fffff, count=0x200000
rman_reserve_resource_bound: <PCIe Memory> request: [0x30000000, 0x301fffff], length 0x200000, flags 0, device pcib1
rman_reserve_resource_bound: trying 0x100efffffff <0x30000000,0x1fffff>
considering [0x10030000000, 0x100efffffff]
s->r_start (0x10030000000) + count - 1> end (0x301fffff)
no unshared regions found

I'm trying to figure out where that call is, seems to be pcib_probe_windows -> pcib_probe_windows -> bus_alloc_resource.

(In reply to John O'Neill from comment #6)
Nice! Welcome.
err, pcib_probe_windows -> pcib_alloc_window -> bus_alloc_resource.

After adding a hardcoded offset: it can reserve on pcib0, but can't manage on pcib1…

pcib0: rman_reserve_resource: start=0x10030000000, end=0x100301fffff, count=0x200000
rman_reserve_resource_bound: <PCIe Memory> request: [0x10030000000, 0x100301fffff], length 0x200000, flags 0, device pcib1
rman_reserve_resource_bound: trying 0x100efffffff <0x10030000000,0x1fffff>
considering [0x10030000000, 0x100efffffff]
truncated region: [0x10030000000, 0x100301fffff]; size 0x200000 (requested 0x200000)
candidate region: [0x10030000000, 0x100301fffff], size 0x200000
allocating from the beginning
pcib0: rman_reserve_resource: 0xfffffd0010197780
rman_manage_region: <pcib1 memory window> request: start 0x10030000000, end 0x100301fffff
panic: Failed to add resource to rman
Hello, I am Tuan Phan, a BIOS maintainer at Ampere. I can boot FreeBSD to a prompt with PCI-e supported (I am not a PCI-e expert, I just did a quick hack in FreeBSD and am not sure it is the right way to do it). Also, I only started learning FreeBSD a few days ago, so I may well have made mistakes.

1. Fix the issue with console.
- I added these lines to /boot/loader.conf:

    vfs.mountroot.timeout="10"
    kernels_autodetect="NO"
    boot_serial="YES"
    console="comconsole,efi"
    boot_multicons="YES"

2. Fix the SPCR and EFI runtime crash
- I fixed SPCR in BIOS.
- I removed the _INI node from I2C4. It is a useless node. Not sure why FreeBSD wasn't happy with it.

3. Fix the PCI-e.
- Here is the patch. Again, I am not a PCI-e expert, so you may improve it and change it properly.

diff --git a/sys/dev/pci/pci_host_generic.c b/sys/dev/pci/pci_host_generic.c
index 60f06a00909..ca814a03058 100644
--- a/sys/dev/pci/pci_host_generic.c
+++ b/sys/dev/pci/pci_host_generic.c
@@ -359,29 +359,29 @@ generic_pcie_activate_resource(device_t dev, device_t child, int type,
     switch (type) {
     case SYS_RES_IOPORT:
+    case SYS_RES_MEMORY:
         found = 0;
         for (i = 0; i < MAX_RANGES_TUPLES; i++) {
             pci_base = sc->ranges[i].pci_base;
             phys_base = sc->ranges[i].phys_base;
             size = sc->ranges[i].size;
-            if ((rid > pci_base) && (rid < (pci_base + size))) {
+            if ((rman_get_start(r) >= pci_base) && (rman_get_start(r) < (pci_base + size))) {
                 found = 1;
                 break;
             }
         }
         if (found) {
-            rman_set_start(r, rman_get_start(r) + phys_base);
-            rman_set_end(r, rman_get_end(r) + phys_base);
+            rman_set_start(r, rman_get_start(r) - pci_base + phys_base);
+            rman_set_end(r, rman_get_end(r) - pci_base + phys_base);
             res = BUS_ACTIVATE_RESOURCE(device_get_parent(dev), child, type, rid, r);
         } else {
             device_printf(dev,
-                "Failed to activate IOPORT resource\n");
+                "Failed to activate %d resource\n", type);
             res = 0;
         }
         break;
-    case SYS_RES_MEMORY:
     case SYS_RES_IRQ:
         res = BUS_ACTIVATE_RESOURCE(device_get_parent(dev), child, type, rid, r);
diff --git a/sys/dev/pci/pci_host_generic_acpi.c b/sys/dev/pci/pci_host_generic_acpi.c
index fa1bf4e6efc..dbc1b7fc746 100644
--- a/sys/dev/pci/pci_host_generic_acpi.c
+++ b/sys/dev/pci/pci_host_generic_acpi.c
@@ -297,7 +297,7 @@ pci_host_generic_acpi_attach(device_t dev)
             continue; /* empty range element */
         if (sc->base.ranges[tuple].flags & FLAG_MEM) {
             error = rman_manage_region(&sc->base.mem_rman,
-                phys_base, phys_base + size - 1);
+                pci_base, pci_base + size - 1);
         } else if (sc->base.ranges[tuple].flags & FLAG_IO) {
             error = rman_manage_region(&sc->base.io_rman,
                 pci_base + PCI_IO_WINDOW_OFFSET,
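To see why the patch subtracts pci_base before adding phys_base, here is a small standalone sketch of the lookup and translation. It is a simplified model, not the driver code: `struct range` and `translate_start()` are hypothetical names, and the window values in the usage note are taken from the logs earlier in this PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified version of one host bridge range tuple. */
struct range {
	uint64_t pci_base;  /* window start in PCI bus address space */
	uint64_t phys_base; /* start of the same window in CPU address space */
	uint64_t size;      /* window length in bytes; 0 marks an unused slot */
};

/*
 * Find the range containing bus address `start` and translate it to a CPU
 * physical address. Returns false if no range contains the address.
 */
bool
translate_start(const struct range *ranges, size_t n, uint64_t start,
    uint64_t *phys)
{
	assert(ranges != NULL && phys != NULL);
	for (size_t i = 0; i < n; i++) {
		const struct range *r = &ranges[i];

		if (r->size == 0)
			continue; /* empty range element */
		if (start >= r->pci_base && start - r->pci_base < r->size) {
			/* Offset within the window, rebased onto the CPU side. */
			*phys = start - r->pci_base + r->phys_base;
			return (true);
		}
	}
	return (false);
}
```

For a window with pci_base 0x30000000 and phys_base 0x10030000000, bus address 0x30000000 translates to 0x10030000000. Adding phys_base without subtracting pci_base would instead give 0x10060000000, which overshoots by exactly pci_base.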
Created attachment 203803 [details] eMAG_dmesg_pcie_works
(In reply to Tuan Phan from comment #9) Excellent work, thanks! I actually tried doing this — same handling for SYS_RES_MEMORY as for SYS_RES_IOPORT there — but I wasn't smart enough to figure out the subtraction of pci_base. I see there's some initial I/O port window failures still, but it's nice that you have a NIC working! > boot_multicons="YES" Oh. It was using only the framebuffer graphical console as the main console, I thought multicons was default on arm64 for some reason *facepalm* > Fix the SPCR and EFI runtime crash hmm, I see the I2C4 thing below, but looks like you didn't get a panic on efirtc initialization either… was that also fixed in firmware? (it was crashing for me on Packet, the firmware on Packet's servers is: HVE104D-1.02 03/08/2019) > I removed _INI node from I2C4. It is useless node. Not sure why FreeBSD didn't happy with it. FreeBSD was probing all ACPI devices, and ACPICA walked into a memory fault while trying to read from that address…
(In reply to Greg V from comment #11)
> hmm, I see the I2C4 thing below, but looks like you didn't get a panic on efirtc initialization either… was that also fixed in firmware?

I only removed _INI, not the whole I2C4 node. I didn't see the efirtc issue, so it may be a different issue. The system installed at Packet is not the same system I am using. We are looking into it.

> FreeBSD was probing all ACPI devices, and ACPICA walked into a memory fault while trying to read from that address…

That makes sense.

One more thing, our ACPI has two XHCI nodes with _CID = PNP0D10. It looks like current FreeBSD doesn't have code to parse it. I saw it only supports EHCI ACPI.
(In reply to Greg V from comment #11) > I see there's some initial I/O port window failures still, but it's nice that you have a NIC working! Correct me if I am wrong. ARM doesn't use IO ports at all.
(In reply to Tuan Phan from comment #13) > ARM doesn't use IO ports at all. Yeah, ARM doesn't have actual IO ports, but looks like PCIe "IO" regions should be mapped into memory: https://community.nxp.com/thread/387557#comment-626470 and other ARM systems do not show these errors: https://dmesgd.nycbug.org/index.cgi?do=view&id=4798 > our ACPI has two XHCI nodes with _CID = PNP0D10. Looks like current FreeBSD doesn't have a code to parse it. Nice catch. Yeah, XHCI has typically been on PCIe on big systems (both AMD/Intel and Cavium ThunderX/2) and described by FDT on embedded systems.. That looks easy enough to add though.
wooooo I have SSH on the Packet instance! :) Patch for enabling Mellanox NIC support on aarch64: https://reviews.freebsd.org/D19983
To avoid I/O port window failures, I still had to use the `rid` for I/O port resources:

    if (type == SYS_RES_IOPORT) {
        if ((rid >= pci_base) && (rid < (pci_base + size))) {
            found = 1;
            break;
        }
    } else {
        if ((rman_get_start(r) >= pci_base) && (rman_get_start(r) < (pci_base + size))) {
            found = 1;
            break;
        }
    }

The only failures I see are on pcib12:

pcib12: pci_host_generic_core_alloc_resource FAIL: type=4, rid=28, start=0000000010000000, end=0000000010000fff, count=0000000000001000, flags=0
pcib12: pci_host_generic_core_alloc_resource FAIL: type=4, rid=28, start=0000000010000000, end=0000000010000fff, count=0000000000001000, flags=3000
pcib12: pci_host_generic_core_alloc_resource FAIL: type=4, rid=28, start=0000000010000000, end=0000000010000fff, count=0000000000001000, flags=3000
pcib12: pci_host_generic_core_alloc_resource FAIL: type=4, rid=28, start=0000000000000000, end=00000000ffffffff, count=0000000000001000, flags=3000
I have a patch for ACPI XHCI: https://reviews.freebsd.org/D19986

The Packet instance has USB disabled though:

    Method (_STA, 0, NotSerialized) // _STA: Status
    {
        Return (0x00)
    }

Patching the table to 0x0F results in

xhci0: <Generic USB 3.0 controller> iomem 0x13800000-0x138fffff irq 5 on acpi0
panic: vm_fault_hold: fault on nofault entry, addr: 0xffff0000e1785000

most likely because disabling USB actually detaches the controller, not just makes ACPI tell the system that it's not present :D
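For readers unfamiliar with the _STA values above, the following sketch decodes the low four status bits as the ACPI specification defines them. The macro and function names are hypothetical, chosen for this illustration only; they are not the names FreeBSD or ACPICA use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low _STA result bits as defined by the ACPI specification. */
#define STA_PRESENT     (1u << 0) /* device is present */
#define STA_ENABLED     (1u << 1) /* device is enabled and decoding resources */
#define STA_SHOW_IN_UI  (1u << 2) /* device should be shown in the UI */
#define STA_FUNCTIONING (1u << 3) /* device is functioning properly */

/* Hypothetical helper: true when the device is both present and functioning. */
bool
sta_present_and_functioning(uint32_t sta)
{
	const uint32_t need = STA_PRESENT | STA_FUNCTIONING;

	return ((sta & need) == need);
}
```

A return value of 0x00 clears every bit, so an OS treats the controller as absent, while 0x0F sets all four bits. As the panic above shows, reporting 0x0F does not help if the firmware has not actually set up the hardware behind the node.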
(hmm even the bios setup says "USB Controllers: None". Does the Lenovo server ship w/o USB at all?)
A commit references this bug: Author: emaste Date: Sat Apr 20 15:57:06 UTC 2019 New revision: 346445 URL: https://svnweb.freebsd.org/changeset/base/346445 Log: Enable ioremap for aarch64 in the LinuxKPI Required for Mellanox drivers (e.g. on Ampere eMAG at Packet.com). PR: 237055 Submitted by: Greg V <greg@unrelenting.technology> Reviewed by: hselasky Differential Revision: https://reviews.freebsd.org/D19987 Changes: head/sys/compat/linuxkpi/common/include/linux/io.h head/sys/compat/linuxkpi/common/src/linux_compat.c
CC jhb@; John can you review the PCI change in comment #9
Those aren't generic PCI changes but in the arm-specific drivers (despite the poorly chosen "generic" in the name). They are ok for now. The real fix is larger but requires proper implementation of bus_map_resource and using a real resource manager for the host bridges instead of passing requests through.
(In reply to Greg V from comment #17) > Patching the table to 0x0F results in > xhci0: <Generic USB 3.0 controller> iomem 0x13800000-0x138fffff irq 5 on acpi0 > panic: vm_fault_hold: fault on nofault entry, addr: 0xffff0000e1785000 eMAG USB controller is disabled in UEFI BIOS so force enabling it in ACPI will likely cause crashing. Some USB registers such as clock, memory access, etc. are controlled in BIOS. USB node in ACPI is just XHCI interface.
(In reply to Greg V from comment #18)
> (hmm even the bios setup says "USB Controllers: None". Does the Lenovo server ship w/o USB at all?)

If you see _STA = 0 then it is disabled in BIOS. You can try going to the BIOS setup tab chipset/xhci controller configuration setting and enabling it.
(In reply to Greg V from comment #16)
> if ((rid >= pci_base) && (rid < (pci_base + size)))

I am still not clear why rid can be compared to pci_base. It is a resource ID, right?

In pci_host_generic_acpi.c, function pci_host_generic_acpi_attach:

    error = rman_manage_region(&sc->base.io_rman,
        pci_base + PCI_IO_WINDOW_OFFSET,
        pci_base + PCI_IO_WINDOW_OFFSET + size - 1);

We shouldn't add PCI_IO_WINDOW_OFFSET to pci_base, should we?
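One observation that may help with the rid question, offered as an assumption rather than something confirmed in this thread: the rid values in the logs (rid=32 for the type=3 memory request, rid=28 for the type=4 I/O port requests) equal the offsets of the memory base (0x20) and I/O base (0x1c) registers in the PCI-to-PCI bridge configuration header. If the rid is a register offset, comparing it to a bus address such as pci_base would not be meaningful. The names in this sketch are hypothetical.

```c
/* Register offsets in the PCI-to-PCI bridge (type 1) configuration header. */
#define BRIDGE_IO_BASE_REG   0x1c /* I/O base */
#define BRIDGE_MEM_BASE_REG  0x20 /* memory base */
#define BRIDGE_PMEM_BASE_REG 0x24 /* prefetchable memory base */

enum window_kind {
	WINDOW_UNKNOWN,
	WINDOW_IO,
	WINDOW_MEM,
	WINDOW_PMEM
};

/* Hypothetical helper: map a bridge window rid to the window it names. */
enum window_kind
bridge_window_kind(int rid)
{
	switch (rid) {
	case BRIDGE_IO_BASE_REG:
		return (WINDOW_IO);
	case BRIDGE_MEM_BASE_REG:
		return (WINDOW_MEM);
	case BRIDGE_PMEM_BASE_REG:
		return (WINDOW_PMEM);
	default:
		return (WINDOW_UNKNOWN);
	}
}
```

Under this reading, rid=28 names the I/O port window and rid=32 the memory window, which matches the "failed to allocate initial I/O port window" and "failed to allocate initial memory window" messages that accompany them.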
(In reply to Tuan Phan from comment #23) > You can try go to BIOS setup tab chipset/xhci controller configuration setting and enable it. That tab wasn't giving me an option to enable it, or maybe I just couldn't figure it out… Either way, it would be better if you or Ed tested the XHCI patch (https://reviews.freebsd.org/D19986) because I can't exactly plug anything into the USB ports of a server on the other side of the planet :D
A commit references this bug: Author: emaste Date: Tue Apr 23 15:11:01 UTC 2019 New revision: 346598 URL: https://svnweb.freebsd.org/changeset/base/346598 Log: Enable Mellanox drivers (modules) on AArch64 Tested by Greg V with mlx5en on an Ampere eMAG instance at Packet.com on c2.large.arm (with some additional uncommitted PCIe WIP). PR: 237055 Submitted by: Greg V <greg@unrelenting.technology> Reviewed by: hselasky MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D19983 Changes: head/sys/modules/Makefile
(In reply to Greg V from comment #25)
> Either way, it would be better if you or Ed tested the XHCI patch (https://reviews.freebsd.org/D19986) because I can't exactly plug anything into the USB ports of a server on the other side of the planet :D

I tested the patch on my board and USB works, both USB keyboard and mass storage. Thanks
(In reply to Tuan Phan from comment #27)
Can you test the updated USB patch in https://reviews.freebsd.org/D19986?

I applied it to my tree but was unsuccessful - As with GregV's report in PR237055 dsdt has for USB:
```
Method (_STA, 0, NotSerialized) // _STA: Status
{
    Return (0x00)
}
```
regardless of BIOS settings; I wasn't able to test this here.

At boot my FW reports:
SMpro FW version: 1.04
PMpro FW version: 1.04
FW date: 20190228

AMI setup utility reports Version 2.19.1268 and
BIOS Version 1.02
Build Date and Time 03/08/2019 09:59:05
(In reply to Ed Maste from comment #28)
> Can you test the updated USB patch in https://reviews.freebsd.org/D19986? I applied it to my tree but was unsuccessful - As with GregV's report in PR237055 dsdt has for USB:

Sure, but it may take a while. We are moving to a new office, so all boards in the lab are torn down.
(In reply to Tuan Phan from comment #24)
Hi,

I also don't understand what the current code is trying to achieve by comparing rid to pci_base, it doesn't make sense to me either. I'm working on a patch based on yours and will make sure it does not break the other platforms using PCI (SoftIron Overdrive, QEMU and ThunderX are the only ones, I think). I'll put up some reviews tonight or maybe tomorrow morning.

In the meantime I've seen that the bus end number in the MCFG table is correctly set to 31 while the one in the _CRS method of each PCI device is set to 255. Tuan, could you fix that in later BIOS releases?

Thanks.
(In reply to Emmanuel Vadot from comment #30) > In the meantime I've seen that the bus end number in the MCFG table is correctly set to 31 while the one in the _CRS method of each PCI device is set to 255, Tuan could you fix that in later bios releases ? Sure, we will fix it.
(In reply to Tuan Phan from comment #31) Also please let us know when the update makes it through to new Lenovo firmware.
(In reply to Greg V from comment #16)
This just hides the problem and in fact doesn't work. The IO mapping works with PCI0 to PCI6 (ACPI names) but the PCIR_IOBASEH in the PCI-PCI bridge under PCI7 contains 0x10000000. I'm not sure why, or how it should map to the addresses in _CRS.
A commit references this bug: Author: andrew Date: Wed May 1 17:12:50 UTC 2019 New revision: 346996 URL: https://svnweb.freebsd.org/changeset/base/346996 Log: Restore x18 in efi_arch_leave. Some UEFI implementations trash this register and, as we use it as a platform register, the kernel doesn't save it before calling into the UEFI runtime services. As we have a copy in tpidr_el1 restore from there when exiting the EFI environment. PR: 237234, 237055 Reviewed by: manu Tested On: Ampere eMAG MFC after: 2 weeks Sponsored by: DARPA, AFRL Sponsored by: Ampere Computing (hardware) Differential Revision: https://reviews.freebsd.org/D20127 Changes: head/sys/arm64/arm64/efirt_machdep.c
Just opened https://reviews.freebsd.org/D20144
This improves the performance of ahci.
Follow up on the ACPI bug.

As Greg noted, the problem is in the OperationRegion in the AHBC device. When the ACPICA code tries to read from the address (in the function AcpiExSystemMemorySpaceHandler in file sys/contrib/dev/acpica/components/executer/exregion.c) we get a fault. The ESR value for this fault is 0x96000410, which means that this is a "Synchronous External abort, not on translation table walk" according to the ARMv8 ARM. The FnV bit is set, so the FAR register is not valid, and SET is equal to 0, so it is a recoverable error.

Andrew Turner (andrew@) thinks it might be a RAS exception, which FreeBSD doesn't support for now.

For now I have a crappy patch that just returns from the AcpiExSystemMemorySpaceHandler function if the address is 0x1f10c004 or 0x1f10c000, so I can boot the system with the latest BIOS and the full ACPI table and not a modified one.
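The ESR reading above can be reproduced with a few shifts and masks. This is a minimal sketch: `struct esr_fields` and `esr_decode()` are hypothetical names, and the ISS field positions used here apply only when the exception class is a data abort.

```c
#include <stdint.h>

/* Selected ESR_ELx fields; the ISS layout is the one used for data aborts. */
struct esr_fields {
	unsigned ec;   /* exception class, bits [31:26] */
	unsigned il;   /* instruction length, bit [25] */
	unsigned set;  /* synchronous error type, bits [12:11] */
	unsigned fnv;  /* FAR not valid, bit [10] */
	unsigned dfsc; /* data fault status code, bits [5:0] */
};

/* Hypothetical helper: split a raw ESR value into the fields above. */
struct esr_fields
esr_decode(uint32_t esr)
{
	struct esr_fields f;

	f.ec = (esr >> 26) & 0x3f;
	f.il = (esr >> 25) & 0x1;
	f.set = (esr >> 11) & 0x3;
	f.fnv = (esr >> 10) & 0x1;
	f.dfsc = esr & 0x3f;
	return (f);
}
```

For 0x96000410 this yields EC 0x25 (data abort taken without a change in exception level), DFSC 0x10 (synchronous external abort, not on translation table walk), FnV 1 and SET 0, matching the description in the comment.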
(In reply to Emmanuel Vadot from comment #36)
About this issue, I am wondering why accessing 0x1f10c004 or 0x1f10c000 causes an exception. Other OSes work fine in this case. Does the access happen before virtual addressing is enabled? Does it need a memory mapping? One way or another this issue needs to be fixed, otherwise any ACPI node that accesses memory in _INI will have problems.
(In reply to Tuan Phan from comment #37)
The ACPICA code will call AcpiOsMapMemory before accessing the region, which in turn calls pmap_mapbios. If there were something wrong in the mapping, I don't think that I would get a data abort exception with a non-valid address.
(In reply to Tuan Phan from comment #29) > Can you test the updated USB patch in https://reviews.freebsd.org/D19986? I applied it to my tree but was unsuccessful - As with GregV's report in PR237055 dsdt has for USB: Tested the patch. Can detect USB mass storage and keyboard. The patch is good.
(In reply to Emmanuel Vadot from comment #38)
Did some debugging: it was a data abort exception. The address 0x1f10c004 was mapped, but with the normal cacheable memory attribute. It should be mapped with the device memory attribute. UEFI always exports it as device memory.
Just opened three new reviews that address the ACPI bugs : https://reviews.freebsd.org/D20347 https://reviews.freebsd.org/D20348 https://reviews.freebsd.org/D20349
(In reply to Emmanuel Vadot from comment #41) I recently got a Lenovo HR 350A system for my lab and want to run FreeBSD on it. Do I only need D2034[789] on top of FreeBSD head or do I need additional patches and or specific version of the firmware?
(In reply to Michael Tuexen from comment #42) My WIP tree is functional on eMAG with those three commits included; they should be sufficient. (I have a lot of other changes but they are largely userland, and some unrelated kernel changes.) Firmware info from early boot (the same eMAG that manu@ is using for development): SMpro FW version: 1.04 PMpro FW version: 1.04 FW date: 20190228 EFI version: 2.60 EFI Firmware: American Megatrends (rev 5.13)
A commit references this bug: Author: emaste Date: Fri May 24 13:39:57 UTC 2019 New revision: 348237 URL: https://svnweb.freebsd.org/changeset/base/348237 Log: MFC r346598: Enable Mellanox drivers (modules) on AArch64 PR: 237055 Submitted by: Greg V <greg@unrelenting.technology> Changes: _U stable/12/ stable/12/sys/modules/Makefile
(In reply to Ed Maste from comment #43) Thanks for the information. Will try to test this on my machine next week...
(In reply to Ed Maste from comment #43) Hi Ed, I built a FreeBSD install image based on FreeBSD head with applying D2034[789]. I can confirm that the system boots fine with such a kernel. When running the installer to install the OS on a new SSD, the installer finishes the archive extraction step and writes on the screen: Formatting /dev/ada0p1 as FAT32 Mounting ESP /dev/ada0p1 Installing loader.efi onto ESP Creating UEFI boot entry Then the system stalls... Any idea what is going wrong or what am I doing wrong?
Yes, there is a problem with the runtime EFI SetVar in the firmware, see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237808
I haven't tested the new firmware yet. If you don't want to try it you could do something like https://github.com/evadot/freebsd/commit/cbf0449d2d6193e209c611dc87eed8f2bfdedd7a
(In reply to Emmanuel Vadot from comment #47) Thanks, that helps in letting the installer finish. I used your patch, not the updated firmware. Unfortunately, the kernel from disk panics on load. Likely a problem due to my way of building the image. Restarted from scratch to build the image. I'll report...
(In reply to Michael Tuexen from comment #48)
You could try:
https://people.freebsd.org/~manu/FreeBSD-13.0-CURRENT-arm64-aarch64-GENERIC-NODEBUG-r347932.img.xz
It's a week old or something like that and it's using NODEBUG but ...

Otherwise building the image is just:

    export TARGET_ARCH=aarch64; export TARGET=arm64
    make buildworld buildkernel
    cd release
    sudo -E make memstick

You need both TARGET and TARGET_ARCH for image building (I don't remember why right now ...)
(In reply to Emmanuel Vadot from comment #49) I gave it a try. It runs the installer without problems, the installed system boots and computes the ssh server keys and locks up...
(In reply to Michael Tuexen from comment #50) Where exactly ? I have some problem with sendmail being stuck in nanoslp (same problem on Thunderx2 it seems) but I can ctrl+c (that is until I look at what is the problem exactly).
(In reply to Emmanuel Vadot from comment #51) After reporting that it generated the third key. I could not CTRL-C... When the build with a debug kernel has finished, I'll try that. Possibly it provides information or even a panic.
(In reply to Michael Tuexen from comment #52)
OK, I did a build with FreeBSD head of yesterday, applied
* https://reviews.freebsd.org/D20347
* https://reviews.freebsd.org/D20348
* https://reviews.freebsd.org/D20349
* https://github.com/evadot/freebsd/commit/cbf0449d2d6193e209c611dc87eed8f2bfdedd7a
This resulted in a working system. I checked out the sources and rebuilt a GENERIC-NODEBUG kernel and it also runs.

However, I had one (temporary) problem during booting. The messages on the screen were:

...
Loading configured modules...
/boot/entropy size=0x1000
No valid device tree blob found!
WARNING! Trying to fire up the kernel, but no device blob tree found!
EFI framebuffer information:
addr, size 0x430000000, 0x30000
dimensions 1024 x 768
stride 1024
masks 0x00ff0000, 0x0000ff00, 0x000000ff, 0xff000000
_

Then the system was hanging. A reboot resolved the issue.
(In reply to Michael Tuexen from comment #53) Some more testing. The system is capable in doing buildworld, but it locks up a lot when booting. You can't CTRL-C it. Is there any information I could provide which would help to nail the problem down?
(In reply to Michael Tuexen from comment #54) To be clear, you mean that it frequently locks up during boot, but once booted it runs correctly?
(In reply to Ed Maste from comment #55)
More testing, better description:
I meant: several times it booted to the login prompt but it didn't accept input on the keyboard or over the network (ssh access).
Now I have observed that sometimes it accepts input on the console, but the network (an igb card) wasn't brought up. When looking at the boot messages I do see (transcribed):

...
pci14: <PCI bus> on pcib14
pcib15: <PCI-PCI bridge> at device 0.0 on pci14
pcib14: pci_host_generic_core_alloc_resource FAIL: type=4, rid=28, start=0000000010000000, end=0000000010000fff, count=0000000000001000, flags=0
pcib15: failed to allocate initial I/O port window: 0x10000000-0x10000fff
pci15: <PCI bus> on pcib15
pcib16: <PCI-PCI bridge> at device 0.0 on pci15
pcib14: pci_host_generic_core_alloc_resource FAIL: type=4, rid=28, start=0000000010000000, end=0000000010000fff, count=0000000000001000, flags=3000
pcib16: failed to allocate initial I/O port window: 0x10000000-0x10000fff
pci16: <PCI bus> on pcib16
pcib14: pci_host_generic_core_alloc_resource FAIL: type=4, rid=28, start=0000000010000000, end=0000000010000fff, count=0000000000001000, flags=3000
...
acpi0: Could not update all GPEs: AE_NOT_CONFIGURED

I have observed similar instabilities on an Overdrive 3000 system when this kind of PCI error occurred. On the Overdrive 3000 I'm working around this by using an ethernet card which doesn't show these PCI errors (a bge card instead of igb or ix).
The Ampere system has an igb card (in use) and a Mellanox card (not in use). Should I try to replace them?
OK, I identified one problem: When setting the time/date via

    sudo date 1432

on the command line, the system locks up after a couple of seconds. This might be related to the lock up after booting problems I have seen, since I added

    ntpdate="YES"

to my /etc/rc.conf. Without this entry, the system boots fine. Can you reproduce this?
(In reply to Michael Tuexen from comment #57) I can yes, I'll add this to my stuff to resolv list :)
(In reply to Emmanuel Vadot from comment #58) Great. Thanks a lot!
(In reply to Michael Tuexen from comment #57)
It hangs because of the same issue with SetVariable. I think you should use the latest FW, which is mentioned in the SetVariable issue. Setting the RTC also uses SetVariable to save timezone info.
(In reply to Michael Tuexen from comment #53) I suggest you use the latest FW and try again.
(In reply to Tuan Phan from comment #62) OK. Will try tomorrow and report.
(In reply to Michael Tuexen from comment #56) The errors on pci14-16 are not from your igb card and should not affect your card, which is probably on a far lower-numbered bus/bridge/thingy. The Mellanox CX4 cards on the Packet instances are on pci1: https://dmesgd.nycbug.org/index.cgi?do=view&id=4864 and they work perfectly fine (in a LACP aggregation, even). The same errors are showing up on pci12-14 there. (2 less buses there — HR350A vs HR330A?)
By the way, a few questions for Tuan and/or John: - is there no hardware random number generator on eMAG? I see there was on X-Gene: https://github.com/torvalds/linux/blob/master/drivers/char/hw_random/xgene-rng.c but APMC0D18 is nowhere to be found in the DSDT I got from the Packet instance.. - does the CPU boost to the 3.3GHz speed without the OS doing anything? - is there public documentation for the monitoring (temperature, frequency)/PMU etc. devices, other than the GPL'ed Linux driver code? - why is the primary part number in MIDR zero? --- also, I just realized that we're not building ipmi_acpi on aarch64, and it does build..
(In reply to Greg V from comment #65) Greg, I can answer some questions: 1. why is the primary part number in MIDR zero? => We fixed a bug that the MIDR was put to the second DWORD if you are parsing from smbios type 4? 2. I don't think we have RNG in eMag. Not sure, let John confirm with designer. 3. I believe the CPU can boost to the 3.3Ghz without media needed from OS. Not sure, let John confirm with the power management maintainer. 4. John can help you with documents if it is available or provide support from designer.
(In reply to Michael Tuexen from comment #63) I can confirm that updating the Firmware to the version provided in bug #237808 resolves the issue with setting the time (via /etc/rc.conf or manually).
(In reply to Michael Tuexen from comment #53)
> /boot/entropy size=0x1000
> No valid device tree blob found!
> WARNING! Trying to fire up the kernel, but no device blob tree found!
> EFI framebuffer information:
> addr, size 0x430000000, 0x30000
> dimensions 1024 x 768
> stride 1024
> masks 0x00ff0000, 0x0000ff00, 0x000000ff, 0xff000000
> _
> Then the system was hanging. A reboot resolved the issue.

Did you see this issue with the new test FW?
(In reply to Tuan Phan from comment #68)
No, I haven't. Using the new firmware, the system runs fine (using the igb and mce interfaces). It only reports:
pci13: <PCI bus> on pcib13
pcib14: <Generic PCI host controller> on acpi0
pci14: <PCI bus> on pcib14
pcib15: <PCI-PCI bridge> at device 0.0 on pci14
pcib14: pci_host_generic_core_alloc_resource FAIL: type=4, rid=28, start=0000000010000000, end=0000000010000fff, count=0000000000001000, flags=0
pcib15: failed to allocate initial I/O port window: 0x10000000-0x10000fff
pci15: <PCI bus> on pcib15
pcib16: <PCI-PCI bridge> at device 0.0 on pci15
pcib14: pci_host_generic_core_alloc_resource FAIL: type=4, rid=28, start=0000000010000000, end=0000000010000fff, count=0000000000001000, flags=3000
pcib16: failed to allocate initial I/O port window: 0x10000000-0x10000fff
pci16: <PCI bus> on pcib16
pcib14: pci_host_generic_core_alloc_resource FAIL: type=4, rid=28, start=0000000010000000, end=0000000010000fff, count=0000000000001000, flags=3000
vgapci0: <VGA-compatible display> port 0x1000-0x107f mem 0x30000000-0x30ffffff,0x31040000-0x3105ffff at device 0.0 on pci16
cpu0: <ACPI CPU> on acpi0
uart0: <PrimeCell UART (PL011)> iomem 0x12600000-0x12600fff irq 1 on acpi0
uart0: console (115200,n,8,1)
uart1: <PrimeCell UART (PL011)> iomem 0x12610000-0x12610fff irq 2 on acpi0
acpi0: Could not update all GPEs: AE_NOT_CONFIGURED
during boot. But it doesn't seem to affect the system.
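For anyone comparing these FAIL lines across machines, here is a minimal Python sketch that pulls the fields out of a dmesg dump. It assumes the `type=` field is a FreeBSD `SYS_RES_*` number (3 = memory, 4 = I/O port), which is consistent with the "I/O port window" messages that follow each type=4 failure here. The function name, the dict keys and the regex are made up for illustration; they are not part of any FreeBSD tool.

```python
import re

# Assumption: type= is a FreeBSD SYS_RES_* number (3 = memory, 4 = I/O port).
RES_TYPES = {3: "memory", 4: "I/O port"}

# Matches the message format seen in this thread; rid is decimal,
# start/end/count are zero-padded hex without a 0x prefix.
FAIL_RE = re.compile(
    r"(?P<dev>\w+): pci_host_generic_core_alloc_resource FAIL: "
    r"type=(?P<type>\d+), rid=(?P<rid>\d+), "
    r"start=(?P<start>[0-9a-f]+), end=(?P<end>[0-9a-f]+), "
    r"count=(?P<count>[0-9a-f]+), flags=(?P<flags>[0-9a-f]+)"
)


def parse_alloc_failures(dmesg_text):
    """Return one dict per failed host-bridge allocation in dmesg_text."""
    failures = []
    for m in FAIL_RE.finditer(dmesg_text):
        rtype = int(m["type"])
        failures.append({
            "bridge": m["dev"],
            "kind": RES_TYPES.get(rtype, f"type {rtype}"),
            "rid": int(m["rid"]),
            "start": int(m["start"], 16),
            "end": int(m["end"], 16),
            "count": int(m["count"], 16),
        })
    return failures


if __name__ == "__main__":
    sample = (
        "pcib14: pci_host_generic_core_alloc_resource FAIL: type=4, rid=28, "
        "start=0000000010000000, end=0000000010000fff, "
        "count=0000000000001000, flags=0"
    )
    for f in parse_alloc_failures(sample):
        print(f"{f['bridge']}: {f['kind']} {f['start']:#x}-{f['end']:#x}")
```

The regex works on a flattened one-line dump as well as on a normal multi-line dmesg, since it does not anchor on line starts.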
I tried to enable console access via a serial line by putting
boot_multicons="YES"
boot_serial="YES"
console="comconsole,efi"
comconsole_speed="115200"
into /boot/loader.conf. Is this supposed to work with FreeBSD head (r348543)? It never works on my system and sometimes the system locks up during boot. Without these entries in /boot/loader.conf I have not observed such lockups anymore. I'm running the firmware from bug #237808.
From dmesg:
...
cpu0: <ACPI CPU> on acpi0
uart0: <PrimeCell UART (PL011)> iomem 0x12600000-0x12600fff irq 1 on acpi0
uart0: console (115200,n,8,1)
uart1: <PrimeCell UART (PL011)> iomem 0x12610000-0x12610fff irq 2 on acpi0
acpi0: Could not update all GPEs: AE_NOT_CONFIGURED
...
hm, looks like it is possible to identify the eMAG CPU: https://github.com/NetBSD/src/commit/a1feb17c3b45b52319a61e4f9c172e373b055bc2 + https://github.com/NetBSD/src/commit/74b0f2158a5c1fee10344fc3d995780a353570a2 btw, if anyone is interested in trying more stuff on eMAG (and other aarch64 HW): - AMD Radeon GPU driver https://github.com/FreeBSDDesktop/kms-drm/pull/154 - SBSA watchdog driver https://reviews.freebsd.org/D20974
Well here's a funny story… I've been experimenting with two aarch64 things: attaching the PMU (where P = Performance) via ACPI (to make pmcstat work) and building IPMI support.

On my Marvell MACCHIATObin, the PMU attaches on boot and the interrupt fires, but very rarely, so most of the time there's nothing in pmcstat, though occasionally a couple of lines did appear. That platform actually has some weirdness (the PMU interrupts a custom Marvell interrupt controller, and in ACPI mode the firmware catches that and rethrows onto the GICv2, or something like that) so it might be a firmware bug.

So I've rented an Ampere eMAG instance from Packet again to try a different ACPI platform and uhh. On boot, the PMU does not attach:
pmu0: rid 0 irq 23
pmu0: <Performance Monitoring Unit> irq 13 on acpi0
pmu0: could not allocate resources
But when I do `kldload ipmi` (!!!):
pmu0: <Performance Monitoring Unit> irq 13 on acpi0
and pmcstat does actually start working! Wait, what?!

Oh. I guess it's just reprobing all the drivers on unattached devices, but it looked so bizarre at first :D Evidently, I just put the PMU too early in the attachment order (BUS_PASS_INTERRUPT + BUS_PASS_ORDER_MIDDLE).

(and, ipmi does not attach because the i2c controller wasn't even attaching (https://reviews.freebsd.org/D21059), the i2c controller doesn't attach its children, and IPMI-over-i2c-described-by-ACPI is not supported anyway)
(In reply to Greg V from comment #72)
What is actually weird is that attaching pmu correctly in the boot process results in a ridiculous interrupt rate slowing the system down :(
# vmstat -i
interrupt                      total       rate
gic0,p7: pmu0             2397246676    2342967
gic0,p11:-ic_timer0         26390337      25793
gic0,s66: uart0                  502          0
gic0,s79: ahci0                 2071          2
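To spot this kind of interrupt storm without eyeballing the table, a small Python sketch along these lines could filter `vmstat -i` output. It assumes the last two whitespace-separated fields of each row are `total` and `rate`, as in the output above. The helper names and the 100000/s threshold are arbitrary choices for illustration.

```python
def parse_vmstat_i(text):
    """Parse `vmstat -i` output into (name, total, rate) tuples.

    Interrupt names may contain spaces (e.g. "gic0,p7: pmu0"), so the
    line is split from the right: the last two fields are total and rate.
    """
    rows = []
    for line in text.splitlines():
        parts = line.rsplit(None, 2)
        if len(parts) != 3:
            continue
        name, total, rate = parts
        if not (total.isdigit() and rate.isdigit()):
            continue  # header line or the shell prompt line
        rows.append((name.strip(), int(total), int(rate)))
    return rows


def noisy_sources(rows, max_rate=100_000):
    """Return names of interrupt sources whose rate exceeds max_rate."""
    return [name for name, _total, rate in rows if rate > max_rate]


if __name__ == "__main__":
    sample = """# vmstat -i
interrupt total rate
gic0,p7: pmu0 2397246676 2342967
gic0,p11:-ic_timer0 26390337 25793
gic0,s66: uart0 502 0
gic0,s79: ahci0 2071 2
"""
    print(noisy_sources(parse_vmstat_i(sample)))
```

With the numbers from this comment, only the pmu0 line crosses the threshold; the timer at roughly 25793/s stays below it.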
(In reply to Greg V from comment #71) A patch is in review D21314.
(In reply to Michael Tuexen from comment #74) Now committed in base r351511.
OpenBSD now has the IPMI over i2c thing: https://github.com/openbsd/src/commit/19146c2bc8b614f59695c154d0d659dca1394404 we could port that eventually
I have one eMAG server (not at packet.net). FreeBSD 13-CURRENT can be installed on the machine, but I have one question about shutdown: it causes a kernel crash. Here is my shutdown command:
$ uname -a
FreeBSD fbsd 13.0-CURRENT FreeBSD 13.0-CURRENT r351591 GENERIC arm64
[richliu@fbsd ~]$ sudo shutdown -h now
Here is a screenshot of the crash: https://imgur.com/a/nj29u5A
Does anyone have an idea how to avoid it?
(In reply to richliu from comment #77) I also have a physical machine in my lab. I'm using FreeBSD head on it and can run shutdown -p now without any problems. I recently (two weeks ago or so) updated the firmware. Which version are you running?
(In reply to Michael Tuexen from comment #78) My eMAG machine's model name is Raptor, and its latest software version is 1.00. All BMC/UEFI/firmware have been updated to this version. May I know your machine's model name and version?
(In reply to richliu from comment #79) I don't know Raptor (at least in the eMAG context). My machine is a Lenovo HR250A (https://amperecomputing.com/wp-content/uploads/2019/04/Lenovo_ThinkSystem_HR350A_20190409.pdf) which runs firmware version 1.10. You can find the dmesg at https://dmesgd.nycbug.org/index.cgi?do=view&id=5068
(In reply to Michael Tuexen from comment #80) It should be HR350A, not HR250A. I think the problem was caused by my using the wrong shutdown command. Booting the system from a USB disk, shutdown -p works on both the HR350A and the Raptor. I appreciate your help.
For reference I completed a full Poudriere bulk build on a Lenovo HR350A.

Kernel: FreeBSD 13.0-CURRENT FreeBSD 13.0-CURRENT 1d40d15b053-c262556(master) GENERIC-NODEBUG arm64
(This corresponds to r352103.)

Queued  Built  Failed  Skipped  Ignored  Remaining
32947   29075  131     2513     1228     0

Elapsed: 62:33:45
(There was a fairly long period of < 10 jobs finishing up at the end, with some tweaks I believe it can finish in under 60 hours.)

Three packages failed after building for more than 10 hours:
131 electron4-4.2.9 devel/electron4 build/timeout 0 runaway_process 24:22:24
119 qt5-webengine-5.12.2_3 www/qt5-webengine build/timeout 62 runaway_process 24:06:09
61 llvm-devel-10.0.d20190821 devel/llvm-devel package 2 ??? 21:34:09
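As a quick sanity check on those numbers, a short Python sketch (the variable and function names are mine, not Poudriere's) confirms that the columns add up to the queued total and derives the build success ratio and an average throughput from the elapsed time:

```python
# Figures copied from the bulk build summary in this comment.
counts = {
    "queued": 32947,
    "built": 29075,
    "failed": 131,
    "skipped": 2513,
    "ignored": 1228,
    "remaining": 0,
}


def accounted(c):
    """Sum of every outcome column; should equal the queued count."""
    return c["built"] + c["failed"] + c["skipped"] + c["ignored"] + c["remaining"]


def hms_to_seconds(hms):
    """Convert an 'HH:MM:SS' elapsed string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


if __name__ == "__main__":
    elapsed_hours = hms_to_seconds("62:33:45") / 3600
    print("columns add up:", accounted(counts) == counts["queued"])
    print(f"built: {100 * counts['built'] / counts['queued']:.1f}% of queued")
    print(f"average: {counts['built'] / elapsed_hours:.0f} packages built per hour")
```

The per-hour figure is only an average over the whole run; as noted above, the tail end with fewer than 10 jobs running drags it down.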
A commit references this bug:

Author: emaste
Date: Mon Sep 16 12:51:29 UTC 2019
New revision: 352388
URL: https://svnweb.freebsd.org/changeset/base/352388

Log:
MFC r346445: Enable ioremap for aarch64 in the LinuxKPI

Required for Mellanox drivers (e.g. on Ampere eMAG at Packet.com).

PR: 237055
Submitted by: Greg V <greg@unrelenting.technology>

Changes:
_U stable/12/
stable/12/sys/compat/linuxkpi/common/include/linux/io.h
stable/12/sys/compat/linuxkpi/common/src/linux_compat.c
Changes to MFC to stable/12:
r346996 (andrew)
r347343 (manu)

Also, commits to MFC for ThunderX2:
r340595 (jchandra)
r343876 (andrew)
A commit references this bug:

Author: andrew
Date: Mon Sep 16 13:45:32 UTC 2019
New revision: 352395
URL: https://svnweb.freebsd.org/changeset/base/352395

Log:
MFC r346996:
Restore x18 in efi_arch_leave.

Some UEFI implementations trash this register and, as we use it as a platform register, the kernel doesn't save it before calling into the UEFI runtime services. As we have a copy in tpidr_el1 restore from there when exiting the EFI environment.

PR: 237234, 237055
Reviewed by: manu
Tested On: Ampere eMAG
Sponsored by: DARPA, AFRL
Sponsored by: Ampere Computing (hardware)
Differential Revision: https://reviews.freebsd.org/D20127

Changes:
_U stable/12/
stable/12/sys/arm64/arm64/efirt_machdep.c
I believe merging the following revisions to 12.1 is necessary (but not sufficient) to boot on eMAG:
r339754 Distinguish _CID match and _HID match and make lower priority probe
r343860 pci_host_generic_acpi: use IORT data for MSI/MSI-X
r347343 Add support for USB 3.0 XHCI via ACPI
r347929 pci: ecam: Do not warn on mismatch of bus_end
r347930 pci: ecam: Correctly parse memory and IO region

For me releng/12.1 + these commits hangs after:
NFS ROOT: 10.0.0.1/tank/export-root/arm64
igb0: link state changed to UP
(In reply to Ed Maste from comment #86)
Presumably also:
r343853 arm64 acpi: Add support for IORT table
r343860 pci_host_generic_acpi: use IORT data for MSI/MSI-X
Is this PR still active or should it be closed? I'm running an Ampere eMAG system using head and it is pretty stable.
(In reply to Michael Tuexen from comment #88) I had hoped to MFC everything necessary for eMAG to work on 12.2, but wasn't able to get it done in time. We could keep this PR open for tracking, if we want to merge before 12.3. Otherwise IMO it can be closed.
(In reply to Ed Maste from comment #89) I guess 13.0 will be released before 12.3. I can live with Ampere systems being supported by 13.0...
I've just installed two of these machines in the FreeBSD cluster. They complain about this repeatedly:
```
uhub_reattach_port: device problem (USB_ERR_TIMEOUT), disabling port 1
uhub_reattach_port: port 1 reset failed, error=USB_ERR_TIMEOUT
```
Haven't fiddled with the configuration yet.
(In reply to Philip Paeps from comment #91)
I don't see this on the machine in my lab:
---<<BOOT>>---
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2020 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 13.0-CURRENT #37 r368141: Sun Nov 29 12:45:12 CET 2020
root@bsd6.fh-muenster.de:/usr/obj/usr/home/tuexen/head/arm64.aarch64/sys/TCP-NODEBUG arm64
FreeBSD clang version 11.0.0 (git@github.com:llvm/llvm-project.git llvmorg-11.0.0-0-g176249bd673)
VT(efifb): resolution 800x600
module firmware already present!
real memory = 137168117760 (130813 MB)
avail memory = 133693915136 (127500 MB)
Starting CPU 1 (1)
Starting CPU 2 (100)
Starting CPU 3 (101)
Starting CPU 4 (200)
Starting CPU 5 (201)
Starting CPU 6 (300)
Starting CPU 7 (301)
Starting CPU 8 (400)
Starting CPU 9 (401)
Starting CPU 10 (500)
Starting CPU 11 (501)
Starting CPU 12 (600)
Starting CPU 13 (601)
Starting CPU 14 (700)
Starting CPU 15 (701)
Starting CPU 16 (800)
Starting CPU 17 (801)
Starting CPU 18 (900)
Starting CPU 19 (901)
Starting CPU 20 (a00)
Starting CPU 21 (a01)
Starting CPU 22 (b00)
Starting CPU 23 (b01)
Starting CPU 24 (c00)
Starting CPU 25 (c01)
Starting CPU 26 (d00)
Starting CPU 27 (d01)
Starting CPU 28 (e00)
Starting CPU 29 (e01)
Starting CPU 30 (f00)
Starting CPU 31 (f01)
FreeBSD/SMP: Multiprocessor System Detected: 32 CPUs
random: unblocking device.
random: entropy device external interface
MAP 92000000 mode 2 pages 2304
MAP fffc0000 mode 2 pages 64
MAP 9ff32b0000 mode 2 pages 48
MAP 9ff80f0000 mode 2 pages 16
MAP 9ff8830000 mode 2 pages 1232
MAP 9ffa540000 mode 2 pages 16
MAP 9ffcac0000 mode 2 pages 80
MAP 9ffcb10000 mode 2 pages 128
MAP 9ffcb90000 mode 2 pages 16
MAP 9ffcba0000 mode 2 pages 32
MAP 9ffcbc0000 mode 2 pages 16
MAP 9ffcbd0000 mode 2 pages 32
MAP 9ffcbf0000 mode 2 pages 16
MAP 9ffcc00000 mode 2 pages 32
MAP 9ffcc20000 mode 2 pages 16
MAP 9ffcc30000 mode 2 pages 48
MAP 9ffcc60000 mode 2 pages 16
MAP 9ffcc70000 mode 2 pages 16
MAP 9ffcc80000 mode 2 pages 32
MAP 9ffcca0000 mode 2 pages 16
MAP 9ffccb0000 mode 2 pages 16
MAP 9ffccc0000 mode 2 pages 16
MAP 9ffccd0000 mode 2 pages 1232
MAP 9ffd1a0000 mode 2 pages 48
MAP 9ffd1d0000 mode 2 pages 4112
MAP 9fffd80000 mode 2 pages 32
MAP 9fffda0000 mode 2 pages 48
MAP 10540000 mode 0 pages 16
WARNING: Device "kbd" is Giant locked and may be deleted before FreeBSD 13.0.
kbd0 at kbdmux0
WARNING: Device "openfirm" is Giant locked and may be deleted before FreeBSD 13.0.
acpi0: <ALASKA A M I >
acpi0: Power Button (fixed)
acpi0: Sleep Button (fixed)
acpi0: Could not update all GPEs: AE_NOT_CONFIGURED
psci0: <ARM Power State Co-ordination Interface Driver> on acpi0
gic0: <ARM Generic Interrupt Controller v3.0> iomem 0x78000000-0x7801ffff,0x78400000-0x787fffff on acpi0
its0: <ARM GIC Interrupt Translation Service> on gic0
generic_timer0: <ARM Generic Timer> irq 11,12,13 on acpi0
Timecounter "ARM MPCore Timecounter" frequency 40000000 Hz quality 1000
Event timer "ARM MPCore Eventtimer" frequency 40000000 Hz quality 1000
efirtc0: <EFI Realtime Clock>
efirtc0: registered as a time-of-day clock, resolution 1.000000s
ahci0: <AHCI SATA controller> iomem 0x1c000000-0x1c000fff irq 3 on acpi0
ahci0: AHCI v1.31 with 2 6Gbps ports, Port Multiplier not supported with FBS
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich1: <AHCI channel> at channel 1 on ahci0
ahci1: <AHCI SATA controller> iomem 0x1c100000-0x1c100fff irq 4 on acpi0
ahci1: AHCI v1.31 with 2 6Gbps ports, Port Multiplier not supported with FBS
ahcich2: <AHCI channel> at channel 0 on ahci1
ahcich3: <AHCI channel> at channel 1 on ahci1
xhci0: <Generic USB 3.0 controller> iomem 0x13800000-0x138fffff irq 5 on acpi0
xhci0: 64 bytes context size, 32-bit DMA
usbus0 on xhci0
xhci1: <Generic USB 3.0 controller> iomem 0x13900000-0x139fffff irq 6 on acpi0
xhci1: 64 bytes context size, 32-bit DMA
usbus1 on xhci1
acpi_button0: <Power Button> on acpi0
apei0: <ACPI Platform Error Interface> on acpi0
pcib0: <Generic PCI host controller> on acpi0
pci0: <PCI bus> on pcib0
pcib1: <PCI-PCI bridge> at device 0.0 on pci0
pci1: <PCI bus> on pcib1
pci1: <network, ethernet> at device 0.0 (no driver attached)
pci1: <network, ethernet> at device 0.1 (no driver attached)
pcib2: <Generic PCI host controller> on acpi0
pci2: <PCI bus> on pcib2
pcib3: <PCI-PCI bridge> at device 0.0 on pci2
pci3: <PCI bus> on pcib3
pcib4: <Generic PCI host controller> on acpi0
pci4: <PCI bus> on pcib4
pcib5: <PCI-PCI bridge> at device 0.0 on pci4
pci5: <PCI bus> on pcib5
igb0: <Intel(R) PRO/1000 PCI-Express Network Driver> mem 0x30100000-0x301fffff,0x30200000-0x30203fff at device 0.0 on pci5
igb0: Using 1024 TX descriptors and 1024 RX descriptors
igb0: Using 4 RX queues 4 TX queues
igb0: Using MSI-X interrupts with 5 vectors
igb0: Ethernet address: 68:05:ca:92:c5:41
pcib6: <Generic PCI host controller> on acpi0
pci6: <PCI bus> on pcib6
pcib7: <PCI-PCI bridge> at device 0.0 on pci6
pci7: <PCI bus> on pcib7
pcib8: <Generic PCI host controller> on acpi0
pci8: <PCI bus> on pcib8
pcib9: <PCI-PCI bridge> at device 0.0 on pci8
pci9: <PCI bus> on pcib9
pcib10: <Generic PCI host controller> on acpi0
pci10: <PCI bus> on pcib10
pcib11: <PCI-PCI bridge> at device 0.0 on pci10
pci11: <PCI bus> on pcib11
pcib12: <Generic PCI host controller> on acpi0
pci12: <PCI bus> on pcib12
pcib13: <PCI-PCI bridge> at device 0.0 on pci12
pci13: <PCI bus> on pcib13
pcib14: <Generic PCI host controller> on acpi0
pci14: <PCI bus> on pcib14
pcib15: <PCI-PCI bridge> at device 0.0 on pci14
pcib14: Failed to translate resource 10000000-10000fff type 4 for pcib15
pcib15: failed to allocate initial I/O port window: 0x10000000-0x10000fff
pci15: <PCI bus> on pcib15
pcib16: <PCI-PCI bridge> at device 0.0 on pci15
pcib14: Failed to translate resource 10000000-10000fff type 4 for pcib15
pcib16: failed to allocate initial I/O port window: 0x10000000-0x10000fff
pci16: <PCI bus> on pcib16
pcib14: Failed to translate resource 10000000-10000fff type 4 for pcib15
vgapci0: <VGA-compatible display> port 0-0x7f mem 0x30000000-0x30ffffff,0x31040000-0x3105ffff at device 0.0 on pci16
cpu0: <ACPI CPU> on acpi0
uart0: <PrimeCell UART (PL011)> iomem 0x12600000-0x12600fff irq 1 on acpi0
uart0: console (115200,n,8,1)
uart1: <PrimeCell UART (PL011)> iomem 0x12610000-0x12610fff irq 2 on acpi0
cryptosoft0: <software crypto>
Timecounters tick every 1.000 msec
Attempting to load tcp_bbr
usbus0: 5.0Gbps Super Speed USB v3.0
usbus1: 5.0Gbps Super Speed USB v3.0
tcp_bbr is now available
ipfw2 (+ipv6) initialized, divert loadable, nat loadable, default to accept, logging disabled
TCP Hpts created 32 swi interrupt threads and bound 0 to cpus
Release APs...done
CPU 0: APM eMAG 8180 r3p2 affinity: 0 0
Cache Type = <64 byte D-cacheline,64 byte I-cacheline,PIPT ICache,64 byte ERG,64 byte CWG>
Instruction Set Attributes 0 = <CRC32,SHA2,SHA1,AES+PMULL>
Instruction Set Attributes 1 = <>
Processor Features 0 = <GIC,AdvSIMD,FP,EL3,EL2,EL1 32,EL0 32>
Processor Features 1 = <>
Memory Model Features 0 = <TGran4,TGran64,TGran16,SNSMem,BigEnd,16bit ASID,4TB PA>
Memory Model Features 1 = <8bit VMID>
Memory Model Features 2 = <32bit CCIDX,48bit VA>
Debug Features 0 = <2 CTX BKPTs,4 Watchpoints,6 Breakpoints,PMUv3,Debugv8>
Debug Features 1 = <>
Auxiliary Features 0 = <>
Auxiliary Features 1 = <>
CPU 1: APM eMAG 8180 r3p2 affinity: 0 1
CPU 2: APM eMAG 8180 r3p2 affinity: 1 0
CPU 3: APM eMAG 8180 r3p2 affinity: 1 1
CPU 4: APM eMAG 8180 r3p2 affinity: 2 0
CPU 5: APM eMAG 8180 r3p2 affinity: 2 1
CPU 6: APM eMAG 8180 r3p2 affinity: 3 0
CPU 7: APM eMAG 8180 r3p2 affinity: 3 1
CPU 8: APM eMAG 8180 r3p2 affinity: 4 0
CPU 9: APM eMAG 8180 r3p2 affinity: 4 1
CPU 10: APM eMAG 8180 r3p2 affinity: 5 0
CPU 11: APM eMAG 8180 r3p2 affinity: 5 1
CPU 12: APM eMAG 8180 r3p2 affinity: 6 0
CPU 13: APM eMAG 8180 r3p2 affinity: 6 1
CPU 14: APM eMAG 8180 r3p2 affinity: 7 0
CPU 15: APM eMAG 8180 r3p2 affinity: 7 1
CPU 16: APM eMAG 8180 r3p2 affinity: 8 0
CPU 17: APM eMAG 8180 r3p2 affinity: 8 1
CPU 18: APM eMAG 8180 r3p2 affinity: 9 0
CPU 19: APM eMAG 8180 r3p2 affinity: 9 1
CPU 20: APM eMAG 8180 r3p2 affinity: 10 0
CPU 21: APM eMAG 8180 r3p2 affinity: 10 1
CPU 22: APM eMAG 8180 r3p2 affinity: 11 0
CPU 23: APM eMAG 8180 r3p2 affinity: 11 1
CPU 24: APM eMAG 8180 r3p2 affinity: 12 0
CPU 25: APM eMAG 8180 r3p2 affinity: 12 1
CPU 26: APM eMAG 8180 r3p2 affinity: 13 0
CPU 27: APM eMAG 8180 r3p2 affinity: 13 1
CPU 28: APM eMAG 8180 r3p2 affinity: 14 0
CPU 29: APM eMAG 8180 r3p2 affinity: 14 1
CPU 30: APM eMAG 8180 r3p2 affinity: 15 0
CPU 31: APM eMAG 8180 r3p2 affinity: 15 1
TCP_ratelimit: Is now initialized
Trying to mount root from ufs:/dev/ada0p3 [rw]...
ugen0.1: <Generic XHCI root HUB> at usbus0
ugen1.1: <Generic XHCI root HUB> at usbus1
Root mount waiting for:uhub0 CAM usbus0 usbus1
on usbus0
uhub0: <Generic XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
uhub1 on usbus1
uhub1: <Generic XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus1
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <Samsung SSD 860 EVO 500GB RVT02B6Q> ACS-4 ATA SATA 3.x device
ada0: Serial Number S3Z2NB0M352023L
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada0: Command Queueing enabled
ada0: 476940MB (976773168 512 byte sectors)
uhub0: 1 port with 1 removable, self powered
uhub1: 1 port with 1 removable, self powered
ugen1.2: <vendor 0x04b4 product 0x6560> at usbus1
uhub2 on uhub1
uhub2: <vendor 0x04b4 product 0x6560, class 9/0, rev 2.00/90.15, addr 1> on usbus1
ugen0.2: <American Megatrends Inc. Virtual Hub> at usbus0
uhub3 on uhub0
uhub3: <7-port Hub> on usbus0
Root mount waiting for: usbus0 usbus1
uhub2: 4 ports with 4 removable, self powered
uhub3: 5 ports with 5 removable, self powered
Root mount waiting for: usbus0
ugen0.3: <American Megatrends Inc. Virtual Cdrom Device> at usbus0
umass0 on uhub3
umass0: <Virtual Cdrom> on usbus0
cd0 at umass-sim0 bus 0 scbus4 target 0 lun 0
cd0: <AMI Virtual CDROM0 1.00> Removable CD-ROM SCSI device
cd0: Serial Number AAAABBBBCCCC1
cd0: 40.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present
cd0: quirks=0x10<10_BYTE_ONLY>
Root mount waiting for: usbus0
ugen0.4: <American Megatrends Inc. Virtual HardDisk Device> at usbus0
umass1 on uhub3
umass1: <Virtual HardDisk> on usbus0
da0 at umass-sim1 bus 1 scbus5 target 0 lun 0
da0: <AMI Virtual HDisk0 1.00> Removable Direct Access SCSI device
da0: Serial Number AAAABBBBCCCC3
da0: 40.000MB/s transfers
da0: Attempt to query device size failed: NOT READY, Medium not present
da0: quirks=0x2<NO_6_BYTE>
ugen0.5: <American Megatrends Inc. Virtual Keyboard and Mouse> at usbus0
ukbd0 on uhub3
ukbd0: <Keyboard Interface> on usbus0
kbd1 at ukbd0
mountroot: waiting for device /dev/ada0p3...
Dual Console: Video Primary, Serial Secondary
lo0: link state changed to UP
ums0 on uhub3
ums0: <Mouse Interface> on usbus0
ums0: 3 buttons and [Z] coordinates ID=0
igb0: link state changed to UP
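A side note on reading the CPU lines in the boot log above: the value in parentheses after "Starting CPU N" appears to be the core's MPIDR affinity in hex, which lines up with the "affinity: a b" pairs printed later. A minimal Python sketch of that decoding, assuming the standard ARMv8 MPIDR_EL1 layout (Aff0 in bits 7:0, Aff1 in bits 15:8, Aff2 in bits 23:16); the function name is made up for illustration:

```python
def decode_affinity(mpidr_hex):
    """Split a hex MPIDR value into (Aff2, Aff1, Aff0).

    Assumption: the string is the hex number printed in parentheses by
    the "Starting CPU N (...)" lines, without a 0x prefix.
    """
    value = int(mpidr_hex, 16)
    aff0 = value & 0xFF          # core within the pair
    aff1 = (value >> 8) & 0xFF   # pair (cluster) number
    aff2 = (value >> 16) & 0xFF  # unused on this machine, always 0 here
    return aff2, aff1, aff0


if __name__ == "__main__":
    # A few values taken from the log: CPU 2, CPU 21 and CPU 31.
    for cpu, mpidr in [(2, "100"), (21, "a01"), (31, "f01")]:
        _aff2, aff1, aff0 = decode_affinity(mpidr)
        print(f"CPU {cpu}: affinity: {aff1} {aff0}")
```

Read this way, the 32 cores show up as 16 pairs (Aff1 0 through 15) of two cores each (Aff0 0 and 1), which matches the "affinity" column in the log.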
It looks like they've stopped doing this now. Sorry for the noise. Our CLUSTER13 configuration was lagging quite a bit behind GENERIC. I'll keep an eye on it. If it happens again ... I'll get some more useful debugging data out.
13.0-RELEASE runs fine on these boxes, certainly on recent firmware. I think there are a few remaining drivers/patches lurking out there, but we could track that on the wiki. I added a page https://wiki.freebsd.org/arm/Ampere just now.