I run the GENERIC kernel of 12.0-STABLE (FreeBSD 12.0-STABLE #1 r345004M) on different hardware. On one type I must revert commit r340224, otherwise the kernel hangs at boot without giving any message. This hardware is:

CPU: Intel(R) Xeon(TM) CPU 3.60GHz (3591.07-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0xf43  Family=0xf  Model=0x4  Stepping=3
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x659d<SSE3,DTES64,MON,DS_CPL,EST,TM2,CNXT-ID,CX16,xTPR>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  TSC: P-state invariant
real memory  = 8589934592 (8192 MB)
avail memory = 8287981568 (7904 MB)
Event timer "LAPIC" quality 100
ACPI APIC Table: <A M I OEMAPIC >
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 2 package(s) x 1 core(s) x 2 hardware threads

From pciconf -lv:

vgapci0@pci0:5:12:0: class=0x030000 card=0x10798086 chip=0x47521002 rev=0x27 hdr=0x00
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Rage 3 [Rage XL PCI]'
    class      = display
    subclass   = VGA

I use kern.vty=sc, but the same problem occurs with vt. The problem is also independent of the loader (forth or lua).
The reverted commit r340224 is: MFC r339979: Add pci_early function to detect Intel stolen memory.
Can you show the output of 'pciconf -lvcb' ?
Created attachment 203352 [details] Output of pciconf -lvcb
Are you running i386 or amd64? Select the right file, sys/i386/pci/pci_cfgreg.c or sys/amd64/pci/pci_cfgreg.c, find the pci_cfgregopen() function, and remove the 'case 0x3590:' line. Does it help? Also, please show the verbose dmesg from a successful boot.
Created attachment 203384 [details] Output serial console of full verbose boot

I use amd64; the commit r340224 (r339979) only affects amd64. Removing the line with 'case 0x3590' helps, the kernel now boots fine. With the patch

--- pci_cfgreg.c.orig	2018-11-26 16:43:04.706033000 +0100
+++ pci_cfgreg.c	2019-04-04 11:30:51.847357000 +0200
@@ -90,13 +90,15 @@
 	 * This also implies that it can do PCIe extended config cycles.
 	 */
 
+	printf("pci_cfgregopen called\n");
 	/* Check for supported chipsets */
 	vid = pci_cfgregread(0, 0, 0, PCIR_VENDOR, 2);
+	printf("pci_cfgregopen: vid=%x\n", vid);
 	did = pci_cfgregread(0, 0, 0, PCIR_DEVICE, 2);
+	printf("pci_cfgregopen: vid=%x\n", did);
 	switch (vid) {
 	case 0x8086:
 		switch (did) {
-		case 0x3590:
 		case 0x3592:
 			/* Intel 7520 or 7320 */
 			pciebar = pci_cfgregread(0, 0, 0, 0xce, 2) << 16;
@@ -112,6 +114,7 @@
 		}
 	}
 
+	printf("pci_cfgregopen returns\n");
 	return (1);
 }

together with a 'printf(Calling pci_early_quirks())' in machdep.c and setting "debug.late_console=0" in loader.conf, I got the attached output on the serial console.
(In reply to longwitz from comment #5) I think it is not the read of register 0xce that causes the hang, but the need to map a very large (255MB) region by chomping from virtual_avail, which causes the breakage. You can recheck this by keeping your debugging printfs but reverting the removal of the case line. Your dmesg shows the "PCIe: Memory Mapped configuration base @..." line, so the memory-mapped config access method works, and this is what I was looking for when asking for the dmesg. Please try the attached patch; if my understanding is right, it should be the proper fix.
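To illustrate why the memory-mapped window is so large, here is a minimal sketch of the standard PCIe memory-mapped (ECAM) layout, in which every function gets 4 KB of config space, so each bus consumes 1 MB of address space. The helper names ecam_offset() and ecam_window_size() are hypothetical and are not taken from pci_cfgreg.c; only the bit layout comes from the PCIe specification. A window covering nearly all of the 256 possible buses lands in the range of the 255MB mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* ECAM layout: 4 KB per function, 8 functions per device, 32 devices per bus. */
#define ECAM_FUNC_SHIFT	12	/* 4096 bytes of config space per function */
#define ECAM_SLOT_SHIFT	15	/* 8 functions  => 32 KB per device */
#define ECAM_BUS_SHIFT	20	/* 32 devices   => 1 MB per bus */

/* Hypothetical helper: byte offset of a register inside the ECAM window. */
static uint64_t
ecam_offset(unsigned bus, unsigned slot, unsigned func, unsigned reg)
{
	assert(bus < 256 && slot < 32 && func < 8 && reg < 4096);
	return (((uint64_t)bus << ECAM_BUS_SHIFT) |
	    ((uint64_t)slot << ECAM_SLOT_SHIFT) |
	    ((uint64_t)func << ECAM_FUNC_SHIFT) | reg);
}

/* Hypothetical helper: address space needed to cover buses minbus..maxbus. */
static uint64_t
ecam_window_size(unsigned minbus, unsigned maxbus)
{
	assert(minbus <= maxbus && maxbus < 256);
	return ((uint64_t)(maxbus - minbus + 1) << ECAM_BUS_SHIFT);
}
```

For example, ecam_offset(5, 12, 0, 0) gives the offset of the vgapci0 device from the original report (pci0:5:12:0) inside such a window, and ecam_window_size(0, 255) is the full 256 MB that would have to be mapped before any register in it can be read.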
Created attachment 203387 [details] Do not use memory-mapped config space access for PCIe on older chipsets until pmap is ready to create the mapping.
Created attachment 203388 [details] Do not use memory-mapped config space access for PCIe on older chipsets until pmap is ready to create the mapping.
I can confirm that the patch based on the variable pmap_initialized works for my older hardware with vid=0x8086 and did=0x3590. The hang without the patch was in the function pcie_cfgregopen(). For my other servers with vid=0x8086 and did=0x25d8 (E5420), the check for pmap_initialized also triggers; I suppose this is OK.
I put up a review to get some more eyes on this patch: https://reviews.freebsd.org/D19833
A commit references this bug:

Author: kib
Date: Tue Apr 9 18:07:18 UTC 2019
New revision: 346062
URL: https://svnweb.freebsd.org/changeset/base/346062

Log:
  pci_cfgreg.c: Use io port config access for early boot time.

  Some early PCIe chipsets are explicitly listed in the white-list to
  enable use of the MMIO config space accesses, perhaps because ACPI
  tables were not reliable source of the base MCFG address at that time.
  For that chipsets, MCFG base was read from the known chipset MCFGbase
  config register.

  During very early stage of boot, when access to the PCI config space
  is performed (see e.g. pci_early_quirks.c), we cannot map 255MB of
  registers because the method used with pre-boot pmap overflows initial
  kernel page tables.

  Move fallback to read MCFGbase to the attachment method of the
  x86/legacy device, which removes code duplication, and results in the
  use of io accesses until MCFG is parsed or legacy attach called.

  For amd64, pre-initialize cfgmech with CFGMECH_1, right now we
  dynamically assign CFGMECH_1 to it anyway, and remove checks for
  CFGMECH_NONE.

  There is a mention in the Intel documentation for corresponding
  chipsets that OS must use either io port or MMIO access method, but we
  already break this rule by reading MCFGbase register, so one more
  access seems to be innocent.

  Reported by: longwitz@incore.de
  PR: 236838
  Reviewed by: avg (other version), jhb
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Differential revision: https://reviews.freebsd.org/D19833

Changes:
  head/sys/amd64/pci/pci_cfgreg.c
  head/sys/i386/pci/pci_cfgreg.c
  head/sys/x86/include/pci_cfgreg.h
  head/sys/x86/x86/legacy.c
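The io port access method the log refers to is the classic PCI configuration mechanism #1, which needs no mapping at all and is therefore safe during early boot. Below is a minimal sketch of how its address word is formed and how an access method could be chosen. The function names and the cfg_method enum are hypothetical and do not reproduce the committed code; only the 0xcf8/0xcfc port numbers and the bit layout are the standard mechanism #1 definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CONF1_ADDR_PORT	0x0cf8		/* address word is written here */
#define CONF1_DATA_PORT	0x0cfc		/* data is read/written from here */
#define CONF1_ENABLE	0x80000000u	/* bit 31: enable config cycle */

/* Hypothetical names for the two access methods discussed in the log. */
enum cfg_method { CFG_IOPORT, CFG_MMIO };

/* Address word for mechanism #1; the register is aligned down to a dword. */
static uint32_t
conf1_address(unsigned bus, unsigned slot, unsigned func, unsigned reg)
{
	assert(bus < 256 && slot < 32 && func < 8 && reg < 256);
	return (CONF1_ENABLE | (bus << 16) | (slot << 11) | (func << 8) |
	    (reg & ~0x03u));
}

/* Data port for the access; the low two register bits select the byte lane. */
static uint16_t
conf1_data_port(unsigned reg)
{
	return ((uint16_t)(CONF1_DATA_PORT + (reg & 0x03)));
}

/*
 * Assumed policy, following the log: use io ports until the memory-mapped
 * window has actually been set up, and switch to MMIO only afterwards.
 */
static enum cfg_method
choose_method(bool mmio_window_mapped)
{
	return (mmio_window_mapped ? CFG_MMIO : CFG_IOPORT);
}
```

With this layout, the 2-byte read of register 0xce on device 0:0:0 from the earlier debugging patch would use address word 0x800000cc and data port 0xcfe.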
A commit references this bug:

Author: kib
Date: Tue Apr 16 17:16:19 UTC 2019
New revision: 346284
URL: https://svnweb.freebsd.org/changeset/base/346284

Log:
  MFC r346062:
  pci_cfgreg.c: Use io port config access for early boot time.

  PR: 236838

Changes:
_U  stable/12/
  stable/12/sys/amd64/pci/pci_cfgreg.c
  stable/12/sys/i386/pci/pci_cfgreg.c
  stable/12/sys/x86/include/pci_cfgreg.h
  stable/12/sys/x86/x86/legacy.c