Bug 234676 - Boot hangs after upgrade to 12.0-RELEASE (acpi?)
Summary: Boot hangs after upgrade to 12.0-RELEASE (acpi?)
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 12.0-RELEASE
Hardware: amd64 Any
: --- Affects Only Me
Assignee: freebsd-bugs mailing list
URL:
Keywords: crash, needs-qa, regression
Depends on:
Blocks:
 
Reported: 2019-01-07 05:50 UTC by johan van Zanten
Modified: 2019-01-19 22:21 UTC (History)
1 user (show)

See Also:


Attachments
dmesg from last successful boots, 11.2 AND 12.0 (23.46 KB, text/plain)
2019-01-07 06:57 UTC, johan van Zanten
no flags Details
acpidump output from 11.2 kernel on machine that can't boot 12.0 (23.20 KB, application/gzip)
2019-01-19 22:17 UTC, johan van Zanten
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description johan van Zanten 2019-01-07 05:50:44 UTC
Hardware:
 motherboard: MS-7607, sold as Acer Veriton S480G
 BIOS: AMI P01-A01 (no EFI)
 CPU: Intel Core2 Quad9400 (LGA775)
 boot device: SATA-connected SSD, UFS

The S480G manual says:
 ACPI specification 1.0b
 S0,S1,S2 and S5 sleep state support.

This system was running a GENERIC 11.2-RELEASE kernel without any problems.

I upgraded using the method:

freebsd-update -r 12.0-RELEASE upgrade
freebsd-update install
reboot
freebsd-update install
pkg update
pkg upgrade
reboot

^ After this *second* reboot, the system would not boot -- it hangs before mounting root; the console messages suggest it's during ACPI-related functions.
(I don't understand why the first reboot didn't also fail -- i believe the new kernel would have loaded for the first time, then.  Perhaps something significant changed during the second part of the upgrade.)

Disabling ACPI in either the BIOS or in the boot menu causes a (different) kernel panic at boot.

During a verbose boot, sometimes the last lines are these two lines are repeated five to seven times:

Table 'FACP' at 0xbdd90200
FACP: Found table at at 0xbdd90200

On other occasions, after the "FACP" lines, these also appear:

acpi0: reservation of ffc00000, 300000 (3) failed
acpi0: reservation of fee00000, 300000 (3) failed
acpi0: reservation of 0, a00000 (3) failed
acpi0: reservation of 100000, bdd00000 (3) failed


I'm going to experiment with some BIOS settings. If those don't help, i'll try a debug kernel, and then tweaking some kernel settings.
Comment 1 Kubilay Kocak freebsd_committer freebsd_triage 2019-01-07 05:55:12 UTC
Thank you for the report Johan.

If possible (and reproducible) could you please include a panic backtrace when ACPI is disabled (ideally after building/loading a debug kernel)
Comment 2 johan van Zanten 2019-01-07 06:57:46 UTC
Created attachment 200857 [details]
dmesg from last successful boots, 11.2 AND 12.0

I've booted the machine from a FreeBSD CD-ROM and collected /var/log/dmesg.today from the 12.0-RELEASE installed file system.

Please note that this file contains dmesg from two boots: the last under 11.2-RELEASE, and the first with 12.0-RELEASE.  After this 12.0-RELEASE boot, the server has been unable to boot, probably as a result of something that changed after the second "freebsd-update install" step.
Comment 3 johan van Zanten 2019-01-07 07:00:08 UTC
Hello Kubilay,

I've got a debug kernel installed, and will try using it to collect more debugging later this evening (CET). (Gotta head to ${dayjob} now.)
Comment 4 johan van Zanten 2019-01-12 19:39:37 UTC
OK, i've got a debugging kernel that includes "options  DDB" deployed to the afflicted server, and using a serial console and voila:

Loading kernel...
/boot/kernel/kernel text=0x16a9bd0 data=0x1d2088+0x76c0c0 syms=[0x8+0x17d498+0x8+0x19a4da]
Loading configured modules...
/boot/entropy size=0x1000
KDB: debugger backends: ddb
KDB: current backend: ddb
---<<BOOT>>---
APIC: Could not find any APICs.
panic: running without device atpic requires a local APIC
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffffff82665b70
vpanic() at vpanic+0x1a3/frame 0xffffffff82665bd0
panic() at panic+0x43/frame 0xffffffff82665c30
apic_init() at apic_init+0x112/frame 0xffffffff82665c50
mi_startup() at mi_startup+0x118/frame 0xffffffff82665c70
btext() at btext+0x2c
KDB: enter: panic
[ thread pid 0 tid 0 ]
Stopped at      kdb_enter+0x3b: movq    $0,kdb_why


---
I disable APIC at the boot-time menu.
Comment 5 johan van Zanten 2019-01-12 19:54:15 UTC
I noticed the traceback was similar to 222908 and so i tried adding this to /boot/loader.conf:

machdep.disable_msix_migration=1

The next boot of the 12.0-RELEASE kernel did get further before hanging.  Below is the output up to the hang of verbose boot.  I may be missing some from the top.  Sending a break (via tip's ~#) does not drop to the debugger -- it just prints one "~" the screen. :(



0x0000000000103000 - 0x00000000001fffff, 1036288 bytes (253 pages)
0x0000000002800000 - 0x00000000bdd8ffff, 3143172096 bytes (767376 pages)
0x0000000100000000 - 0x0000000231aaefff, 5128253440 bytes (1252015 pages)
avail memory = 8238747648 (7857 MB)
intel stolen mem: base rxbe000000 size 32 MB
MADT: Found CPU APIC ID 0 ACPI ID 1: enabled
SMP: Added CPU 0 (AP)
MADT: Found CPU APIC ID 1 ACPI ID 2: enabled
SMP: Added CPU 1 (AP)
MADT: Found CPU APIC ID 2 ACPI ID 3: enabled
SMP: Added CPU 2 (AP)
MADT: Found CPU APIC ID 3 ACPI ID 4: enabled
SMP: Added CPU 3 (AP)
Event timer "LAPIC" quality 100
LAPIC: ipi_wait() us multiplier 64 (r 4100088 tsc 2660056792)
ACPI APIC Table: <060410 APIC1553>
Package ID shift: 2
L2 cache ID shift: 1
L1 cache ID shift: 0
Core ID shift: 0
INTR: Adding local APIC 1 as a target
INTR: Adding local APIC 2 as a target
INTR: Adding local APIC 3 as a target
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
Package HW ID = 0
        Core HW ID = 0
                CPU0 (BSP): APIC ID: 0
        Core HW ID = 1
                CPU1 (AP): APIC ID: 1
        Core HW ID = 2
                CPU2 (AP): APIC ID: 2
        Core HW ID = 3
                CPU3 (AP): APIC ID: 3
APIC: CPU 0 has ACPI ID 1
APIC: CPU 1 has ACPI ID 2
APIC: CPU 2 has ACPI ID 3
APIC: CPU 3 has ACPI ID 4
x86bios:  IVT 0x000000-0x0004ff at 0xfffff80000000000
x86bios: SSEG 0x010000-0x010fff at 0xfffffe00025ec000
x86bios: EBDA 0x08e000-0x09ffff at 0xfffff8000008e000
x86bios:  ROM 0x0a0000-0x0fefff at 0xfffff800000a0000
Pentium Pro MTRR support enabled
random: read 3840 bytes from preloaded cache
random: unblocking device.
arc4random: read 32 bytes from preloaded cache
VIMAGE (virtualized network stack) enabled
ULE: setup cpu 0
ULE: setup cpu 1
ULE: setup cpu 2
ULE: setup cpu 3
ACPI: RSDP 0x00000000000FA620 000014 (v00 ACPIAM)
ACPI: RSDT 0x00000000BDD90000 000048 (v01 060410 RSDT1553 20100604 MSFT 00000097)
ACPI: FACP 0x00000000BDD90200 000084 (v01 060410 FACP1553 20100604 MSFT 00000097)
ACPI: DSDT 0x00000000BDD90440 007019 (v01 1AAAA  1AAAA000 00000000 INTL 20051117)
ACPI: FACS 0x00000000BDD9E000 000040
ACPI: APIC 0x00000000BDD90390 00006C (v01 060410 APIC1553 20100604 MSFT 00000097)
ACPI: MCFG 0x00000000BDD90400 00003C (v01 060410 OEMMCFG  20100604 MSFT 00000097)
ACPI: OEMB 0x00000000BDD9E040 000072 (v01 060410 OEMB1553 20100604 MSFT 00000097)
ACPI: HPET 0x00000000BDD9A440 000038 (v01 060410 OEMHPET  20100604 MSFT 00000097)
ACPI: ASF! 0x00000000BDD9A480 000099 (v32 LEGEND I865PASF 00000001 INTL 20051117)
ACPI: GSCI 0x00000000BDD9E0C0 002024 (v01 060410 GMCHSCI  20100604 MSFT 00000097)
ACPI: AWMI 0x00000000BDDA00F0 00004E (v01 060410 OEMB1553 20100604 MSFT 00000097)
ACPI: SSDT 0x00000000BDDA0820 000363 (v01 DpgPmm CpuPm    00000012 INTL 20051117)
MADT: Found IO APIC ID 4, Interrupt 0 at 0xfec00000
ioapic0: ver 0x20 maxredir 0x17
ioapic0: Routing external 8259A's -> intpin 0
MADT: Interrupt override: source 0, irq 2
ioapic0: Routing IRQ 0 -> intpin 2
MADT: Interrupt override: source 9, irq 9
ioapic0: intpin 9 trigger: level
ioapic0 <Version 2.0> irqs 0-23 on motherboard
lapic: Divisor 2, Frequency 166253561 Hz
cpu0 BSP:
     ID: 0x00000000   VER: 0x00050014 LDR: 0x00000000 DFR: 0xffffffff
  lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff
  timer: 0x000100ef therm: 0x00010000 err: 0x000000f0 pmc: 0x00010400
SMP: AP CPU #2 Launched!
cpu2 AP:
     ID: 0x02000000   VER: 0x00050014 LDR: 0x00000000 DFR: 0xffffffff
  lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff
  timer: 0x000100ef therm: 0x00010000 err: 0x000000f0 pmc: 0x00010400
SMP: AP CPU #3 Launched!
cpu3 AP:
     ID: 0x03000000   VER: 0x00050014 LDR: 0x00000000 DFR: 0xffffffff
  lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff
  timer: 0x000100ef therm: 0x00010000 err: 0x000000f0 pmc: 0x00010400
SMP: AP CPU #1 Launched!
cpu1 AP:
     ID: 0x01000000   VER: 0x00050014 LDR: 0x00000000 DFR: 0xffffffff
  lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff
  timer: 0x000100ef therm: 0x00010000 err: 0x000000f0 pmc: 0x00010400
TSC timecounter disables C2 and C3.
SMP: passed TSC synchronization test
TSC timecounter discards lower 1 bit(s)
Timecounter "TSC-low" frequency 1330028396 Hz quality 1000
wlan: <802.11 Link Layer>
random: entropy device external interface
snd_unit_init() u=0x00ff8000 [512] d=0x00007c00 [32] c=0x000003ff [1024]
feeder_register: snd_unit=-1 snd_maxautovchans=16 latency=5 feeder_rate_min=1 feeder_rate_max=2016000 feeder_rate_round=25
EFI systbl not available
[ath_hal] loaded
nfslock: pseudo-device
crypto: <crypto core>
tcp_log: tcp_log device
module_register_init: MOD_LOAD (vesa, 0xffffffff81119d50, 0) error 19
io: <I/O>
kbd: new array size 4
kbd1 at kbdmux0
mem: <memory>
netmap: loaded module
null: <full device, null device, zero device>
hpt27xx: RocketRAID 27xx controller driver v1.2.8
hptnr: R750/DC7280 controller driver v1.1.5
hptrr: RocketRAID 17xx/2xxx SATA controller driver v1.2
nexus0
vtvga0: <VT VGA driver> on motherboard
cryptosoft0: <software crypto> on motherboard
crypto: assign cryptosoft0 driver id 0, flags 0x6000000
crypto: cryptosoft0 registers alg 1 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 2 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 3 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 4 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 5 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 16 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 6 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 7 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 32 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 18 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 19 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 20 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 8 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 15 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 9 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 10 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 13 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 14 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 34 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 35 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 36 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 37 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 11 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 22 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 23 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 25 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 24 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 26 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 27 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 28 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 21 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 17 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 29 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 30 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 31 flags 0 maxoplen 0
crypto: cryptosoft0 registers alg 38 flags 0 maxoplen 0
acpi0: <060410 RSDT1553> on motherboard
ACPI: 2 ACPI AML tables successfully acquired and loaded
PCIe: Memory Mapped configuration base @ 0xe0000000
ioapic0: routing intpin 9 (ISA IRQ 9) to lapic 0 vector 48
acpi0: Power Button (fixed)
acpi0: wakeup code va 0xfffffe00025ff000 pa 0x11000
Comment 6 johan van Zanten 2019-01-19 22:15:14 UTC
I've been doing a little more debugging, per https://docs.freebsd.org/doc/5.5-RELEASE/usr/share/doc/handbook/acpi-debug.html

As i can still booting from the 11.2 kernel i was previously running, i've collected ASL with "acpidump -t -d" . I get this in stderr  when running it:
acpidump: RSDT entry 3 (sig OEMB) is corrupt

I will attached the acpidump output in case it's useful; there are two errors in it:

acpi-dump-kern11.2_userland12.0-prodhoun-20190119.asl   5307:                             PLD_Revision           = 0x1,
Error    6105 -                                               Invalid object type for reserved name ^  (_PLD: found Buffer, Package required)

acpi-dump-kern11.2_userland12.0-prodhoun-20190119.asl   5475:                             PLD_Revision           = 0x1,
Error    6105 -                                               Invalid object type for reserved name ^  (_PLD: found Buffer, Package required)
Comment 7 johan van Zanten 2019-01-19 22:17:03 UTC
Created attachment 201277 [details]
acpidump output from 11.2 kernel on machine that can't boot 12.0
Comment 8 johan van Zanten 2019-01-19 22:21:30 UTC
Per the ACPI debugging docs: https://docs.freebsd.org/doc/5.5-RELEASE/usr/share/doc/handbook/acpi-debug.html

I tried adding this to loader.conf:

hint.apic.0.disabled="1" 

...and the system panics with 12.0 as well as 11.2 kernels.  (The latter boots without that line in loader.conf.)

The traceback from the panic in the 12.0 kernel (i think it was the same in 11.2):

cryptosoft0: <software crypto> on motherboard
acpi0: <060410 RSDT1553> on motherboard
acpi0: could not allocate interrupt
ACPI Error: AE_ALREADY_EXISTS, Unable to install System Control Interrupt handler (20181003/evevent-260)
acpi0: Could not enable ACPI: AE_ALREADY_EXISTS
device_attach: acpi0 attach returned 6
panic: No usable event timer found!
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffffff82665b30
vpanic() at vpanic+0x1a3/frame 0xffffffff82665b90
panic() at panic+0x43/frame 0xffffffff82665bf0
cpu_initclocks_bsp() at cpu_initclocks_bsp+0x3dd/frame 0xffffffff82665c20
cpu_initclocks() at cpu_initclocks+0x15/frame 0xffffffff82665c40
initclocks() at initclocks+0x20/frame 0xffffffff82665c50
mi_startup() at mi_startup+0x118/frame 0xffffffff82665c70
btext() at btext+0x2c
KDB: enter: panic
[ thread pid 0 tid 100000 ]
Stopped at      kdb_enter+0x3b: movq    $0,kdb_why