Bug 270707 - Installer media doesn't boot on Thinkpad T14s Gen 3 (Ryzen 7 Pro 6850U)
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.2-RELEASE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: freebsd-acpi (Nobody)
URL:
Keywords: install, needs-qa
Depends on:
Blocks:
 
Reported: 2023-04-08 16:05 UTC by aixdroix_OSS
Modified: 2024-03-30 12:32 UTC

See Also:


Attachments
Boot process log, multi-user (287.34 KB, image/jpeg)
2023-04-08 16:05 UTC, aixdroix_OSS
acpidump T14 Gen4 AMD (1.80 KB, application/gzip)
2024-03-08 17:01 UTC, Matthias Lanter

Description aixdroix_OSS 2023-04-08 16:05:49 UTC
Created attachment 241356 [details]
Boot process log, multi-user

Filing this bug report as I was advised to do so on the IRC channel.

I tried booting the ThinkPad T14s Gen 3 (Ryzen 7 Pro 6850U) from 13.1, 13.2 and 14.0 USB installation media (I selected 13.2 for this report as it is the latest release version). The boot process gets stuck at the following line, while the fans continue to run:

```
acpi_acad0: <AC Adapter> on acpi0
```

See the attached picture for more detail. With the verbose option enabled, the last line I see is `AcpiOsExecute: task queue not started`. I tried the latest BIOS version from Lenovo, and an older one, to see whether the BIOS is the source of the problem, but I got the same result. To rule out a defective laptop, I obtained another T14s Gen 3 with a 6850U, and it displayed the same behavior. I also tried both on battery and on AC power; no difference.

I can test patches if needed.
Comment 1 Mina Galić freebsd_triage 2023-04-08 16:22:50 UTC
can you, as a starter, test 14.0-CURRENT?
that should give us more info, if this is something that's caught by invariants
Comment 2 aixdroix_OSS 2023-04-08 17:35:47 UTC
Tested `14.0-CURRENT-amd64-20230330` and `14.0-CURRENT-amd64-20230406`, I am getting the same result with same log output, freezes at the same step.
Comment 3 Mina Galić freebsd_triage 2023-04-08 17:44:10 UTC
can you see if a verbose boot gives more info?
Comment 4 Graham Perrin freebsd_committer freebsd_triage 2023-04-08 17:48:39 UTC
I assume that all tests are with the same computer. 

(In reply to aixdroix_OSS from comment #0)

Re: <https://forums.freebsd.org/threads/86570/> try updating the BIOS.
Comment 5 aixdroix_OSS 2023-04-08 17:52:02 UTC
Graham: no, I've tested with 2 different computers, as I wrote in the report (2 physically different computers with the same specs).
Comment 6 aixdroix_OSS 2023-04-08 17:53:00 UTC
Verbose boot gives the following output:

```
[...]
atkbdc: atkbdc0 already exists; skipping it
atrtc: atrtc0 already exists; skipping it
attimer: attimer0 already exists; skipping it
sc: sc0 already exists; skipping it
isa_probe_children: probing non-PnP devices
sc0 failed to probe on isa0
vga0 failed to probe on isa0
pcib0: allocated type 4 (0x3f0-0x3f5) for rid 0 of fdc0
pcib0: allocated type 4 (0x3f7-0x3f7) for rid 1 of fdc0
fdc0 failed to probe at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
ppc0: cannot reserve I/O port range
ppc0 failed to probe at irq 7 on isa0
pcib0: allocated type 4 (0x3f8-0x3f8) for rid 0 of uart0
uart0 failed to probe at port 0x3f8 irq 4 on isa0
pcib0: allocated type 4 (0x2f8-0x2f8) for rid 0 of uart1
AcpiOsExecute: task queue not started
```

and gets stuck/frozen.
Comment 7 Graham Perrin freebsd_committer freebsd_triage 2023-04-08 17:58:52 UTC
(In reply to aixdroix_OSS from comment #5)

Sorry, I overlooked those details. 

Side note: this Bugzilla does not recognise Markdown.
Comment 8 aixdroix_OSS 2023-04-08 19:29:09 UTC
(In reply to Graham Perrin from comment #7)

No worries, and thank you for letting me know.
Comment 9 Michael Dexter 2023-05-01 19:19:08 UTC
I am seeing this same issue on the same hardware with 13.1R, 13.2R, and for this testing, 14.0-CURRENT-amd64-20230427-60167184abd5:

1. Upgrading from BIOS 1.30 to R23ET65W 1.35 did not resolve it.

2. The last lines of the kernel messages until it stops:

...
psm0: model Generic PS/2 mouse, device ID 0
battery0: <ACPI Control Method Battery> on acpi0
acpi_acad0: <AC Adapter> on acpi0
<Block cursor, no blink, keyboard input ignored>

3. Highlights of messages during verbose boot:

...
pcib0: allocated type 3 (0xef800-0xeffff) for rid 0 of orm0
ahc_isa_identify 0: ioport 0xc00 alloc failed
...
ahc_isa_identify 14: ioport 0xec00 alloc failed
isa_probe_children: disabling PnP devices
...
<existing devices, skipping>
sc0 failed to probe on isa0
vga0 failed to probe on isa0
pcib0: allocated type 4 (0x3f0-0x3f5) for rid 0 of fdc0
pcib0: allocated type 4 (0x3f7-0x3f7) for rid 1 of fdc0
fdc0 failed to probe at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
ppc0: cannot reserve I/O port range
ppc0 failed to probe at irq 7 on isa0
pcib0: allocated type 4 (0x3f8-0x3f8) for rid 0 of uart0
uart0 failed to probe at port 0x3f8 irq 4 on isa0
pcib0: allocated type 4 (0x2f8-0x2f8) for rid 0 of uart1
<Block cursor, no blink, keyboard input ignored, no mention of acpi_acad0>

4. Disabling "CPU Power Management" in BIOS under Config: Power did not help
(A user suggested that BIOS changes on a Gen 1 might help but could not recall them)

5. Possibly related?

https://forums.lenovo.com/t5/ThinkPad-11e-Windows-13-E-and/ThinkPad-E485-E585-Firmware-bug-ACPI-IVRS-table/m-p/4191484

Thank you!
Comment 10 Michael Dexter 2023-05-01 20:37:07 UTC
Updates:

I should have mentioned that my first step was to disable Secure Boot to get anywhere with FreeBSD.

A suggestion on IRC from yuripv:

set debug.acpi.disabled="acad cmbat"

That now stops earlier at psm0: model Generic PS/2 mouse, device ID 0

set debug.acpi.disabled="cmbat"

Stops at the original acpi_acad0...

One tap of the space bar gives: AcpiOsExecute: task queue not started

set debug.acpi.disabled="acad" does not work, for what it's worth.
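
The combinations tried above can be listed mechanically so that none is missed. This is only a minimal Python sketch: `disable_commands` is a hypothetical helper name, and the only thing taken from this report is the `set debug.acpi.disabled="..."` loader syntax and the subsystem names `acad` and `cmbat`.

```python
from itertools import combinations


def disable_commands(names):
    """Yield a loader 'set' command for every non-empty combination of names."""
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            value = " ".join(combo)
            yield f'set debug.acpi.disabled="{value}"'


if __name__ == "__main__":
    # The two subsystem names tried in this comment.
    for command in disable_commands(["acad", "cmbat"]):
        print(command)
```

The number of combinations doubles with every added name, so this is practical only for a short list of suspects.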
Comment 11 Michael Dexter 2023-05-01 20:47:39 UTC
Trying FreeBSD 11.1 (or close)

...
psm0...
battery0: <ACPI Control Method Battery> on acpi0
acpi_acad0: <AC Adapter> on acpi0
amdsbwd0: <AMD FCH Rev 41h+ Watchdog Timer> at iomem 0xfed80b00-0xfed... on isa0
amdsbwd0: watchdog hardware is disabled
device_attach: amdsbwd0 attach returned 6
<Block cursor>
Comment 12 Michael Dexter 2023-05-01 21:04:33 UTC
set hint.acpi.0.disabled="1"

Results in:

panic: APIC: Could not find any APICs.
Comment 13 Michael Dexter 2023-05-01 21:23:29 UTC
set debug.acpi.disabled="all"

Gets to mountroot> with no devices available.

Disabling of all of these stops before battery0 (psm0):

"acad button cmbat cpu ec lid mwait quirks thermal timer video"
Comment 14 Michael Dexter 2023-05-01 22:23:40 UTC
As per suggestions and acpi(4):

set debug.acpi.layer="ACPI_ALL_DRIVERS ACPI_LV_ALL_EXCEPTIONS"
set debug.acpi.layer="ACPI_ALL_COMPONENTS ACPI_ALL_DRIVERS"

The results are the same in normal and verbose mode.
Comment 15 Michael Dexter 2023-05-01 23:43:10 UTC
As per another suggestion:

set debug.acpi.enable_debug_objects="1"

Same behavior, but it appears to require ACPI_DEBUG in the kernel as per:

https://people.freebsd.org/~blackend/en_US.ISO8859-1/books/handbook/ACPI-debug.html
Comment 16 Michael Dexter 2023-05-02 06:13:00 UTC
Building a kernel with 'options ACPI_DEBUG' and setting debug.acpi.level="ACPI_LV_ALL_EXCEPTIONS"

Results in a boot that ends with:

...
 exregion-059 ExSystemMemorySpaceHan: System-Memory (width 8) R/W 0 Address=00000000777770F3
...
 nsxfeval-0386 EvaluateObject     : Null handle with relative pathname [_PRW] nsxfeval-0386 EvaluateObject...
Comment 17 Michael Dexter 2023-05-02 22:16:45 UTC
(In reply to Michael Dexter from comment #14)
Correction:

debug.acpi.layer="ACPI_ALL_COMPONENTS ACPI_ALL_DRIVERS"
debug.acpi.level="ACPI_LV_ALL_EXCEPTIONS"
Comment 18 Michael Dexter 2023-05-03 22:23:01 UTC
Suggestion of the day:

set debug.acpi.disabled="thermal"

Result: Same behavior

Broad suggestion requiring parameters: set debug.acpi.avoid=""

Related, the Handbook page may be out of date with regards to debugging on 13.*/14-CURRENT:

https://people.freebsd.org/~blackend/en_US.ISO8859-1/books/handbook/ACPI-debug.html
Comment 19 Michael Dexter 2023-05-03 22:53:26 UTC
Published Handbook link (same issues):

https://people.freebsd.org/~blackend/en_US.ISO8859-1/books/handbook/ACPI-debug.html
Comment 20 Graham Perrin freebsd_committer freebsd_triage 2023-05-04 00:08:40 UTC
(In reply to Michael Dexter from comment #18)

> … the Handbook page may be out of date with regards to debugging on 13.*/14-CURRENT:
> 
> https://people.freebsd.org/~blackend/en_US.ISO8859-1/books/handbook/ACPI-debug.html

Re: bug 268980, was that 2013 edition of the book found through a search engine and if so, can you recall which engine?

In the current edition of the book: 

<https://docs.freebsd.org/en/books/handbook/book/#ACPI-submitdebug>
Comment 21 Michael Dexter 2023-05-04 18:52:20 UTC
(In reply to Graham Perrin from comment #20)
Google was the search engine but the page mentions acpi.ko either way, which appears to have been replaced with acpi_*.ko
Comment 22 Yuri Pankov freebsd_committer freebsd_triage 2023-05-04 19:24:44 UTC
(In reply to Michael Dexter from comment #21)
acpi.ko wasn't replaced by anything; ACPI is now only supported when compiled into the kernel, where "now" means roughly since 2010 :)
Comment 23 Michael Dexter 2023-05-04 20:41:31 UTC
(In reply to Yuri Pankov from comment #22)
The documentation may confuse people when they look for the mentioned "acpi.ko".

That said, do you have any other ideas to try to make this hardware work?
Comment 24 mfwre 2023-05-07 15:38:22 UTC
I just wanted to add that I’m having the same issue.

Lenovo P14s Gen 3 Ryzen 7
Comment 25 Hannes Hauswedell 2023-11-29 16:03:52 UTC
Any news on this? Is it worth trying FreeBSD14 on this device?

In the good old days, Thinkpads used to work quite well with the BSDs...
Comment 26 Hannes Hauswedell 2023-12-11 13:03:59 UTC
14.0-RELEASE still gets stuck on 

```
acpi_acad0: <AC Adapter> on acpi0
```

:'(
Comment 27 Matthias Lanter 2024-01-19 15:42:36 UTC
After my T14 Gen 3 <https://bsd-hardware.info/?probe=0a2c02f944> stopped working after the BIOS update last November and Lenovo replaced the mainboard twice, I can now use it again for FreeBSD tests.

Even with yesterday's snapshot, the boot process hangs at the same point.

I get to the installer, but the built-in keyboard does not work with:
set hint.uart.1.disabled="1" 

On the boot screen the keyboard is working fine. No problems with Windows 10 and Debian 12.

I can only get further with a USB keyboard.

Additional information on the BIOS version:
Lenovo does not seem to have much luck with its BIOS versions. The replaced mainboard had v1.47. Version 1.49 was then temporarily available on January 8, but was also withdrawn. I was quicker, though, so that is the version now running.

I am available for further tests.
Comment 28 Matthias Lanter 2024-01-19 16:32:40 UTC
When I do the installation using the USB keyboard and then restart, the internal keyboard either responds with a long delay or produces several characters per keystroke.
Comment 29 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-03-07 18:19:37 UTC
Assign to ACPI.
Comment 30 John Baldwin freebsd_committer freebsd_triage 2024-03-08 01:06:33 UTC
Generally speaking, the last line of dmesg is not always a good indicator of the source of a hang once the kernel has finished its attach of devices.  In particular, most sysinits aside from probing devices do not generate any output on the console, so once the kernel has finished its attach of devices, the last line will just be the last device probed until single user starts.

One option that can work is to set 'debug.verbose_sysinit=1' from the loader.  This will generate a lot of output, but if the kernel hangs during boot it will print out the last SYSINIT function the kernel called before the hang.

It's not clear from the various followups though if this is one bug or many, or if disabling uart1 works only for some cases but not others?  Also, it's not clear if the internal keyboard not working is true for everyone, or only some folks.
Comment 31 Matthias Lanter 2024-03-08 16:39:04 UTC
Last lines with FreeBSD-15.0-CURRENT-amd64-20240307-8c94ed992702-268691 and 
'debug.verbose_sysinit=1':

psm: status 3c 03 01
psm: status 3c 03 01
psm: status 3c 03 01
psm: data 08 00 00
psm: status 00 00 00
psm: status 3c 03 01
psm: status 10 00 64
psm: status 00 02 64
psm: status 00 02 64
psm0: <PS/2 Mouse> irq 12 on atkbdc0
ioapic0: routing intpin 12 (ISA IRQ 12) to lapic 0 vector 55
psm0: [GIANT_LOCKED]
WARNING: Device "psm" is Giant locked and may be deleted before FreeBSD 15.0.
psm0: model Generic PS/2 mouse, device ID 0-00, 3 buttons
psm0: config:00000000, flags:00000008, packet size:3
psm0: syncmask:c0, syncbits:00
battery0: <ACPI Control Method Battery> on acpi0
AcpiOsExecute: task queue not started
acpi_acad0: <AC Adapter> on acpi0
AcpiOsExecute: task queue not started
ahc_isa_identify 0: ioport 0xc00 alloc failed
ahc_isa_identify 1: ioport 0x1c00 alloc failed
ahc_isa_identify 2: ioport 0x2c00 alloc failed
isa_probe_children: disabling PnP devices
atkbdc: atkbdc0 already exists; skipping it
atrtc: atrtc0 already exists; skipping it
attimer: attimer0 already exists; skipping it
sc: sc0 already exists; skipping it
isa_probe_children: probing non-PnP devices
sc0 failed to probe on isa0
vga0 failed to probe on isa0
fdc0 failed to probe at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
ppc0: cannot reserve I/O port range
ppc0 failed to probe at irq 7 on isa0
uart0 failed to probe at port 0x3f8 irq 4 on isa0
AcpiOsExecute: task queue not started
Comment 32 Matthias Lanter 2024-03-08 17:01:00 UTC
Created attachment 249033 [details]
acpidump T14 Gen4 AMD

Test with FreeBSD-13.3-STABLE-amd64-20240229 and 'hint.uart.1.disabled=1':
The internal keyboard is working with ~3 seconds delay, but only a single stroke.

With FreeBSD-15.0-CURRENT-amd64-20240307 and 'hint.uart.1.disabled=1':
The internal keyboard doesn't work. Same as in comment 28.

root@mar:~ # acpidump -dt | gzip -c9 > dumpT14Gen3.gz
iasl exit status = 139
Comment 33 Matthias Lanter 2024-03-08 20:56:45 UTC
(In reply to Matthias Lanter from comment #32)
Correction:
acpidump T14 Gen3 AMD, not Gen4
Comment 34 Matthias Lanter 2024-03-11 09:51:30 UTC
For a test I removed '/boot/device.hints' on the installation media and so it starts to the installer.

Unfortunately, the integrated keyboard still doesn't work.
Comment 35 Matthias Lanter 2024-03-26 14:31:28 UTC
The computer starts when hint.uart.1.at is commented out in the device.hints as
described here:
https://bugs.freebsd.org/bugzilla//show_bug.cgi?id=276011#c8

Unfortunately, the internal and external keyboards do not work this way either.

Since there were also such problems with the keyboard and trackpad under Linux and these have now been resolved, I looked into this a little:
https://bugzilla.kernel.org/show_bug.cgi?id=216804#c18

The comment in the code: "IRQ override isn't needed on modern AMD Zen systems and
this override breaks active low IRQs on AMD Ryzen 6000 and newer systems. Skip it."

Could it be that it has something to do with that? Is there anything similar in FreeBSD?
Comment 36 Matthias Lanter 2024-03-27 16:21:08 UTC
Maybe this case should be split up.

The reference in comment 35 and further research led me to this commit:
https://cgit.freebsd.org/src/commit/?id=9a7bf07ccdc1c7d5e6b514067a5d4175cae9d56e

As can be seen in acpidump, an INT override is included:

  Type=INT Override
  BUS=0
  IRQ=1
  INTR=1
  Flags={Polarity=active-lo, Trigger=edge}

After I commented out lines 146 and 147 and recompiled the kernel, the internal keyboard works without any problems.

Shouldn't this ACPI entry be given more weight than faulty old BIOS versions, especially with the newer AMD Ryzen?
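
To compare acpidump output across machines, the check for this entry can be automated. The following is a minimal Python sketch, assuming the entry is printed as `Key=Value` lines exactly as in the excerpt above. The names `parse_int_overrides` and `irq_is_active_low_edge` are hypothetical helpers and not part of acpidump.

```python
import re

# Sample copied from the acpidump excerpt in this comment.
SAMPLE = """\
  Type=INT Override
  BUS=0
  IRQ=1
  INTR=1
  Flags={Polarity=active-lo, Trigger=edge}
"""

FLAGS_RE = re.compile(r"Polarity=([\w-]+),\s*Trigger=([\w-]+)")


def parse_int_overrides(text):
    """Return one dict per 'Type=INT Override' entry found in text."""
    entries = []
    current = None  # dict being filled, or None outside an INT Override entry
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Type="):
            # Every Type= line starts a new entry; keep only INT Override ones.
            current = {} if line == "Type=INT Override" else None
            if current is not None:
                entries.append(current)
        elif current is not None and line.startswith("Flags="):
            match = FLAGS_RE.search(line)
            if match:
                current["polarity"], current["trigger"] = match.groups()
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            if value.isdigit():
                current[key.lower()] = int(value)
    return entries


def irq_is_active_low_edge(entries, irq=1):
    """True if any override for the given IRQ is active-low and edge-triggered."""
    return any(
        e.get("irq") == irq
        and e.get("polarity") == "active-lo"
        and e.get("trigger") == "edge"
        for e in entries
    )


if __name__ == "__main__":
    print(irq_is_active_low_edge(parse_int_overrides(SAMPLE)))
```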
Comment 37 John Baldwin freebsd_committer freebsd_triage 2024-03-27 16:51:38 UTC
Humm, bizarre.  An active-low edge triggered interrupt doesn't make much sense, but we could make the quirk conditional on some SMBIOS strings or the like.  Good sleuthing though on figuring out the cause.  I'll have to think about how to structure the patch.  (In particular I'll have to dig in my e-mail to see if I can find any more detail about the original machine to see if I can add a quirk for it and default to trusting MADT/DSDT entries.)
Comment 38 Matthias Lanter 2024-03-30 12:32:43 UTC
I also don't know the reason why AMD does this with the Ryzen 6000 and newer.

I currently have three different AMD-based notebooks at my disposal.

Lenovo Yoga Slim 7 Pro 14ACH5 with Ryzen 7 5800H:
https://bsd-hardware.info/?probe=020e17c2f8

Lenovo ThinkPad T14 Gen 3 21CF002UMZ with Ryzen 7 Pro 6850U:
https://bsd-hardware.info/?probe=0a2c02f944

Lenovo ThinkPad T14 Gen 4 21K3CTO1WW with Ryzen 7 Pro 7840U (no Probe)

According to acpidump, the newer two have an IRQ 1 with active-low, the older one does not.

Since we do not know exactly how many systems still need the old patch, this is unfortunately not easy to solve.

I would introduce a parameter that enables the previous behavior, e.g.: hint.acpi.force_irq_active-high="YES"

Of course, this would have to be mentioned for existing systems before an update. 

This way, no special parameter has to be set for new installations, and the keyboard works in the installer as well as in the boot menu.
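
The proposed default can be modelled in a few lines. This is a toy Python sketch and not the FreeBSD kernel code: the function `programmed_polarity` and its `force_active_high` argument are hypothetical, and it assumes that the existing quirk turns an active-low, edge-triggered IRQ 1 entry into active-high.

```python
def programmed_polarity(irq, madt_polarity, madt_trigger, force_active_high=False):
    """Return the polarity this toy model would program for an ISA IRQ.

    force_active_high stands in for the proposed opt-in parameter. When it is
    off, the MADT entry is trusted as-is.
    """
    # Assumption: the old quirk only concerns IRQ 1 with active-low/edge flags.
    quirk_applies = (
        irq == 1 and madt_polarity == "active-lo" and madt_trigger == "edge"
    )
    if force_active_high and quirk_applies:
        return "active-hi"
    return madt_polarity


if __name__ == "__main__":
    # Default: trust the firmware, as the Ryzen 6000 and newer machines need.
    print(programmed_polarity(1, "active-lo", "edge"))
    # Opt-in: previous behavior for machines that depend on the old quirk.
    print(programmed_polarity(1, "active-lo", "edge", force_active_high=True))
```

With this default, only the systems that relied on the previous behavior would have to set the parameter.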