Bug 256915 - vmd: Crash on boot after ddfc9c4c59e2ea4871100d8c076adffe3af8ff21
Summary: vmd: Crash on boot after ddfc9c4c59e2ea4871100d8c076adffe3af8ff21
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Some People
Assignee: Neel Chauhan
URL: https://reviews.freebsd.org/D31071
Keywords: regression
Depends on:
Blocks:
 
Reported: 2021-07-01 00:36 UTC by Neel Chauhan
Modified: 2021-09-12 23:03 UTC

See Also:


Attachments
Kernel panic log (picture) (380.48 KB, image/jpeg)
2021-07-01 00:48 UTC, Neel Chauhan
Full kernel panic log (picture) (246.70 KB, image/jpeg)
2021-07-01 05:37 UTC, Neel Chauhan
Full kernel panic log (picture) - Includes pre-stack ASSERTS (411.69 KB, image/jpeg)
2021-07-03 03:26 UTC, Neel Chauhan

Description Neel Chauhan freebsd_committer 2021-07-01 00:36:13 UTC
I believe that since commit ddfc9c4c59e2ea4871100d8c076adffe3af8ff21, systems using vmd(4) crash while booting. This happens on TigerLake systems, especially ones like the HP Spectre x360 13t-aw200, where vmd(4) is required to access the NVMe drive.

I have attached a picture of the stack trace. Sorry that I couldn't get a text log: the keyboard wasn't working when it panicked.
Comment 1 Warner Losh freebsd_committer 2021-07-01 00:41:20 UTC
(In reply to Neel Chauhan from comment #0)

I think something went wrong with the attachment, since I'm not seeing any... Can you double check?
Comment 2 Neel Chauhan freebsd_committer 2021-07-01 00:48:43 UTC
Created attachment 226144
Kernel panic log (picture)

Sorry that I couldn't attach it earlier. I thought I did, but it seems the picture was not posted because it was too large. I resized the photo to make it smaller.
Comment 3 Warner Losh freebsd_committer 2021-07-01 03:37:14 UTC
Can you page up to get the top of the stack and the panic?
Comment 4 Mark Linimon freebsd_committer freebsd_triage 2021-07-01 04:21:08 UTC
^Triage: over to committer of:

newbus: Move from bus_child_{pnpinfo,location}_src to bus_child_{pnpinfo,location} with sbuf (as cited)
Comment 5 Neel Chauhan freebsd_committer 2021-07-01 05:37:12 UTC
Created attachment 226149
Full kernel panic log (picture)
Comment 6 Neel Chauhan freebsd_committer 2021-07-03 03:26:18 UTC
Created attachment 226185
Full kernel panic log (picture) - Includes pre-stack ASSERTS

Here is a picture with the pre-stack ASSERTS as well. It is blurry since I took this on the plane.
Comment 7 Warner Losh freebsd_committer 2021-07-03 07:18:24 UTC
(In reply to Neel Chauhan from comment #6)
OK. I can read this well enough, but it makes no sense to me...

Maybe it's some weird corruption when we're queueing devd events...
Can you see if it goes away if you set "hw.bus.devctl_queue=0" in your
loader.conf file?
Comment 8 Neel Chauhan freebsd_committer 2021-07-03 17:11:54 UTC
I still get the same error.
Comment 9 Warner Losh freebsd_committer 2021-07-04 01:40:05 UTC
(In reply to Neel Chauhan from comment #8)
Thanks for trying that. I have a Kaby Lake system here I'll test to see
if I see the problem there. I should know later this evening.

The only other thing I can think of is to try w/o vmd, and that's not going to get past mountroot().
Comment 10 Neel Chauhan freebsd_committer 2021-07-04 02:29:24 UTC
Sadly, vmd is forced on certain TigerLake laptops like the HP Spectre x360 13t-aw200. I am unable to disable it.

I will try updating the UEFI to see if I get an option for this.

I don't believe vmd is used on older Intel-based laptops. I also have a WhiskeyLake-based Spectre x360, and a KabyLake-R-based ThinkPad from work that dual-boots Win10 and FreeBSD; neither of them uses vmd.

I am just using an older kernel for now.
Comment 11 Neel Chauhan freebsd_committer 2021-07-04 20:25:26 UTC
I did some tests after a `git revert` back to before ddfc9c4c59e2ea4871100d8c076adffe3af8ff21, and for some reason it still panicked. An earlier commit may be responsible for this crash.
Comment 12 Neel Chauhan freebsd_committer 2021-07-05 03:00:34 UTC
After doing a `git bisect` as suggested on the current@ mailing list, it seems that it may have something to do with LLVM. Sadly, I am no expert in compilers.

Compiling the same "good" commit with the latest LLVM gives me the same error.
Comment 13 Neel Chauhan freebsd_committer 2021-07-05 15:56:22 UTC
Interestingly, 13.0-STABLE does not suffer from this error.
Comment 14 Neel Chauhan freebsd_committer 2021-07-06 01:21:53 UTC
Good news. I figured out how to get this to work and not panic.

When we are probing for PCI buses, we apparently assert only for "pci", when we also need to accept "vmd_bus".

Phabricator: https://reviews.freebsd.org/D31071
Comment 15 Neel Chauhan freebsd_committer 2021-07-06 02:11:12 UTC
Taking this bug since I found the solution.

imp@, feel free to take it back if appropriate.
Comment 16 commit-hook freebsd_committer 2021-07-16 02:27:13 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=ad1f608fb2f529baf028384bbe7e8fbbff5cbe23

commit ad1f608fb2f529baf028384bbe7e8fbbff5cbe23
Author:     Neel Chauhan <nc@FreeBSD.org>
AuthorDate: 2021-07-16 02:03:05 +0000
Commit:     Neel Chauhan <nc@FreeBSD.org>
CommitDate: 2021-07-16 02:26:20 +0000

    vmd: Rename vmd_bus class to pci

    This fixes a kernel panic when probing for vmd_bus on Intel TigerLake on
    14-CURRENT. Apparently, vmd_bus is a type of PCI bus, but was registered
    as a separate device class.

    PR:                     256915
    Reviewed by:            imp
    Differential Revision:  https://reviews.freebsd.org/D31071

 sys/dev/vmd/vmd_bus.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
Comment 17 Neel Chauhan freebsd_committer 2021-07-16 02:28:05 UTC
Yay! I got it working and committed.

The fix is slightly different from my original approach: vmd_bus is a type of PCI bus, but wasn't registered as one.
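
For readers following along, here is a minimal toy sketch of the two approaches discussed in this bug. It is not the kernel's newbus API and not the code changed in sys/dev/vmd/vmd_bus.c; every type and function name below (toy_bus, toy_is_pci_strict, toy_is_pci_relaxed) is hypothetical and exists only to illustrate why a check for the class name "pci" rejects a bus registered as "vmd_bus", and how either relaxing the check or renaming the class avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Toy model only: these types and functions are hypothetical and are not
 * the kernel's newbus API or the code touched by the commit.
 */
struct toy_bus {
	const char *devclass;	/* class name the bus driver registered under */
};

/* Models a check that only recognizes buses whose class is "pci". */
static bool
toy_is_pci_strict(const struct toy_bus *bus)
{
	return (strcmp(bus->devclass, "pci") == 0);
}

/* Models the original idea from comment 14: also accept "vmd_bus". */
static bool
toy_is_pci_relaxed(const struct toy_bus *bus)
{
	return (toy_is_pci_strict(bus) ||
	    strcmp(bus->devclass, "vmd_bus") == 0);
}

/* A VMD child bus as registered before and after the class rename. */
static const struct toy_bus toy_vmd_before = { "vmd_bus" };
static const struct toy_bus toy_vmd_after = { "pci" };
```

In this model, toy_is_pci_strict() rejects toy_vmd_before but accepts toy_vmd_after, which corresponds to the committed approach of registering the bus under the "pci" class so that existing checks need no change. toy_is_pci_relaxed() accepts both, which corresponds to the original approach of teaching the check about "vmd_bus".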
Comment 18 commit-hook freebsd_committer 2021-09-12 23:03:36 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=75547acf1cd5a5d0a495b32c8113f8311f0713bd

commit 75547acf1cd5a5d0a495b32c8113f8311f0713bd
Author:     Neel Chauhan <nc@FreeBSD.org>
AuthorDate: 2021-07-16 02:03:05 +0000
Commit:     Alexander Motin <mav@FreeBSD.org>
CommitDate: 2021-09-12 22:44:12 +0000

    vmd: Rename vmd_bus class to pci

    This fixes a kernel panic when probing for vmd_bus on Intel TigerLake on
    14-CURRENT. Apparently, vmd_bus is a type of PCI bus, but was registered
    as a separate device class.

    PR:                     256915
    Reviewed by:            imp
    Differential Revision:  https://reviews.freebsd.org/D31071

    (cherry picked from commit ad1f608fb2f529baf028384bbe7e8fbbff5cbe23)

 sys/dev/vmd/vmd_bus.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)