Bug 260896 - The 14-CURRENT system hangs if kern.vt.splash_cpu is set to 1 in loader.conf
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: amd64 Any
Importance: --- Affects Many People
Assignee: Mark Johnston
Reported: 2022-01-03 01:36 UTC by Oleg
Modified: 2022-01-28 15:38 UTC

Description Oleg 2022-01-03 01:36:56 UTC
On 14-CURRENT, the system hangs if kern.vt.splash_cpu is set to 1 in loader.conf. The BSD logos appear on the screen, but the system eventually hangs. In single-user mode, the system crashes completely a few minutes after you log in, and the only way to recover is to hold the power button until the computer shuts down and then power it on again.
This problem doesn't exist on 13-STABLE.
Comment 1 Mark Johnston freebsd_committer freebsd_triage 2022-01-05 18:15:54 UTC
I can't reproduce any problems like this on my laptop running a fresh kernel.

Where during boot does the hang occur?
Comment 2 Oleg 2022-01-05 19:36:59 UTC
I just compiled the latest kernel from the main branch and the problem is still there. If I boot into single-user mode, I am able to log in, but the system becomes fully unresponsive a few seconds later (I said "a few minutes later" in my first post, but that was incorrect).
As far as I know, single-user mode ignores /etc/rc.conf completely, so modules such as i915kms.ko never get loaded.
Comment 3 Oleg 2022-01-06 19:45:14 UTC
This issue occurs on both of my computers.
Comment 4 Mark Johnston freebsd_committer freebsd_triage 2022-01-13 00:26:58 UTC
I can reproduce the issue now (laptop resets during boot shortly after starting userspace).  Seems to only occur when vt is using the efifb backend.
Comment 5 Mark Johnston freebsd_committer freebsd_triage 2022-01-19 00:16:50 UTC
https://reviews.freebsd.org/D33932

This bug has been there forever. I suspect the reason there's no reset on stable/13 is that GENERIC there doesn't have assertion checking enabled.
Comment 6 Oleg 2022-01-19 12:43:54 UTC
But even if I apply the patch that you created, the system will still crash if the splash logos are still on the screen once it's time for i915kms to take over.
Comment 7 Mark Johnston freebsd_committer freebsd_triage 2022-01-19 15:28:11 UTC
(In reply to Oleg from comment #6)
Seems so, I guess there's a second bug.  I have no problems when testing with a GENERIC-NODEBUG kernel.
Comment 8 Oleg 2022-01-19 15:33:18 UTC
(In reply to Mark Johnston from comment #7)
What do you mean? Both GENERIC and GENERIC-NODEBUG kernels will crash if i915kms attempts to take over while the splash logos are still on the screen.
Comment 9 Mark Johnston freebsd_committer freebsd_triage 2022-01-19 15:45:54 UTC
(In reply to Oleg from comment #8)
I'm not able to reproduce any problems after updating my kernel (main branch) and drm-kmod (master branch) sources to the latest revision, and after applying the vt patch.  Maybe there was some recent bug fix in drm-kmod that fixes this.

Which revisions are you testing exactly?
Comment 10 commit-hook freebsd_committer freebsd_triage 2022-01-19 15:55:37 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=6c7e4d72b1c964e4147831b45e0b312f6ed97cd2

commit 6c7e4d72b1c964e4147831b45e0b312f6ed97cd2
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2022-01-19 14:48:31 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2022-01-19 15:53:15 +0000

    vt: Use a taskqueue to clear splash_cpu logos

    vt_fini_logos() calls vtbuf_grow(), which reallocates the console
    window's buffer using malloc(M_WAITOK).  Because vt_fini_logos() is
    called via a callout, we end up panicking if INVARIANTS is enabled.

    Fix the problem simply by clearing the logos using a timed taskqueue.
    taskqueue_thread is formally allowed to sleep; of course, if we actually
    end up sleeping to satisfy the allocation, then we have bigger problems.

    PR:             260896
    Reviewed by:    emaste
    MFC after:      2 weeks
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D33932

 sys/dev/vt/vt_cpulogos.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)
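
The reasoning in the commit message above can be modelled in userspace. The following is only a rough sketch and not the vt(4) code: every name in it (in_callout, alloc_waitok, fini_logos, logo_task, callout_direct, callout_deferred, run_pending_task) is hypothetical, a NULL return stands in for the INVARIANTS panic, and a single-slot flag stands in for the kernel's timed taskqueue. It shows why a sleepable allocation fails when made directly from the callout handler, and why it succeeds once the handler only schedules the work for a context that may sleep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* All names are hypothetical; this is a userspace model, not kernel code. */

static bool in_callout;        /* true while a simulated callout handler runs */
static int  grow_result = -1;  /* -1: not attempted, 0: refused, 1: succeeded */

/*
 * Models malloc(M_WAITOK): the call may sleep, so it must not be made from
 * callout context.  Returning NULL stands in for the INVARIANTS panic.
 */
static void *
alloc_waitok(size_t size)
{
	if (in_callout)
		return (NULL);
	return (malloc(size));
}

/* Models the logo teardown, which needs a sleepable allocation. */
static void
fini_logos(void *arg)
{
	void *buf;

	(void)arg;
	buf = alloc_waitok(4096);
	grow_result = (buf != NULL);
	free(buf);
}

/* Single-slot stand-in for a task queued to a worker thread. */
struct task {
	void (*fn)(void *);
	void *arg;
	bool  pending;
};

static struct task logo_task = { fini_logos, NULL, false };

/* Old behaviour: the callout handler does the teardown itself. */
static void
callout_direct(void)
{
	in_callout = true;
	fini_logos(NULL);
	in_callout = false;
}

/* New behaviour: the handler only marks the task as pending. */
static void
callout_deferred(void)
{
	in_callout = true;
	logo_task.pending = true;
	in_callout = false;
}

/* Worker context, where sleeping is allowed.  Returns the number of tasks run. */
static int
run_pending_task(void)
{
	if (!logo_task.pending)
		return (0);
	logo_task.pending = false;
	logo_task.fn(logo_task.arg);
	return (1);
}
```

Calling callout_direct() leaves grow_result at 0 (the allocation was refused), while callout_deferred() followed by run_pending_task() sets it to 1. The real fix relies on the kernel's own taskqueue facilities rather than a flag like this.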
Comment 11 Oleg 2022-01-19 16:28:19 UTC
Well, the system won't crash if I compile /usr/ports/graphics/drm-kmod, but it will crash if I compile either /usr/ports/graphics/drm-devel-kmod or /usr/ports/graphics/drm-current-kmod. This will happen with the latest kernel revision from the main branch (with your patch applied).
Comment 12 Oleg 2022-01-19 17:19:33 UTC
Okay, I guess it's not entirely correct. With drm-devel-kmod, the latest kernel from the main branch will crash if the splash logos stay on the screen for too long, but this won't happen with drm-current-kmod.
Comment 13 Mark Johnston freebsd_committer freebsd_triage 2022-01-19 18:12:56 UTC
(In reply to Oleg from comment #12)
The problem there seems to be that the LinuxKPI used for i915kms temporarily registers a dummy framebuffer, but the dummy driver fails to implement vd_setpixel, used for rendering the logo.  So there is a window where we can call a null function pointer.

This dummy framebuffer went away in https://github.com/freebsd/drm-kmod/commit/8ef0897aa92790a023f9b108753e834d59b5ffde which is why I don't see any problems.  As a workaround for older drm-kmod branches I guess we can add a dummy vd_setpixel implementation.
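
As an illustration of the window described above, here is a small userspace sketch. It is not the vt(4) or drm-kmod code: struct fb_driver, draw_logo, and the three driver instances are hypothetical, and only the vd_setpixel member name is taken from this report. The explicit NULL check exists so the example can report the problem safely; per the description above, the affected code path would instead call through the null pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified driver ops table. */
typedef void (*setpixel_fn)(int x, int y, unsigned int color);

struct fb_driver {
	const char  *name;
	setpixel_fn  vd_setpixel;  /* left NULL by an incomplete driver */
};

static int pixels_drawn;

/* A driver that really draws (here it only counts the calls). */
static void
real_setpixel(int x, int y, unsigned int color)
{
	(void)x; (void)y; (void)color;
	pixels_drawn++;
}

/* The workaround: a stub that accepts the call and does nothing. */
static void
noop_setpixel(int x, int y, unsigned int color)
{
	(void)x; (void)y; (void)color;
}

static const struct fb_driver incomplete_dummy = { "dummy", NULL };
static const struct fb_driver patched_dummy = { "dummy+stub", noop_setpixel };
static const struct fb_driver real_fb = { "real", real_setpixel };

/*
 * Renders a width x height logo one pixel at a time.  Returns the number of
 * vd_setpixel calls made, or -1 if the driver does not provide the callback.
 */
static int
draw_logo(const struct fb_driver *drv, int width, int height)
{
	int calls = 0;

	if (drv->vd_setpixel == NULL)
		return (-1);
	for (int y = 0; y < height; y++) {
		for (int x = 0; x < width; x++) {
			drv->vd_setpixel(x, y, 0xffffffu);
			calls++;
		}
	}
	return (calls);
}
```

With these definitions, draw_logo(&incomplete_dummy, 4, 2) reports the missing callback, while the same call on patched_dummy completes all eight calls without drawing anything. That is the trade-off of a stub: the logo is silently skipped for as long as the dummy driver is registered, but nothing dereferences a null pointer.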
Comment 14 Oleg 2022-01-19 18:40:56 UTC
This is confusing because in ports, drm-devel-kmod's version is 5.5.19 and drm-current-kmod's version is 5.4.144, yet drm-devel-kmod is the one that causes issues in this particular case.
Comment 15 Mark Johnston freebsd_committer freebsd_triage 2022-01-19 19:07:38 UTC
(In reply to Oleg from comment #14)
Could be that the bug can be triggered in both branches, but some timing difference makes it more likely in 5.5.
Comment 17 commit-hook freebsd_committer freebsd_triage 2022-01-28 10:06:45 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=9ba2b41e3d2f582c03b42067ae60d00490e4d3ea

commit 9ba2b41e3d2f582c03b42067ae60d00490e4d3ea
Author:     Emmanuel Vadot <manu@FreeBSD.org>
AuthorDate: 2022-01-28 10:05:20 +0000
Commit:     Emmanuel Vadot <manu@FreeBSD.org>
CommitDate: 2022-01-28 10:05:40 +0000

    graphics/drm-fbsd13-kmod: Update to 5.4.144.g20220128

    - Fix a potential panic when kern.vt.splash_cpu is set [1]
    - Do not depend on debugfs if not compiled with support

    PR:     260896 [1]
    MFH:    2022Q1

    Sponsored by:   Beckhoff Automation GmbH & Co. KG

 graphics/drm-fbsd13-kmod/Makefile | 4 ++--
 graphics/drm-fbsd13-kmod/distinfo | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)
Comment 18 commit-hook freebsd_committer freebsd_triage 2022-01-28 10:06:46 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=22f113cd6f87912c605da00939f3cce38c298bb9

commit 22f113cd6f87912c605da00939f3cce38c298bb9
Author:     Emmanuel Vadot <manu@FreeBSD.org>
AuthorDate: 2022-01-28 09:59:16 +0000
Commit:     Emmanuel Vadot <manu@FreeBSD.org>
CommitDate: 2022-01-28 09:59:16 +0000

    graphics/drm-current-kmod: Update to 5.4.144.g20220128

    - Fix a potential panic when kern.vt.splash_cpu is set [1]
    - Do not depend on debugfs if not compiled with support

    PR:     260896 [1]

    Sponsored by:   Beckhoff Automation GmbH & Co. KG

 graphics/drm-current-kmod/Makefile | 4 ++--
 graphics/drm-current-kmod/distinfo | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)
Comment 19 commit-hook freebsd_committer freebsd_triage 2022-01-28 10:08:47 UTC
A commit in branch 2022Q1 references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=1c4de0fd2e801e1efe01be1f7f1d03bf5cf0f827

commit 1c4de0fd2e801e1efe01be1f7f1d03bf5cf0f827
Author:     Emmanuel Vadot <manu@FreeBSD.org>
AuthorDate: 2022-01-28 10:05:20 +0000
Commit:     Emmanuel Vadot <manu@FreeBSD.org>
CommitDate: 2022-01-28 10:07:54 +0000

    graphics/drm-fbsd13-kmod: Update to 5.4.144.g20220128

    - Fix a potential panic when kern.vt.splash_cpu is set [1]
    - Do not depend on debugfs if not compiled with support

    PR:     260896 [1]
    MFH:    2022Q1

    Sponsored by:   Beckhoff Automation GmbH & Co. KG

    (cherry picked from commit 9ba2b41e3d2f582c03b42067ae60d00490e4d3ea)

 graphics/drm-fbsd13-kmod/Makefile | 4 ++--
 graphics/drm-fbsd13-kmod/distinfo | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)
Comment 20 commit-hook freebsd_committer freebsd_triage 2022-01-28 15:30:55 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=d7af180a301bd3d16b4f64860d22dacc0d32dc39

commit d7af180a301bd3d16b4f64860d22dacc0d32dc39
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2022-01-19 14:48:31 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2022-01-28 15:28:30 +0000

    vt: Use a taskqueue to clear splash_cpu logos

    vt_fini_logos() calls vtbuf_grow(), which reallocates the console
    window's buffer using malloc(M_WAITOK).  Because vt_fini_logos() is
    called via a callout, we end up panicking if INVARIANTS is enabled.

    Fix the problem simply by clearing the logos using a timed taskqueue.
    taskqueue_thread is formally allowed to sleep; of course, if we actually
    end up sleeping to satisfy the allocation, then we have bigger problems.

    PR:             260896
    Reviewed by:    emaste
    Sponsored by:   The FreeBSD Foundation

    (cherry picked from commit 6c7e4d72b1c964e4147831b45e0b312f6ed97cd2)

 sys/dev/vt/vt_cpulogos.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)
Comment 21 Mark Johnston freebsd_committer freebsd_triage 2022-01-28 15:38:46 UTC
Fixed in main and stable/13 now.