Bug 255997

Summary: fpu_kern_thread is called before fpu_initialstate has been set in ktls_kern_thread
Product: Base System Reporter: Henning Zabel <h.zabel>
Component: kernAssignee: freebsd-bugs (Nobody) <bugs>
Status: New ---    
Severity: Affects Many People CC: emaste, jhb, kib
Priority: ---    
Version: 13.0-STABLE   
Hardware: amd64   
OS: Any   

Description Henning Zabel 2021-05-19 11:34:13 UTC
the function ktls_kern_thread calls fpu_kern_thread. this allows the ktls thread to use floating point operations. but at this point the fpu_initialstate variable in amd64/amd64/fpu.c is not initialized.

if this thread would perform fpu operations directly after calling fpu_kern_thread, a page fault will occur in fpudna when accessing fpu_initialstate.

this leads to an endless boot loop.

-> the fpu layer must be initialised before a kernel thread
calls fpu_kern_thread or fpu_kern_enter/leave.
Comment 1 John Baldwin freebsd_committer freebsd_triage 2021-05-27 15:20:47 UTC
The issue is that fpuinitstate SYSINIT runs at SI_SUB_DRIVERS.  On the one hand I'd probably like to defer creation of the KTLS thread pool until the first use, but the immediate fix is probably to init fpuinitstate earlier.  I'm not sure why we don't do that in something like SI_SUB_CPU?

However, I'm curious how you are managing to get the KTLS threads to try to use the FPU prior to SI_SUB_DRIVERS since they will only do crypto work in after a userland call to setsockopt() post boot to enable KTLS on a socket.
Comment 2 Henning Zabel 2021-05-27 15:42:02 UTC
we are using a modified version of the freebsd kernel. because we do floating point operations within spinlocks a "device not available" exception may break these locks. because of this we avoid using fpu emulation for kernel threads and restore a valid state in fpu_enter and fpu_kern_thread. we also restore the fpu state within context switches to a kernel thread with FPU_KERN set. 

this modified kernel will not run, because ktls_kern_thread then tries to restore an uninitialized state. thus, the ktls thread itself is not the problem, but it the first hitting this situation.

currently it seems not to be a problem for the freebsd 13 kernel, nevertheless its a fault.
Comment 3 Konstantin Belousov freebsd_committer freebsd_triage 2021-05-27 15:55:37 UTC
The SI_SUB_DRIVERS for fpuinitstate was because FPU did have real ISA attachment,
I am sure.  Now that we only have a rudimentary PNP stub to silent some
reporting(?), it probably does not matter anymore and can be changed.
I believe we already had some issue with EFIRT due to late FPU initialization.
Comment 4 Konstantin Belousov freebsd_committer freebsd_triage 2021-05-27 16:37:19 UTC
https://reviews.freebsd.org/D30512
Comment 5 Henning Zabel 2021-05-28 09:51:24 UTC
i have applied the changes from D30512 and removed our work around in fpu_kern_thread (test for fpu_initstate != 0). the fpu state is still restored at the end of our implementation of fpu_kern_thread. 

with these changes the kernel is booting and the issue is gone.
Comment 6 commit-hook freebsd_committer freebsd_triage 2021-05-28 18:39:41 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=c56de177d28295b4b07ad0b17e4faf4f11c9e4f2

commit c56de177d28295b4b07ad0b17e4faf4f11c9e4f2
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2021-05-27 16:26:10 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2021-05-28 18:38:32 +0000

    x86: initialize initial FPU state earlier

    Make it under SI_SUB_CPU sysinit, instead of much later SI_SUB_DRIVERS.
    The SI_SUB_DRIVERS survived from times when FPU used real ISA attachment,
    now it is only pnp stub claiming id.

    PR:     255997
    Reviewed by:    jhb
    Sponsored by:   The FreeBSD Foundation
    MFC after:      1 week
    Differential revision:  https://reviews.freebsd.org/D30512

 sys/amd64/amd64/fpu.c | 2 +-
 sys/i386/i386/npx.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
Comment 7 commit-hook freebsd_committer freebsd_triage 2021-06-04 00:45:05 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=14bc67f1e78806c683df8d7ae65d14db8216d5a3

commit 14bc67f1e78806c683df8d7ae65d14db8216d5a3
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2021-05-27 16:26:10 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2021-06-04 00:43:59 +0000

    x86: initialize initial FPU state earlier

    PR:     255997

    (cherry picked from commit c56de177d28295b4b07ad0b17e4faf4f11c9e4f2)

 sys/amd64/amd64/fpu.c | 2 +-
 sys/i386/i386/npx.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)