the function ktls_kern_thread calls fpu_kern_thread. this allows the ktls thread to use floating point operations. but at this point the fpu_initialstate variable in amd64/amd64/fpu.c is not initialized. if this thread would perform fpu operations directly after calling fpu_kern_thread, a page fault will occur in fpudna when accessing fpu_initialstate. this leads to an endless boot loop. -> the fpu layer must be initialised before a kernel thread calls fpu_kern_thread or fpu_kern_enter/leave.
The issue is that fpuinitstate SYSINIT runs at SI_SUB_DRIVERS. On the one hand I'd probably like to defer creation of the KTLS thread pool until the first use, but the immediate fix is probably to init fpuinitstate earlier. I'm not sure why we don't do that in something like SI_SUB_CPU? However, I'm curious how you are managing to get the KTLS threads to try to use the FPU prior to SI_SUB_DRIVERS since they will only do crypto work in after a userland call to setsockopt() post boot to enable KTLS on a socket.
we are using a modified version of the freebsd kernel. because we do floating point operations within spinlocks a "device not available" exception may break these locks. because of this we avoid using fpu emulation for kernel threads and restore a valid state in fpu_enter and fpu_kern_thread. we also restore the fpu state within context switches to a kernel thread with FPU_KERN set. this modified kernel will not run, because ktls_kern_thread then tries to restore an uninitialized state. thus, the ktls thread itself is not the problem, but it the first hitting this situation. currently it seems not to be a problem for the freebsd 13 kernel, nevertheless its a fault.
The SI_SUB_DRIVERS for fpuinitstate was because FPU did have real ISA attachment, I am sure. Now that we only have a rudimentary PNP stub to silent some reporting(?), it probably does not matter anymore and can be changed. I believe we already had some issue with EFIRT due to late FPU initialization.
https://reviews.freebsd.org/D30512
i have applied the changes from D30512 and removed our work around in fpu_kern_thread (test for fpu_initstate != 0). the fpu state is still restored at the end of our implementation of fpu_kern_thread. with these changes the kernel is booting and the issue is gone.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=c56de177d28295b4b07ad0b17e4faf4f11c9e4f2 commit c56de177d28295b4b07ad0b17e4faf4f11c9e4f2 Author: Konstantin Belousov <kib@FreeBSD.org> AuthorDate: 2021-05-27 16:26:10 +0000 Commit: Konstantin Belousov <kib@FreeBSD.org> CommitDate: 2021-05-28 18:38:32 +0000 x86: initialize initial FPU state earlier Make it under SI_SUB_CPU sysinit, instead of much later SI_SUB_DRIVERS. The SI_SUB_DRIVERS survived from times when FPU used real ISA attachment, now it is only pnp stub claiming id. PR: 255997 Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D30512 sys/amd64/amd64/fpu.c | 2 +- sys/i386/i386/npx.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=14bc67f1e78806c683df8d7ae65d14db8216d5a3 commit 14bc67f1e78806c683df8d7ae65d14db8216d5a3 Author: Konstantin Belousov <kib@FreeBSD.org> AuthorDate: 2021-05-27 16:26:10 +0000 Commit: Konstantin Belousov <kib@FreeBSD.org> CommitDate: 2021-06-04 00:43:59 +0000 x86: initialize initial FPU state earlier PR: 255997 (cherry picked from commit c56de177d28295b4b07ad0b17e4faf4f11c9e4f2) sys/amd64/amd64/fpu.c | 2 +- sys/i386/i386/npx.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
^Triage: committed and MFCed back in 2021.