At least powerpc64 and powerpc64le kernels panic when copyin/copyout functions are called by external kernel modules (like pfsync, zfs and linuxulator).

The panic with exception 0x480 (instruction segment exception) occurs in a context where the functions are set as pointers in cpuset_copy_cb struct. It doesn't crash when functions are called directly (without the struct) or wrapped to be called through a local function wrapper.

This affects FreeBSD 13.1/STABLE and 14/CURRENT.

How to reproduce:
1- Boot FreeBSD 13.1/STABLE
2- kldload pfsync

Results:

fatal kernel trap:

   exception       = 0x480 (instruction segment exception)
   virtual address = 0x38bf00ec7fc3f378
   srr0            = 0x38bf00ec7fc3f378 (0x78bf00ec7fc3f378)
   srr1            = 0x8000000000009032
   current msr     = 0x8000000000009032
   lr              = 0xc008000051a143f4 (0x8000051a143f4)
   frame           = 0xc00800001b5afd50
   curthread       = 0xc0080000518330e0
          pid = 832, comm = ifconfig

panic: instruction segment exception trap
cpuid = 1
time = 1664564648
KDB: stack backtrace:
0xc00800001b5af970: at kdb_backtrace+0x60
0xc00800001b5afa80: at vpanic+0x1b8
0xc00800001b5afb30: at panic+0x44
0xc00800001b5afb60: at trap+0x324
0xc00800001b5afc90: at powerpc_interrupt+0x1cc
0xc00800001b5afd20: kernel ISE trap @ 0x38bf00ec7fc3f378 by 0x38bf00ec7fc3f378: srr1=0x8000000000009032
            r1=0xc00800001b5affd0 cr=0x28020a40 xer=0x20040000 ctr=0x38bf00ec7fc3f378 r2=0xc008000051a348e8 frame=0xc00800001b5afd50
0xc00800001b5affd0: at pfsyncioctl+0x368
0xc00800001b5b00f0: at ifioctl+0xc44
0xc00800001b5b0290: at soo_ioctl+0x1b4
0xc00800001b5b0320: at kern_ioctl+0x3d4
0xc00800001b5b03f0: at sys_ioctl+0x134
0xc00800001b5b0520: at syscall+0x194
0xc00800001b5b0620: at trap+0x5e8
0xc00800001b5b0750: at powerpc_interrupt+0x1cc
0xc00800001b5b07e0: user SC trap by 0x8013c5be0: srr1=0x800000000280f932
            r1=0xfffffbfffe0c0 cr=0x22251682 xer=0 ctr=0x8013c5bd0 r2=0x8014a2478 frame=0xc00800001b5b0810
KDB: enter: panic
[ thread pid 832 tid 100073 ]
Stopped at      kdb_enter+0x70: ori     r0, r0, 0x0
db>
Links related to the issue:

First (naive) attempt at a fix:
https://reviews.llvm.org/D133745

LLD reproduce tarball:
https://github.com/llvm/llvm-project/issues/57722

Userland test case (an attempt to reproduce a similar issue):
https://github.com/llvm/llvm-project/issues/57851

The problem isn't seen when the kernel is linked with LLD 9. The LLVM/LLD behavior change was traced to this commit:
https://reviews.llvm.org/rGdc06b0bc9ad055d06535462d91bfc2a744b2f589

The discussion with the LLVM community is still ongoing. The following temporary workaround was proposed on the FreeBSD side:
https://reviews.freebsd.org/D36234
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=db79bf75ac9eb1b5678ccbaebb45fb88c0e0e1e3

commit db79bf75ac9eb1b5678ccbaebb45fb88c0e0e1e3
Author:     Alfredo Dal'Ava Junior <alfredo@FreeBSD.org>
AuthorDate: 2022-10-03 14:51:05 +0000
Commit:     Alfredo Dal'Ava Junior <alfredo@FreeBSD.org>
CommitDate: 2022-10-03 15:03:09 +0000

    powerpc: cpuset: add local functions for copyin/copyout

    Add local functions to work around an instruction segment trap (panic)
    when the indirect functions copyin and copyout are called by an
    external loadable kernel module (e.g. pfsync, zfs and linuxulator).

    The crash was triggered by change
    47a57144af25a7bd768b29272d50a36fdf2874ba, but a kernel binary linked
    with LLD 9 works fine. An LLVM bisect shows that LLD behavior changed
    after dc06b0bc9ad055d06535462d91bfc2a744b2f589.

    This is known to affect powerpc targets only, and the final fix is
    still being discussed with the LLVM community.

    PR:             266730
    Reviewed by:    luporl, jhibbits (on IRC, previous version)
    MFC after:      2 days
    Sponsored by:   Instituto de Pesquisas Eldorado (eldorado.org.br)
    Differential Revision: https://reviews.freebsd.org/D36234

 sys/kern/kern_cpuset.c | 36 ++++++++++++++++++++++++++++++++++--
 1 file changed, 34 insertions(+), 2 deletions(-)
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=05f9810b31973fb0d5f07a6eb9a12a22f81c38ad

commit 05f9810b31973fb0d5f07a6eb9a12a22f81c38ad
Author:     Alfredo Dal'Ava Junior <alfredo@FreeBSD.org>
AuthorDate: 2022-10-03 14:51:05 +0000
Commit:     Alfredo Dal'Ava Junior <alfredo@FreeBSD.org>
CommitDate: 2022-10-06 00:14:19 +0000

    powerpc: cpuset: add local functions for copyin/copyout

    Add local functions to work around an instruction segment trap (panic)
    when the indirect functions copyin and copyout are called by an
    external loadable kernel module (e.g. pfsync, zfs and linuxulator).

    The crash was triggered by change
    47a57144af25a7bd768b29272d50a36fdf2874ba, but a kernel binary linked
    with LLD 9 works fine. An LLVM bisect shows that LLD behavior changed
    after dc06b0bc9ad055d06535462d91bfc2a744b2f589.

    This is known to affect powerpc targets only, and the final fix is
    still being discussed with the LLVM community.

    PR:             266730
    Reviewed by:    luporl, jhibbits (on IRC, previous version)
    MFC after:      2 days
    Sponsored by:   Instituto de Pesquisas Eldorado (eldorado.org.br)
    Differential Revision: https://reviews.freebsd.org/D36234

    (cherry picked from commit db79bf75ac9eb1b5678ccbaebb45fb88c0e0e1e3)

 sys/kern/kern_cpuset.c | 36 ++++++++++++++++++++++++++++++++++--
 1 file changed, 34 insertions(+), 2 deletions(-)
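For readers following along, the shape of the workaround described in the commit message (routing the callbacks through local wrapper functions instead of storing copyin/copyout directly in the callback struct) can be sketched in userland C. This is only an illustration under stated assumptions, not the committed patch: the struct name copy_cb, its field names, the *_local wrappers, and the roundtrip() helper are hypothetical, and the copyin/copyout stand-ins here are plain memcpy calls rather than the kernel functions.

```c
#include <stddef.h>
#include <string.h>

/*
 * Userland stand-ins for the kernel's copyin()/copyout(). In the kernel
 * these cross the user/kernel boundary; here they only copy memory.
 */
static int
copyin(const void *uaddr, void *kaddr, size_t len)
{
	memcpy(kaddr, uaddr, len);
	return (0);
}

static int
copyout(const void *kaddr, void *uaddr, size_t len)
{
	memcpy(uaddr, kaddr, len);
	return (0);
}

/* Hypothetical callback table, loosely modelled on cpuset_copy_cb. */
struct copy_cb {
	int (*cb_copyin)(const void *, void *, size_t);
	int (*cb_copyout)(const void *, void *, size_t);
};

/*
 * Local wrappers: the table stores the addresses of these functions,
 * which are defined in the same file as the table, and each wrapper
 * calls the real function directly.
 */
static int
copyin_local(const void *uaddr, void *kaddr, size_t len)
{
	return (copyin(uaddr, kaddr, len));
}

static int
copyout_local(const void *kaddr, void *uaddr, size_t len)
{
	return (copyout(kaddr, uaddr, len));
}

static const struct copy_cb copy_cb_local = {
	.cb_copyin = copyin_local,
	.cb_copyout = copyout_local,
};

/* Hypothetical consumer that only ever calls through the table. */
static int
roundtrip(const struct copy_cb *cb, const void *src, void *dst, size_t len)
{
	char tmp[64];
	int error;

	if (len > sizeof(tmp))
		return (-1);
	error = cb->cb_copyin(src, tmp, len);
	if (error != 0)
		return (error);
	return (cb->cb_copyout(tmp, dst, len));
}
```

A caller would pass &copy_cb_local to roundtrip(); the indirect calls then land in the local wrappers, which is the property the workaround relies on.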
I saw what may be the same crash on amd64 running 12.4-CURRENT, first boot after upgrading from 12.3. I started the microcode_update service and the system promptly crashed.

#4  0xffffffff810d66af in trap_fatal (frame=<value optimized out>, eva=<value optimized out>)
    at /data/freebsd/12/sys/amd64/amd64/trap.c:921
#5  0xffffffff810d66ff in trap_pfault (frame=0xfffffe002f78f9e0, signo=<value optimized out>, ucode=<value optimized out>)
    at pcpu_aux.h:55
#6  0xffffffff810aec68 in calltrap ()
    at /data/freebsd/12/sys/amd64/amd64/exception.S:289
#7  0xffffffff810d2b73 in copyout_nosmap_std ()
    at /data/freebsd/12/sys/amd64/amd64/support.S:805
#8  0xffffffff80c29f2d in uiomove_faultflag (cp=0xfffffe002686a000, n=98, uio=0xfffffe002f78fba0, nofault=<value optimized out>)
    at /data/freebsd/12/sys/kern/subr_uio.c:254
#9  0xffffffff80c32333 in pipe_read (fp=0xfffff80012598550, uio=0xfffffe002f78fba0, active_cred=<value optimized out>, flags=<value optimized out>, td=<value optimized out>)
    at /data/freebsd/12/sys/kern/sys_pipe.c:712
#10 0xffffffff80c2f3a5 in dofileread (td=<value optimized out>, fd=0, fp=<value optimized out>, auio=0xfffffe002f78fba0, offset=<value optimized out>, flags=<value optimized out>)
    at file.h:317
#11 0xffffffff80c2ef20 in sys_read (td=0xfffff8001cade740, uap=Unhandled dwarf expression opcode 0xa3
)
    at /data/freebsd/12/sys/kern/sys_generic.c:289
#12 0xffffffff810d7267 in amd64_syscall (td=0xfffff8001cade740, traced=0)
    at subr_syscall.c:144
#13 0xffffffff810af58e in fast_syscall_common ()
    at /data/freebsd/12/sys/amd64/amd64/exception.S:582

The active process was "logger".
^Triage: assign to committer that resolved. To John F. Carr: please let us know if you are still seeing the amd64 crash on a newer version of FreeBSD.
Funny you should ask... Today I updated my amd64 13.2-STABLE system for the first time in a month. It crashed with a similar error on the first boot. A microcode update caused a page fault trying to send data to the logger. core.txt contents below.

This panic hadn't happened before on this system. Maybe the update to llvm17 affected code generation.

Updating CPU Microcode...

Fatal trap 12: page fault while in kernel mode
cpuid = 45; apic id = 3b
fault virtual address   = 0x388e97560000
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff81088c43
stack pointer           = 0x28:0xfffffe03a79d7ca0
frame pointer           = 0x28:0xfffffe03a79d7ca0
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 145 (logger)
trap number             = 12
panic: page fault
cpuid = 45
time = 1705192820
KDB: stack backtrace:
#0 0xffffffff80c199b5 at kdb_backtrace+0x65
#1 0xffffffff80bcebf2 at vpanic+0x152
#2 0xffffffff80bce9f3 at panic+0x43
#3 0xffffffff8108c56c at trap_fatal+0x38c
#4 0xffffffff8108c5d7 at trap_pfault+0x67
#5 0xffffffff81060f08 at calltrap+0x8
#6 0xffffffff80c337d5 at uiomove_faultflag+0x135 [this is a call to copyout() -jfc]
#7 0xffffffff80c3de55 at pipe_read+0x2f5
#8 0xffffffff80c3a586 at dofileread+0x86
#9 0xffffffff80c3a0d2 at sys_read+0xc2
#10 0xffffffff8108ced0 at amd64_syscall+0x140
#11 0xffffffff8106181b at fast_syscall_common+0xf8
Uptime: 38s
^Triage: create new PR 276426 to hold the amd64 content.