Created attachment 184914 [details]
tar.gz archive of the /var/crash folder (excluding core files)

Hi,

while running a new tool, "sandsifter" (feature report: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221132), I get kernel panics that all look the same to me:

------------------------------------------------------------------------------
capetown.renzel.net dumped core - see /var/crash/vmcore.0

Tue Aug 1 13:57:11 CEST 2017

FreeBSD capetown.renzel.net 11.1-RELEASE FreeBSD 11.1-RELEASE #7 r321628M: Mon Jul 31 09:13:21 CEST 2017 root@capetown.renzel.net:/usr/obj/usr/src/sys/GENERIC amd64

panic: tdsendsignal(): invalid signal 0

GNU gdb (GDB) 7.12.1 [GDB v7.12.1 for FreeBSD]
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd11.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel/kernel...Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...done.
done.

Unread portion of the kernel message buffer:
panic: tdsendsignal(): invalid signal 0
cpuid = 3
KDB: stack backtrace:
#0 0xffffffff80aada97 at kdb_backtrace+0x67
#1 0xffffffff80a6bb76 at vpanic+0x186
#2 0xffffffff80a6b9e3 at panic+0x43
#3 0xffffffff80a71bbd at tdsendsignal+0xcbd
#4 0xffffffff80a70be4 at trapsignal+0x184
#5 0xffffffff80edf3cd at trap+0x58d
#6 0xffffffff80ec3671 at calltrap+0x8
Uptime: 5h3m50s
Dumping 903 out of 16282 MB:..2%..11%..22%..31%..41%..52%..61%..71%..82%..91%

Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /usr/lib/debug//boot/kernel/zfs.ko.debug...done.
done.
Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /usr/lib/debug//boot/kernel/opensolaris.ko.debug...done.
done.
Reading symbols from /boot/kernel/uhid.ko...Reading symbols from /usr/lib/debug//boot/kernel/uhid.ko.debug...done.
done.
Reading symbols from /boot/kernel/pflog.ko...Reading symbols from /usr/lib/debug//boot/kernel/pflog.ko.debug...done.
done.
Reading symbols from /boot/kernel/pf.ko...Reading symbols from /usr/lib/debug//boot/kernel/pf.ko.debug...done.
done.
__curthread () at ./machine/pcpu.h:222
222 __asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) #0 __curthread () at ./machine/pcpu.h:222
#1 doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:298
#2 0xffffffff80a6b6f1 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:366
#3 0xffffffff80a6bbb0 in vpanic (fmt=<optimized out>, ap=0xfffffe0466890780) at /usr/src/sys/kern/kern_shutdown.c:759
#4 0xffffffff80a6b9e3 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:690
#5 0xffffffff80a71bbd in tdsendsignal (p=0xfffff80044340000, td=0xfffff8004433b560, sig=<optimized out>, ksi=<unavailable>) at /usr/src/sys/kern/kern_sig.c:2137
#6 0xffffffff80a70be4 in trapsignal (td=<optimized out>, ksi=<optimized out>) at /usr/src/sys/kern/kern_sig.c:2021
#7 0xffffffff80edf3cd in trap (frame=0xfffffe0466890ac0) at /usr/src/sys/amd64/amd64/trap.c:578
#8 <signal handler called>
#9 0x000000080121e000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x866800
(kgdb)
------------------------------------------------------------------------------

It was suggested that I inform "kib" about this. If someone needs a core file, I can upload it somewhere, but only on request because it is 1 GB in size.
Can you provide a minimal test case that reproduces this issue? It might depend on the kernel configuration.

With the core dump you got, load it into kgdb and print out the trap frame by doing
frame 7
p/x *frame

Also, I am attaching a patch that covers the code paths which, from reading the code, might cause that effect.
(In reply to Konstantin Belousov from comment #1)
> Can you provide the minimal test case which reproduces this issue ? It might depend on the kernel configuration.

- install FreeBSD 11.1-RELEASE (amd64) incl. ports tree
- apply patch https://bugs.freebsd.org/bugzilla/attachment.cgi?id=184876 to ports tree
- sysctl security.bsd.map_at_zero=1
- pkg install python
- pkg install make
- cd /usr/ports/security/sandsifter
- make
- cd work/sandsifter-dff63246fed84d90118441b8ba5b5d3bdd094427
- edit "sifter.py": change the shebang line to "#!/usr/bin/env python"
- ./sifter.py --unk --dis --len --sync --tick --save -- -P1 -t -j8

It will eventually crash.

> With the core dump you get, load it into kgdb and print out the trap frame by doing
> frame 7
> p/x *frame

------------------------------------------------------------------------------
root@capetown:/var/crash/#kgdb -c vmcore.0 /usr/lib/debug/boot/kernel/kernel.debug
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
panic: tdsendsignal(): invalid signal 0
cpuid = 3
KDB: stack backtrace:
#0 0xffffffff80aada97 at kdb_backtrace+0x67
#1 0xffffffff80a6bb76 at vpanic+0x186
#2 0xffffffff80a6b9e3 at panic+0x43
#3 0xffffffff80a71bbd at tdsendsignal+0xcbd
#4 0xffffffff80a70be4 at trapsignal+0x184
#5 0xffffffff80edf3cd at trap+0x58d
#6 0xffffffff80ec3671 at calltrap+0x8
Uptime: 5h3m50s
Dumping 903 out of 16282 MB:..2%..11%..22%..31%..41%..52%..61%..71%..82%..91%

Reading symbols from /usr/lib/debug/boot/kernel/zfs.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/zfs.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/opensolaris.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/opensolaris.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/uhid.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/uhid.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/pflog.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/pflog.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/pf.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/pf.ko.debug
#0 doadump (textdump=<value optimized out>) at pcpu.h:222
222 pcpu.h: No such file or directory.
 in pcpu.h
(kgdb) bt
#0 doadump (textdump=<value optimized out>) at pcpu.h:222
#1 0xffffffff80a6b6f1 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:366
#2 0xffffffff80a6bbb0 in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:759
#3 0xffffffff80a6b9e3 in panic (fmt=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:690
#4 0xffffffff80a71bbd in tdsendsignal (p=<value optimized out>, td=<value optimized out>, sig=<value optimized out>, ksi=<value optimized out>) at /usr/src/sys/kern/kern_sig.c:2137
#5 0xffffffff80a70be4 in trapsignal (td=<value optimized out>, ksi=<value optimized out>) at /usr/src/sys/kern/kern_sig.c:2021
#6 0xffffffff80edf3cd in trap (frame=0xfffffe0466890ac0) at /usr/src/sys/amd64/amd64/trap.c:578
#7 0xffffffff80ec3671 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236
#8 0x000000080121e000 in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language: auto; currently minimal
(kgdb) frame 6
#6 0xffffffff80edf3cd in trap (frame=0xfffffe0466890ac0) at /usr/src/sys/amd64/amd64/trap.c:578
578 trapsignal(td, &ksi);
(kgdb) p/x *frame
$1 = {tf_rdi = 0x0, tf_rsi = 0x0, tf_rdx = 0x0, tf_rcx = 0x0, tf_r8 = 0x0,
  tf_r9 = 0x0, tf_rax = 0x0, tf_rbx = 0x0, tf_rbp = 0x0, tf_r10 = 0x0,
  tf_r11 = 0x0, tf_r12 = 0x0, tf_r13 = 0x0, tf_r14 = 0x0, tf_r15 = 0x0,
  tf_trapno = 0x20, tf_fs = 0x13, tf_gs = 0x1b, tf_addr = 0x0,
  tf_flags = 0x1, tf_es = 0x3b, tf_ds = 0x3b, tf_err = 0x0,
  tf_rip = 0x80121e000, tf_cs = 0x43, tf_rflags = 0x302,
  tf_rsp = 0x866800, tf_ss = 0x3b}
------------------------------------------------------------------------------
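For anyone reading the trap frame above, here is a minimal userland sketch of how two of its fields can be interpreted. It is not kernel code. The struct, the helper names and LOWER_HALF_END are made up for illustration; SEL_RPL_MASK and SEL_UPL are defined locally and only mirror, as far as I recall, the names used in FreeBSD's segment headers. The values are copied from the p/x *frame output: the low two bits of tf_cs (0x43) give privilege level 3, so the trap came from user mode, and tf_rip lies in the lower (user) half of the x86-64 address space. tf_trapno = 0x20 is 32 decimal; that it corresponds to the DTrace return-probe vector is my assumption, not something this dump proves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Defined locally for this sketch; assumed to mirror the kernel's names. */
#define SEL_RPL_MASK	0x3u	/* low two selector bits: requested privilege level */
#define SEL_UPL		0x3u	/* ring 3, i.e. user mode */

/* Start of the x86-64 non-canonical hole; lower-half addresses lie below it. */
#define LOWER_HALF_END	0x0000800000000000ULL

/* Hypothetical excerpt of the trap frame, holding only the fields used here. */
struct frame_excerpt {
	uint32_t trapno;
	uint64_t cs;
	uint64_t rip;
};

/* Values copied from the kgdb output in this report. */
static const struct frame_excerpt reported_frame = {
	.trapno = 0x20,
	.cs = 0x43,
	.rip = 0x80121e000ULL,
};

/* True when the saved code segment selector has user privilege level. */
static bool
trapped_from_usermode(const struct frame_excerpt *f)
{
	assert(f != NULL);
	return ((f->cs & SEL_RPL_MASK) == SEL_UPL);
}

/* True when the saved instruction pointer is a lower-half address. */
static bool
rip_in_lower_half(const struct frame_excerpt *f)
{
	assert(f != NULL);
	return (f->rip < LOWER_HALF_END);
}
```

Calling trapped_from_usermode(&reported_frame) and rip_in_lower_half(&reported_frame) from a small main() is enough to check both properties for the reported frame.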
Created attachment 184917 [details]
Do not call trapsignal() when signal is not intended to be sent.

Supposed fix.
(In reply to Nils Beyer from comment #2)
Well, this is too much to require from somebody in order to reproduce the problem. I wanted a self-contained C program that demonstrates the panic.

Anyway, the trapframe dump is good, and I think that the patch I attached should fix the issue. Please try it.
(In reply to Konstantin Belousov from comment #4)
Thanks for the patch; I've applied it and am currently running the "sandsifter" CPU instruction testing program again. Will let you know the result...
(In reply to Konstantin Belousov from comment #4)
Actually, you already have a C program: injector.c, a few scripts, and one dependency library. I don't know why Nils installed python and make by hand. Can you reproduce it in your test environment?

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221132#c14
Created attachment 184945 [details] Isolated repro
A commit references this bug:

Author: kib
Date: Wed Aug 2 10:12:10 UTC 2017
New revision: 321919
URL: https://svnweb.freebsd.org/changeset/base/321919

Log:
  Do not call trapsignal() after handling usermode fault or interrupt,
  when a signal is not intended to be sent. The variable holding the
  signal number to send is left uninitialized, which sometimes triggers
  invalid signal checks.

  For NMI, a return to usermode without ast processing is done. On the
  other hand, for spurious dtrace probe interrupt it is usermode which
  triggered the interrupt, so handle it through userret() as any other
  fault.

  Reported by: Nils Beyer <nbe@renzel.net>
  PR: 221151
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week

Changes:
  head/sys/amd64/amd64/trap.c
  head/sys/i386/i386/trap.c
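The pattern described in the commit log can be illustrated with a small userland sketch. This is not the code from r321919. The enum, the function names and the control flow are hypothetical and only model the idea that some traps have no signal to deliver and must skip the signal-posting step. Since reading an uninitialized variable is undefined behaviour in C, the sketch uses an explicit 0 to stand for "no signal", which is also the value reported in the panic message.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical trap categories, named for this sketch only. */
enum trap_kind {
	TRAP_KIND_PAGE_FAULT,		/* a fault that should raise a signal */
	TRAP_KIND_NMI,			/* handled without involving the process */
	TRAP_KIND_SPURIOUS_PROBE	/* probe interrupt with nothing to report */
};

/* Stand-in for the failed check: signal number 0 is never deliverable. */
static bool
signal_is_deliverable(int sig)
{
	return (sig > 0);
}

/* Pick the signal for a trap, or an explicit 0 when there is none. */
static int
signal_for_trap(enum trap_kind kind)
{
	switch (kind) {
	case TRAP_KIND_PAGE_FAULT:
		return (SIGSEGV);
	case TRAP_KIND_NMI:
	case TRAP_KIND_SPURIOUS_PROBE:
		return (0);
	}
	return (0);
}

/*
 * Handle a trap taken from user mode. Stores the signal in *posted and
 * returns true only when there is one to send; otherwise *posted is left
 * untouched and the posting step is skipped.
 */
static bool
handle_user_trap(enum trap_kind kind, int *posted)
{
	int sig;

	assert(posted != NULL);
	sig = signal_for_trap(kind);
	if (!signal_is_deliverable(sig))
		return (false);	/* nothing to send: skip the posting step */
	*posted = sig;
	return (true);
}
```

In this model, handle_user_trap(TRAP_KIND_NMI, &posted) returns false and leaves posted alone, while TRAP_KIND_PAGE_FAULT stores SIGSEGV. The real fix makes the same kind of distinction inside trap(), per the log above.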
Thanks for committing...
It looks like this PR was closed prematurely as the fix was never MFC'd.