Using ktrace on a CloudABI executable sometimes hangs in such a way that the process cannot be killed.

FreeBSD xx 11.0-ALPHA5 FreeBSD 11.0-ALPHA5 #0 r302164: Fri Jun 24 02:51:52 UTC 2016 root@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64

# kldload cloudabi
# kldload cloudabi64
$ pkg info | grep cloud
cloudabi-0.6 Constants, types and data structures used by CloudABI
cloudabi-toolchain-1.4 C and C++ toolchain for CloudABI
cloudabi-utils-0.11 Utilities for running CloudABI programs
x86_64-unknown-cloudabi-cloudabi-0.6_1 cloudabi for x86_64-unknown-cloudabi
x86_64-unknown-cloudabi-cloudlibc-0.40_1 cloudlibc for x86_64-unknown-cloudabi
x86_64-unknown-cloudabi-compiler-rt-3.8.0_4 compiler-rt for x86_64-unknown-cloudabi
x86_64-unknown-cloudabi-curl-7.49.1_2 curl for x86_64-unknown-cloudabi
x86_64-unknown-cloudabi-cxx-runtime-1.0_2 cxx-runtime for x86_64-unknown-cloudabi
x86_64-unknown-cloudabi-libcxx-3.8.0_9 libcxx for x86_64-unknown-cloudabi
x86_64-unknown-cloudabi-libcxxabi-3.8.0_6 libcxxabi for x86_64-unknown-cloudabi
x86_64-unknown-cloudabi-libressl-2.4.1_1 libressl for x86_64-unknown-cloudabi
x86_64-unknown-cloudabi-libunwind-3.8.0_5 libunwind for x86_64-unknown-cloudabi
x86_64-unknown-cloudabi-lua-5.3.3_2 lua for x86_64-unknown-cloudabi
x86_64-unknown-cloudabi-zlib-1.2.8_11 zlib for x86_64-unknown-cloudabi
$ : | ktrace /usr/local/x86_64-unknown-cloudabi/bin/lua

Here is a kernel stack trace of the hung process:

(kgdb) where
#0 sched_switch (td=0xfffff8006a217000, newtd=0xfffff80007380a00, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
#1 0xffffffff80a52a87 in mi_switch (flags=260, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:455
#2 0xffffffff80a95d27 in sleepq_switch (wchan=<value optimized out>, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:557
#3 0xffffffff80a95bf3 in sleepq_wait (wchan=0xffffffff81c34400, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:637
#4 0xffffffff809e8cc4 in _cv_wait (cvp=<value optimized out>, lock=<value optimized out>) at /usr/src/sys/kern/kern_condvar.c:144
#5 0xffffffff80aa3132 in vmem_xalloc (vm=<value optimized out>, size0=<value optimized out>, align=<value optimized out>, phase=0, nocross=<value optimized out>, minaddr=0, maxaddr=<value optimized out>, flags=8194, addrp=<value optimized out>) at /usr/src/sys/kern/subr_vmem.c:1209
#6 0xffffffff80aa2e72 in vmem_alloc (vm=0xffffffff81c34380, size=14244610048, flags=8194, addrp=0xfffffe01212959f0) at /usr/src/sys/kern/subr_vmem.c:1095
#7 0xffffffff80d2c193 in kmem_malloc (vmem=0xffffffff81c34380, size=14244610048, flags=2) at /usr/src/sys/vm/vm_kern.c:313
#8 0xffffffff80d24d46 in uma_large_malloc (size=14244610048, wait=2) at /usr/src/sys/vm/uma_core.c:1106
#9 0xffffffff80a25833 in malloc (size=<value optimized out>, mtp=0xffffffff818f0780, flags=2) at /usr/src/sys/kern/kern_malloc.c:510
#10 0xffffffff80a189ad in ktrsyscall (code=35, narg=1780576256, args=0xfffffe0121295b80) at /usr/src/sys/kern/kern_ktrace.c:451
#11 0xffffffff80eb893e in amd64_syscall (td=0xfffff8006a217000, traced=0) at subr_syscall.c:77
#12 0xffffffff80e9897b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396
#13 0x000000000103f42b in ?? ()

Clearly narg in ktrsyscall is garbage. It looks like cloudabi64_fetch_syscall_args() is not filling in sa->narg.
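
The allocation size in frames #6 to #8 lines up with the garbage narg in frame #10. The sketch below is only a userland illustration of that arithmetic. It assumes ktrsyscall sizes its buffer as narg times the size of a register (8 bytes on amd64); that code is not shown in this report. The names reg_t and trace_buflen are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel's register_t on amd64 (assumption: 8 bytes). */
typedef int64_t reg_t;

/*
 * Hypothetical helper: number of bytes needed to record narg system call
 * arguments, assuming one register-sized slot per argument.
 */
static uint64_t
trace_buflen(int narg)
{
	assert(narg >= 0);
	return ((uint64_t)sizeof(reg_t) * (uint64_t)narg);
}
```

With narg=1780576256 from frame #10, trace_buflen() gives 14244610048, the same size that is passed down to uma_large_malloc, kmem_malloc and vmem_alloc in the trace. A request of roughly 14 GB made with a waiting allocation flag would explain why the thread sleeps in vmem_xalloc and never returns.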
Created attachment 172143
Properly set sa->narg

Hi Michael,

Thanks for reporting this bug. Can you let me know whether the attached patch fixes the problem for you? If so, I'll make sure to commit it before 11.0.

Thanks,
Ed
Ed,

With your patch, all seems well. No hangs in 1000 tries, kdump output looks reasonable, and the size of ktrace.out is consistent from run to run.

Thanks,
- Michael
Perfect! Thanks for testing! As we've already entered the freeze for 11.0, I've sent out a commit approval request to re@. Will commit the patch as soon as I get the approval.
A commit references this bug:

Author: ed
Date: Fri Jul 8 20:09:22 UTC 2016
New revision: 302448
URL: https://svnweb.freebsd.org/changeset/base/302448

Log:
  Don't forget to set sa->narg for CloudABI system calls.

  It turns out that this value is not used within the system call code under normal conditions, except when using tracing tools like ktrace. If we forget to set this value, it is set to random garbage. This may cause ktrace to hang indefinitely, making it impossible to kill.

  Reported by: Michael Plass
  PR: 210800
  MFC before: 11.0-RELEASE

Changes:
  head/sys/amd64/cloudabi64/cloudabi64_sysvec.c
  head/sys/arm64/cloudabi64/cloudabi64_sysvec.c
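
For readers without the diff at hand, here is a rough userland sketch of the kind of change the log describes: an argument-fetch routine that also records the argument count. It is not the committed patch. All types, the table contents, and the function name are simplified, hypothetical stand-ins, and the sketch assumes the count is taken from a per-syscall table entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a system call table entry. */
struct sysent_sketch {
	int sy_narg;		/* number of arguments this call takes */
};

/* Hypothetical, simplified stand-in for the per-call argument state. */
struct syscall_args_sketch {
	unsigned int code;
	const struct sysent_sketch *callp;
	int narg;
};

/* Made-up table: call 0 takes 1 argument, call 1 takes 3, call 2 takes 2. */
static const struct sysent_sketch table_sketch[] = { { 1 }, { 3 }, { 2 } };
#define	NSYSCALLS_SKETCH (sizeof(table_sketch) / sizeof(table_sketch[0]))

/*
 * Look up the table entry for 'code' and fill in sa. Returns 0 on success
 * and -1 for an unknown call number, in which case sa is left untouched.
 */
static int
fetch_args_sketch(struct syscall_args_sketch *sa, unsigned int code)
{
	assert(sa != NULL);
	if (code >= NSYSCALLS_SKETCH)
		return (-1);
	sa->code = code;
	sa->callp = &table_sketch[code];
	/* Without this assignment, narg keeps whatever value it had before. */
	sa->narg = sa->callp->sy_narg;
	return (0);
}
```

If the last assignment is left out, a caller that later reads narg (as a tracing hook would) sees whatever happened to be in that field, which matches the garbage value seen in the stack trace.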
A commit references this bug:

Author: ed
Date: Tue Jul 12 06:25:28 UTC 2016
New revision: 302627
URL: https://svnweb.freebsd.org/changeset/base/302627

Log:
  MFC r302448:

  Don't forget to set sa->narg for CloudABI system calls.

  It turns out that this value is not used within the system call code under normal conditions, except when using tracing tools like ktrace. If we forget to set this value, it is set to random garbage. This may cause ktrace to hang indefinitely, making it impossible to kill.

  Approved by: re@
  Reported by: Michael Plass
  PR: 210800

Changes:
_U  stable/11/
  stable/11/sys/amd64/cloudabi64/cloudabi64_sysvec.c
  stable/11/sys/arm64/cloudabi64/cloudabi64_sysvec.c
Looks like this is fully fixed now. 11.0-BETA2 should be the first version to include this fix. Thanks again for reporting this issue and enjoy using CloudABI!