Created attachment 162042: core.txt
An assertion is triggered when /usr/share/dtrace/disklatency is executed. core.txt is attached.
It seems that the problem has to do with the fact that args[1]->device_name is not always set in the io:::{start,end} probes. For some reason, it only causes a crash when there are multiple aggregation keys. For instance,

    dtrace -n 'io:::start {@[args[1]->device_name, execname] = count()}'

triggers the crash for me, but

    dtrace -n 'io:::start {@[args[1]->device_name] = count()}'

does not.

I'm also not sure why this problem has only appeared recently. I've definitely run this script before: it triggered the bug fixed in r278114. The first example in the dtrace_io(4) man page also triggers this crash. (And I definitely tested it when I wrote it!) Nonetheless, reverting several recent DTrace changes doesn't make the problem go away.
Digging somewhat further, it seems the problem has to do with the encoding of array lengths. Contrary to what I thought last night, struct devstat's device_name field is an array, not a pointer. And it turns out that all the array types in the kernel's CTF info have length 0 for some reason. kgdb shows the same thing, so I'm guessing it's related to the clang 3.7 import. Interestingly, the new kgdb support has the correct array sizes:

kgdb710:

    (kgdb) ptype struct devstat
    type = struct devstat {
        u_int sequence0;
        int allocated;
        u_int start_count;
        u_int end_count;
        struct bintime busy_from;
        struct {
            struct devstat *stqe_next;
        } dev_links;
        u_int32_t device_number;
        char device_name[16];
    ...

kgdb (base):

    (kgdb) ptype struct devstat
    type = struct devstat {
        u_int sequence0;
        int allocated;
        u_int start_count;
        u_int end_count;
        struct bintime busy_from;
        struct {
            struct devstat *stqe_next;
        } dev_links;
        u_int32_t device_number;
        char device_name[0];
    ...

I've fixed a few bugs related to zero-length arrays in the CTF code in the past, so it's not surprising that it's causing problems for DTrace. There are multiple problems here: we need to figure out why the array lengths are now zero, and fix DTrace to handle this case properly (i.e. without crashing the kernel).
Seems that clang switched from using DW_AT_upper_bound to DW_AT_count:

    < 1><0x0000294d>    DW_TAG_array_type
                          DW_AT_type                  <0x00000137>
    < 2><0x00002952>      DW_TAG_subrange_type
                            DW_AT_type                  <0x0000013f>
                            DW_AT_upper_bound           16

vs.

    < 1><0x0000410d>    DW_TAG_array_type
                          DW_AT_type                  <0x000000c0>
    < 2><0x00004112>      DW_TAG_subrange_type
                            DW_AT_type                  <0x00000100>
                            DW_AT_count                 0x00000010

and sure enough, the CTF tools only recognize the former. That should be straightforward to fix.
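To illustrate the two encodings, here is a minimal sketch of how a converter might derive an array's element count from a subrange entry. It is not the actual ctfconvert code: struct subrange_attrs and subrange_nelems() are hypothetical names made up for this example, and the sketch assumes a lower bound of 0, as in C. Under that assumption DW_AT_upper_bound holds the highest valid index, so the element count is one more than its value, whereas DW_AT_count holds the element count directly.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical, simplified view of the attributes found on a
 * DW_TAG_subrange_type entry.  A real tool would read these from the
 * DWARF data through a DWARF library; this struct only stands in for
 * that lookup.
 */
struct subrange_attrs {
	bool		has_upper_bound;	/* DW_AT_upper_bound present */
	uint64_t	upper_bound;		/* highest valid index */
	bool		has_count;		/* DW_AT_count present */
	uint64_t	count;			/* number of elements */
};

/*
 * Return the number of array elements described by a subrange entry,
 * assuming a lower bound of 0.  A result of 0 means that neither
 * attribute was present, i.e. the length is unknown.
 */
static uint64_t
subrange_nelems(const struct subrange_attrs *sr)
{
	if (sr->has_upper_bound)
		return (sr->upper_bound + 1);	/* bound is an index */
	if (sr->has_count)
		return (sr->count);		/* already a count */
	return (0);
}
```

A tool that only checks the first case falls through to 0 whenever the compiler emits DW_AT_count instead, which matches the zero-length arrays seen in the CTF data above.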
A commit references this bug:

Author: markj
Date: Sat Oct 24 03:14:36 UTC 2015
New revision: 289866
URL: https://svnweb.freebsd.org/changeset/base/289866

Log:
  DWARF emitted by clang 3.7 encodes array sizes using the DW_AT_count
  attribute rather than DW_AT_upper_bound. Teach ctfconvert about this so
  that array type sizes are encoded correctly.

  PR: 203772
  MFC after: 1 week

Changes:
  head/cddl/contrib/opensolaris/tools/ctf/cvt/dwarf.c
It's working now, thanks!
A commit references this bug:

Author: markj
Date: Fri Nov 13 01:27:20 UTC 2015
New revision: 290738
URL: https://svnweb.freebsd.org/changeset/base/290738

Log:
  MFC r289866:
  DWARF emitted by clang 3.7 encodes array sizes using the DW_AT_count
  attribute rather than DW_AT_upper_bound. Teach ctfconvert about this so
  that array type sizes are encoded correctly.

  PR: 203772

Changes:
_U  stable/10/
  stable/10/cddl/contrib/opensolaris/tools/ctf/cvt/dwarf.c
Closing. Problem fixed. Thanks!