Bug 219451 - [dtrace] Certain llquantize() parameters trigger assertion
Summary: [dtrace] Certain llquantize() parameters trigger assertion
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: bin (show other bugs)
Version: 11.0-STABLE
Hardware: Any Any
: --- Affects Some People
Assignee: Mark Johnston
Keywords: patch
Depends on:
Reported: 2017-05-22 09:46 UTC by Fabian Keil
Modified: 2017-08-30 02:16 UTC (History)
4 users (show)

See Also:
koobs: mfc-stable11+

libdtrace: Prevent an assertion from triggering with certain llquantize() parameters (5.04 KB, patch)
2017-05-22 09:46 UTC, Fabian Keil
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Fabian Keil 2017-05-22 09:46:18 UTC
Created attachment 182796 [details]
libdtrace: Prevent an assertion from triggering with certain llquantize() parameters

On a system based on r318579/0c33b79a4 the following dtrace command reliably triggers an
assertion when printing output:

         fk@t520 ~ $sudo dtrace -n 'syscall::read:return /execname == "privoxy"/ { @[execname] = llquantize(arg0, 100, 0, 10, 100); @m = max(arg0)}'
                     9800 |                                         0
                     9900 |                                         0
                    10000 |@@@@@@@@@@@@@@@@@@@@                     37
                    20000 |                                         0
         Assertion failed: (value < next), file /usr/src/cddl/lib/libdtrace/../../../cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume.c, line 1083.
         Abort trap
         (gdb) where
         #0  0x00000008011effda in thr_kill () from /lib/libc.so.7
         #1  0x00000008011effa4 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
         #2  0x00000008011eff19 in abort () at /usr/src/lib/libc/stdlib/abort.c:65
         #3  0x000000080088c3b2 in __assert (expr=0x8008d3172 "value < next", file=0x8008d3078 "/usr/src/cddl/lib/libdtrace/../../../cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume.c", line=1083)
             at /usr/src/cddl/lib/libdtrace/../../../cddl/compat/opensolaris/include/assert.h:56
         #4  0x000000080088c190 in dt_print_llquantize (dtp=0x802633000, fp=0x8014c37e8, addr=0x80269a110, size=7840, normal=1) at /usr/src/cddl/lib/libdtrace/../../../cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume
         #5  0x000000080088e37d in dt_print_datum (dtp=0x802633000, fp=0x8014c37e8, rec=0x8026900e8, addr=0x80269a110 "d", size=7848, aggdata=0x802690150, normal=1, pd=0x7fffffffe750)
             at /usr/src/cddl/lib/libdtrace/../../../cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume.c:2211
         #6  0x000000080088dc12 in dt_print_aggs (aggsdata=0x7fffffffe630, naggvars=1, arg=0x7fffffffe750) at /usr/src/cddl/lib/libdtrace/../../../cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume.c:2313
         #7  0x000000080088e6cf in dt_print_agg (aggdata=0x802690150, arg=0x7fffffffe750) at /usr/src/cddl/lib/libdtrace/../../../cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume.c:2361
         #8  0x0000000800895f8b in dt_aggregate_walk_sorted (dtp=0x802633000, func=0x80088e610 <dt_print_agg>, arg=0x7fffffffe750, sfunc=0x0)
             at /usr/src/cddl/lib/libdtrace/../../../cddl/contrib/opensolaris/lib/libdtrace/common/dt_aggregate.c:1585
         #9  0x0000000800895d39 in dtrace_aggregate_walk_sorted (dtp=0x802633000, func=0x80088e610 <dt_print_agg>, arg=0x7fffffffe750)
             at /usr/src/cddl/lib/libdtrace/../../../cddl/contrib/opensolaris/lib/libdtrace/common/dt_aggregate.c:1605
         #10 0x0000000800897f12 in dtrace_aggregate_print (dtp=0x802633000, fp=0x8014c37e8, func=0x800895d10 <dtrace_aggregate_walk_sorted>)
             at /usr/src/cddl/lib/libdtrace/../../../cddl/contrib/opensolaris/lib/libdtrace/common/dt_aggregate.c:2130
         #11 0x0000000000403a5e in main (argc=<optimized out>, argv=<optimized out>) at /usr/src/cddl/usr.sbin/dtrace/../../../cddl/contrib/opensolaris/cmd/dtrace/dtrace.c:2005
         (gdb) f 4
         #4  0x000000080088c190 in dt_print_llquantize (dtp=0x802633000, fp=0x8014c37e8, addr=0x80269a110, size=7840, normal=1) at /usr/src/cddl/lib/libdtrace/../../../cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume
         1083                       assert(value < next);
         (gdb) p step
         $2915 = 77662796314522419
         (gdb) p value
         $2916 = 7834326075677972872
         (gdb) p next
         $2917 = 7766279631452241920

It works as expected when replacing the 10 with a 5.

Various other parameter combinations work as expected as well and I've used
similar commands for weeks without issues.

The problem is reproducible with other execnames as long as the probe fires.

The "@m = max(arg0)" part isn't required to trigger the assertion but I only noticed it
after already patching the system where libdtrace is build with reduced optimizations.

The attached patch prevents the assertion from triggering but may not be the best solution.

The code flow in dt_print_llquantize() seems strange to me and maybe the
loop should break if "bin" reaches "last_bin" instead. My impression is that
it does a bunch of cycles at the end without doing meaningful work.

Obtained from: ElectroBSD
Comment 1 Mark Johnston freebsd_committer 2017-08-16 17:49:29 UTC
Slightly simpler repro that we can put in a test case:

# dtrace -q -n 'syscall:::return {@[execname] = llquantize(arg0, 100, 0, 10, 100);} tick-5s {exit(0);}'

Assertion failed: (value < next), file /mnt/ssd/markj/src/freebsd-dev/dev/graphics/cddl/contrib/opensolaris/lib/libdtrace/common/dt_aggregate.c, line 248.
Abort trap
Comment 2 Mark Johnston freebsd_committer 2017-08-21 21:04:53 UTC
It seems the real issue here is an overflow when computing 100^10? 9 is the largest power of 100 that fits in a 64-bit int.

There's a check that's supposed to catch this. Indeed, it does if we change the upper power from 10 to 11:

git (dev/vm-init) markj@wkstn-mjohnston> sudo dtrace -n 'syscall:::return {@[execname] = llquantize(arg0, 100, 0, 11, 100);} tick-1s {exit(0);}'
dtrace: invalid probe specifier syscall:::return {@[execname] = llquantize(arg0, 100, 0, 11, 100);} tick-1s {exit(0);}: llquantize( ) factor (100) raised to power of high magnitude (11) overflows 64-bits
Comment 3 Mark Johnston freebsd_committer 2017-08-21 21:26:23 UTC
Right, so there's an off-by-one in that check. However, fixing that isn't sufficient: we also hit an overflow with a maximum power of 9. That's because the upper bound ends up being base^{max+1}. For instance, with a 4-tuple (10, 0, 3, 10) we show buckets up to 10000, i.e., 10^4. So it's actually an off-by-two.
Comment 4 commit-hook freebsd_committer 2017-08-21 21:56:17 UTC
A commit references this bug:

Author: markj
Date: Mon Aug 21 21:56:03 UTC 2017
New revision: 322773
URL: https://svnweb.freebsd.org/changeset/base/322773

  Fix an off-by-two in the llquantize() action parameter validation.

  The aggregation created by llquantize() partitions values into buckets; the
  lower bound of the bucket containing the largest values is b^{m+1}, where
  b and m are the second and fourth parameters to the action, respectively.
  Bucket bounds are stored in a 64-bit integer, and so the llquantize()
  validation checks need to verify that b^{m+1} fits in 64 bits. However, it
  was only verifying that b^{m-1} fits in 64 bits, so certain parameter
  combinations could trigger assertion failures in libdtrace.

  PR:		219451
  MFC after:	1 week

Comment 5 Mark Johnston freebsd_committer 2017-08-30 02:10:42 UTC
MFCed to stable/11 in r323012.