Summary: | vmstat -z is truncated when stdout is not terminal | ||
---|---|---|---|
Product: | Base System | Reporter: | Jan Beich <jbeich> |
Component: | bin | Assignee: | Phil Shafer <phil> |
Status: | Closed FIXED | ||
Severity: | Affects Some People | CC: | allanjude, fk, markj, ngie, phil, rodrigc |
Priority: | --- | Keywords: | regression |
Version: | CURRENT | ||
Hardware: | Any | ||
OS: | Any | ||
Attachments: |
Description
Jan Beich
2016-01-11 10:01:53 UTC
Only happens with MALLOC_CONF=junk:false, see malloc(3) and MALLOC_PRODUCTION. I've noticed this too, but only on some of my systems. The output ends in the same place, i.e., after we start printing the 32KB malloc bucket stats. It does not occur if vmstat's stdout is a terminal. I haven't dug into this very deeply yet, but I noticed that removing the field width for the zone name makes the problem go away: 1528 xo_open_instance("zone"); 1529 xo_emit("{d:name/%-20s}{ke:name/%s} {:size/%6" PRIu64 "}, " 1530 "{:limit/%6" PRIu64 "},{:used/%8" PRIu64 "}," 1531 "{:free/%8" PRIu64 "},{:requests/%8" PRIu64 "}," 1532 "{:fail/%4" PRIu64 "},{:sleep/%4" PRIu64 "}\n", name, So this seems like a libxo bug. Created attachment 170546 [details]
vmstat: Prevent output truncation when piping zone statistics
The attached patch works around the issue by frequently calling xo_flush().
Adding phil@ for feedback on unexpected behavior with libxo not flushing quickly when the buffer is full. I'll take a look tonight. Thanks, Phil No luck reproducing this, even with MALLOC_CONF=junk:false. I'll dig more later tonight. What grn are you seeing this at? Thanks, Phil After reverting the patch again, I still see the problem with a system based on 11-CURRENT r300376. You could try running: while vmstat -z | wc -c; do sleep 60; done and wait until it prints 8192 which indicates a truncation. Usually the output stays the same for a while, so if you see 8192 once, there's a fair chance that you'll be able to reproduce the problem with a debugger attached. Yesterday I saw the truncation issue for more than 80 minutes until it went away again. Could you please send me the working and failing output? Thanks, Phil Created attachment 170683 [details]
File created with "script vmstat-working vmstat -z"
Created attachment 170684 [details]
File created with "script vmstat-not-working sh -c 'vmstat -z | cat'"
Was finally able to reproduce this and have a fix. I was using the wrong value when checking for available buffer space: diff --git a/libxo/libxo.c b/libxo/libxo.c index 9df67de..7a714b1 100644 --- a/libxo/libxo.c +++ b/libxo/libxo.c @@ -2816,7 +2816,7 @@ xo_format_string (xo_handle_t *xop, xo_buffer_t *xbp, xo_xff_flags_t flags, * but if we did the work ourselves, then we need to do it. */ int delta = xfp->xf_width[XF_WIDTH_MIN] - cols; - if (!xo_buf_has_room(xbp, delta)) + if (!xo_buf_has_room(xbp, xfp->xf_width[XF_WIDTH_MIN])) goto bail; /* In addition, I'm adding a high-water mark that will trigger a flush: @@ -6128,6 +6128,12 @@ xo_do_emit_fields (xo_handle_t *xop, xo_field_info_t *fields, XOIF_CLEAR(xop, XOIF_REORDER); + /* + * If we've got enough data, flush it. + */ + if (xo_buf_offset(&xop->xo_data) > XO_BUF_HIGH_WATER) + flush = 1; + /* If we don't have an anchor, write the text out */ if (flush && !XOIF_ISSET(xop, XOIF_ANCHOR)) { if (xo_write(xop) < 0) I'll roll a new version of libxo tomorrow. Apologies for taking so long with this, but it was mostly just an issue of reproducing it. I wasn't seeing enough churn in my "vmstat -z" value to trigger the issue, so I munged one of the regression tests to help. Thanks, Phil Fixed in r300925. Thanks, Phil |