Bug 254252 - net-mgmt/nagios-plugins check_procs wrong CPU matching [idle] thread with --metric=CPU
Status: New
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Some People
Assignee: Jochen Neumeister
Reported: 2021-03-13 08:22 UTC by Volodymyr Pushkar
Modified: 2024-03-04 17:24 UTC

Attachments
patch-plugins_check__procs.c (482 bytes, patch)
2021-03-13 08:22 UTC, Volodymyr Pushkar
vladimir.pushkar: maintainer-approval? (mat)

Description Volodymyr Pushkar 2021-03-13 08:22:11 UTC
Created attachment 223224
patch-plugins_check__procs.c

When checking for processes that use a lot of CPU, check_procs matches the [idle] thread and reports the wrong state. For example:

>/usr/local/libexec/nagios/check_procs -v -w 70 -c 90 --metric=CPU
CPU CRITICAL: 1 crit, 0 warn out of 75 processes [idle] | procs=75;;;0; procs_warn=0;;;0; procs_crit=1;;;0; procpcpu=399.899994;

The proposed patch makes check_procs skip the [idle] thread.
Comment 1 Mathieu Arnold freebsd_committer freebsd_triage 2021-06-07 07:03:20 UTC
Sorry for taking this long to get to this patch.

I am sorry but I don't understand what the patch actually does.
Comment 2 Mike Walker 2021-11-29 14:10:06 UTC
The issue is that without the included patch, check_procs will emit WARNING and CRITICAL for the system "idle" process.

For example on one of my servers, "top" lists the idle process as using 354% of the CPU:


    # top -SCb | egrep '(COMMAND|idle)$'
    CPU:  3.9% user,  0.2% nice,  1.9% system,  0.3% interrupt, 93.7% idle
    PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME     CPU COMMAND
     11 root          4 155 ki31     0B    64K CPU0     0 1797.0 354.88% idle


And because of this, the "idle" system process will be flagged as having too much CPU time by "check_procs" if "--metric=CPU" is passed, like this:


    # /usr/local/libexec/nagios/check_procs -v -w 100 -c 105 --metric=CPU
    CPU CRITICAL: 1 crit, 0 warn out of 110 processes [idle] | procs=110;;;0; procs_warn=0;;;0; procs_crit=1;;;0; procpcpu=410.899994;
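
To illustrate the behaviour described in this report, here is a minimal sketch in C of the kind of filter being proposed. It is not the content of patch-plugins_check__procs.c, which is not reproduced here. The struct `proc_sample` and the functions `is_idle_thread` and `count_over_cpu` are hypothetical names, and both the match on the command name "idle" and the strict `>` comparison are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record for one process row; not the plugin's real data type. */
struct proc_sample {
    const char *name;  /* command name as reported by ps/top, e.g. "idle" */
    double pcpu;       /* CPU usage in percent, summed over the process's threads */
};

/* Assumption: the kernel idle thread shows up with the command name "idle". */
bool is_idle_thread(const struct proc_sample *p)
{
    return strcmp(p->name, "idle") == 0;
}

/*
 * Count the samples whose CPU usage exceeds `threshold`.
 * When `skip_idle` is true, the idle thread is ignored, so the "CPU time"
 * it accumulates while the machine does nothing cannot trigger an alert.
 */
size_t count_over_cpu(const struct proc_sample *procs, size_t n,
                      double threshold, bool skip_idle)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++) {
        if (skip_idle && is_idle_thread(&procs[i]))
            continue;
        if (procs[i].pcpu > threshold)
            hits++;
    }
    return hits;
}
```

With a sample list containing only the idle entry from the "top" output above (354.88%) and a threshold of 105, `count_over_cpu` reports one offender when `skip_idle` is false and none when it is true.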