Bug 224135 - [net-mgmt/netdata] Spurious errors regarding PID 0 logged every second
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Many People
Assignee: Mahdi Mokhtari
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-12-05 22:51 UTC by nhoyle
Modified: 2018-01-18 14:17 UTC
0 users

See Also:
mmokhi: maintainer-feedback+


Description nhoyle 2017-12-05 22:51:09 UTC
When apps.plugin is enabled (as is common), netdata logs an error about PID 0 [kernel] to /var/log/netdata/error.log on every poll (once per second by default), as shown here:

2017-12-05 17:13:11: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:12: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:13: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:14: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:15: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:16: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:17: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:18: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:19: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:20: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:21: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:22: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:23: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:24: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:25: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:26: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:27: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:28: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.

This appears to stem from a Linux-centric assumption that 1 is the lowest PID that will ever be observed, whereas FreeBSD reports the kernel as PID 0. At one error per second, this also quickly triggers log rate limiting:

2017-12-05 17:13:29: apps.plugin Too many logs (101 logs in 100 seconds, threshold is set to 100 logs in 3600 seconds). Preventing more logs from process 'apps.plugin' for 3500 seconds.

As a result, more valuable log messages may be suppressed. I have traced the flawed check to the "collect_data_for_pid" function in apps_plugin.c, which begins with the following code:

void collect_data_for_pid(pid_t pid) {
    if(unlikely(pid <= 0 || pid > pid_max)) {
        error("Invalid pid %d read (expected 1 to %d). Ignoring process.", pid, pid_max);
        return;
    }
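
One possible shape for a fix, as a minimal sketch only and not the upstream patch: make the lower bound of the range check platform-dependent so that PID 0 is accepted on FreeBSD. The names MIN_VALID_PID, pid_in_range and pid_is_collectable are hypothetical and do not exist in netdata; pid_max is passed in as a parameter here rather than read from the plugin's own state.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical constant, not from netdata: the lowest PID the collector
 * should accept. FreeBSD reports the kernel as PID 0, so 0 is valid there. */
#if defined(__FreeBSD__)
#define MIN_VALID_PID 0
#else
#define MIN_VALID_PID 1
#endif

/* Returns true when pid lies in the inclusive range [min_pid, pid_max]. */
bool pid_in_range(pid_t pid, pid_t min_pid, pid_t pid_max) {
    return pid >= min_pid && pid <= pid_max;
}

/* Platform-aware wrapper: uses the compile-time lower bound above. */
bool pid_is_collectable(pid_t pid, pid_t pid_max) {
    return pid_in_range(pid, MIN_VALID_PID, pid_max);
}
```

With a check like this, collect_data_for_pid would log the "Invalid pid" error only when pid_is_collectable returns false, and the message could print MIN_VALID_PID instead of a hard-coded 1.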

I have reported this issue to the upstream GitHub project at https://github.com/firehol/netdata/issues/3099, but a fix in the FreeBSD port would be welcome until it is addressed upstream.
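
For reference, the arithmetic in the "Too many logs" message quoted earlier can be reproduced with a small fixed-window limiter. This is a sketch under assumptions, not netdata's actual logging code: struct log_throttle and log_throttle_allow are hypothetical names, and the fixed-window behaviour is inferred only from the numbers in that message (a threshold of 100 logs per 3600 seconds, the 101st log arriving at 100 seconds, and suppression for the remaining 3500 seconds).

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical flood-protection state; all names are illustrative. */
struct log_throttle {
    time_t window_start;  /* start of the current counting window */
    time_t window_len;    /* window length in seconds (3600 in the message) */
    unsigned long limit;  /* logs allowed per window (100 in the message) */
    unsigned long count;  /* logs seen so far in the current window */
};

/* Returns true if a message arriving at time `now` may be written.
 * On the first message over the limit, *suppress_for is set to the number
 * of seconds left in the window; in every other case it is set to 0. */
bool log_throttle_allow(struct log_throttle *t, time_t now, time_t *suppress_for) {
    *suppress_for = 0;

    /* Start a fresh window once the current one has expired. */
    if (now - t->window_start >= t->window_len) {
        t->window_start = now;
        t->count = 0;
    }

    t->count++;
    if (t->count <= t->limit)
        return true;

    /* Report the remaining suppression time once, on the first excess log. */
    if (t->count == t->limit + 1)
        *suppress_for = t->window_len - (now - t->window_start);
    return false;
}
```

With one error per second, the limit of 100 is exhausted after about 100 seconds, so everything else apps.plugin would log during the remaining 3500 seconds of the window is dropped.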
Comment 1 nhoyle 2018-01-13 01:37:39 UTC
New upstream fix: https://github.com/firehol/netdata/pull/3276
Comment 2 commit-hook freebsd_committer 2018-01-18 12:25:35 UTC
A commit references this bug:

Author: mmokhi
Date: Thu Jan 18 12:25:11 UTC 2018
New revision: 459323
URL: https://svnweb.freebsd.org/changeset/ports/459323

Log:
  net-mgmt/netdata: Fix wrong PID assumption on FreeBSD.
  The issue merged on upstream but not in release-tree yet
  Also add two other dependencies for run-time.

  PR:		224135
  Reported by:	nhoyle@hoyletech.com
  Sponsored by:	Netzkommune GmbH

Changes:
  head/net-mgmt/netdata/Makefile
  head/net-mgmt/netdata/files/patch-fixes-issue-3276-upstream
Comment 3 Mahdi Mokhtari freebsd_committer freebsd_triage 2018-01-18 12:27:39 UTC
Committed.
Thanks everyone :)
Comment 4 nhoyle 2018-01-18 14:12:59 UTC
(In reply to Mahdi Mokhtari from comment #3)

Built and installed the patched version this morning. Initial testing confirms that the originally reported issue is indeed resolved, and I see no immediately obvious regressions or new issues. Thanks for making the upstream fix available so quickly.
Comment 5 Mahdi Mokhtari freebsd_committer freebsd_triage 2018-01-18 14:17:17 UTC
Thanks for confirming :)
And sorry if the commit on this was not as fast as it could have been (I have had some busy weeks behind me :D)