Bug 224135

Summary: [net-mgmt/netdata] Spurious errors regarding PID 0 logged every second
Product: Ports & Packages Reporter: Nathanael Hoyle <nhoyle>
Component: Individual Port(s) Assignee: Mahdi Mokhtari <mmokhi>
Status: Closed FIXED    
Severity: Affects Many People Flags: mmokhi: maintainer-feedback+
Priority: ---    
Version: Latest   
Hardware: Any   
OS: Any   

Description Nathanael Hoyle 2017-12-05 22:51:09 UTC
When the apps.plugin plugin is enabled (as is common), netdata logs an error on every poll (once per second by default) to /var/log/netdata/error.log regarding PID 0 [kernel], for example:

2017-12-05 17:13:11: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:12: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:13: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:14: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:15: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:16: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:17: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:18: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:19: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:20: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:21: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:22: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:23: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:24: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:25: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:26: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:27: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.
2017-12-05 17:13:28: apps.plugin ERROR: Invalid pid 0 read (expected 1 to 99999). Ignoring process.

This appears to be based on a Linux-centric assumption that 1 will be the lowest observed PID. Because of the logging rate, this quickly triggers the following as well:

2017-12-05 17:13:29: apps.plugin Too many logs (101 logs in 100 seconds, threshold is set to 100 logs in 3600 seconds). Preventing more logs from process 'apps.plugin' for 3500 seconds.

This suppression may cause more valuable log messages to be dropped. I have traced the flawed check to the "collect_data_for_pid" function in apps_plugin.c, which contains the following code:

void collect_data_for_pid(pid_t pid) {
    if(unlikely(pid <= 0 || pid > pid_max)) {
        error("Invalid pid %d read (expected 1 to %d). Ignoring process.", pid, pid_max);
        return;
    }
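
For illustration only, the range check quoted above could be relaxed along these lines. This is a minimal sketch, not the upstream patch: the names MIN_VALID_PID and pid_in_range are hypothetical, and it assumes that accepting PID 0 only on FreeBSD is the desired behaviour.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical constant, not taken from netdata: the lowest PID that may
 * legitimately show up in the process list. On FreeBSD, PID 0 is the
 * [kernel] process; on other systems this sketch keeps 1 as the floor. */
#ifdef __FreeBSD__
#define MIN_VALID_PID 0
#else
#define MIN_VALID_PID 1
#endif

/* Hypothetical helper: true when pid lies in [MIN_VALID_PID, pid_max]. */
static bool pid_in_range(pid_t pid, pid_t pid_max) {
    return pid >= MIN_VALID_PID && pid <= pid_max;
}
```

With a helper like this, collect_data_for_pid would log and return only when pid_in_range(pid, pid_max) is false, so PID 0 would no longer be reported once per second on FreeBSD.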

I have reported this issue to the upstream GitHub project at https://github.com/firehol/netdata/issues/3099, but a FreeBSD port fix would be welcome until this is addressed upstream.
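
The numbers in the "Too many logs" message quoted earlier (101 logs in 100 seconds, threshold of 100 logs in 3600 seconds, suppression for 3500 seconds) are consistent with a simple fixed-window limiter. The sketch below is an assumption about how such a limiter could work, not netdata's actual implementation; struct log_limiter, log_allowed and log_suppressed_for are hypothetical names.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical fixed-window log limiter (not netdata's real code). */
struct log_limiter {
    time_t window_start;     /* when the current window began */
    time_t window_len;       /* window length in seconds, e.g. 3600 */
    unsigned long max_logs;  /* messages allowed per window, e.g. 100 */
    unsigned long count;     /* messages seen in the current window */
};

/* Counts one message at time `now`; returns true if it may be written. */
static bool log_allowed(struct log_limiter *l, time_t now) {
    if (now - l->window_start >= l->window_len) {
        /* The previous window has expired: start a fresh one. */
        l->window_start = now;
        l->count = 0;
    }
    l->count++;
    return l->count <= l->max_logs;
}

/* Seconds left until the current window ends and logging resumes. */
static time_t log_suppressed_for(const struct log_limiter *l, time_t now) {
    return l->window_start + l->window_len - now;
}
```

Under these assumptions, one error per second exhausts a 100-message budget after 100 seconds, leaving 3600 - 100 = 3500 seconds during which every other message from apps.plugin is dropped as well.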
Comment 1 Nathanael Hoyle 2018-01-13 01:37:39 UTC
New upstream fix: https://github.com/firehol/netdata/pull/3276
Comment 2 commit-hook freebsd_committer freebsd_triage 2018-01-18 12:25:35 UTC
A commit references this bug:

Author: mmokhi
Date: Thu Jan 18 12:25:11 UTC 2018
New revision: 459323
URL: https://svnweb.freebsd.org/changeset/ports/459323

Log:
  net-mgmt/netdata: Fix wrong PID assumption on FreeBSD.
  The issue merged on upstream but not in release-tree yet
  Also add two other dependencies for run-time.

  PR:		224135
  Reported by:	nhoyle@hoyletech.com
  Sponsored by:	Netzkommune GmbH

Changes:
  head/net-mgmt/netdata/Makefile
  head/net-mgmt/netdata/files/patch-fixes-issue-3276-upstream
Comment 3 Mahdi Mokhtari freebsd_committer freebsd_triage 2018-01-18 12:27:39 UTC
Committed.
Thanks everyone :)
Comment 4 Nathanael Hoyle 2018-01-18 14:12:59 UTC
(In reply to Mahdi Mokhtari from comment #3)

Built and installed the patched version this morning. Initial testing confirms that the originally reported issue is indeed resolved, and I see no immediately obvious regressions or new issues. Thanks for making the upstream fix available so quickly.
Comment 5 Mahdi Mokhtari freebsd_committer freebsd_triage 2018-01-18 14:17:17 UTC
Thanks for confirming :)
And sorry if the commit for this was not as fast as it could have been (I had some busy weeks behind me :D)