Bug 221518 - net-mgmt/smokeping: smokeping process changes its name
Summary: net-mgmt/smokeping: smokeping process changes its name
Status: Closed Unable to Reproduce
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Rodrigo Osorio
Reported: 2017-08-14 10:47 UTC by Kajetan Staszkiewicz
Modified: 2018-12-28 14:07 UTC

See Also:
bugzilla: maintainer-feedback? (rodrigo)


Attachments
rc script using daemon(8) as a wrapper (2.04 KB, application/x-shellscript)
2017-12-20 14:50 UTC, Kajetan Staszkiewicz
Description Kajetan Staszkiewicz 2017-08-14 10:47:54 UTC
As smokeping runs as a daemon, at some point it loses its process name. This means that check_pidfile won't be able to find the PID of the main process (even the 2nd call in smokeping_check_pidfile, the one with $command_interpreter). As a result, smokeping is considered dead by `service smokeping status`, and Puppet then attempts to start a new instance. The fix from bug 221009 is not enough in such cases.

I found out what happens by running `while true; do date; ps up $(pgrep -u smokeping); sleep 1; done` for a few days. Below you can see the transition happening. The PID is the same and the process is not restarted; it just loses its command line.


Fri Aug  4 12:44:45 UTC 2017
USER        PID %CPU %MEM    VSZ   RSS TT  STAT STARTED    TIME COMMAND
smokeping 17248  0.0  0.0  12364  1544  -  S    12:44PM 0:00.00 /usr/bin/fping -C 9 -q -B1 -r1 -i10 -p6000 A.A.A.A B.B.B.B C.C.C.C
smokeping 45265  0.0  0.2 210384 14128  -  Is    2:20PM 0:00.01 /usr/local/bin/perl /usr/local/bin/smokeping --master-url=http://XXX:8080/ --cache-dir=/usr/local/var/smokeping/ --shared-secr
smokeping 45266  0.0  0.3 210384 16620  -  S     2:20PM 0:07.75 /usr/local/bin/smokeping [FPing] (perl)

Fri Aug  4 12:44:46 UTC 2017
USER        PID %CPU %MEM    VSZ   RSS TT  STAT STARTED    TIME COMMAND
smokeping 17248  0.0  0.0  12364  1544  -  S    12:44PM 0:00.00 /usr/bin/fping -C 9 -q -B1 -r1 -i10 -p6000 A.A.A.A B.B.B.B C.C.C.C
smokeping 45265  0.0  0.2 210384 14128  -  Is    2:20PM 0:00.01 /usr/local/bin/perl /usr/local/bin/smokeping --master-url=http://XXX:8080/ --cache-dir=/usr/local/var/smokeping/ --shared-secr
smokeping 45266  0.0  0.3 210384 16620  -  S     2:20PM 0:07.75 /usr/local/bin/smokeping [FPing] (perl)

Fri Aug  4 12:44:47 UTC 2017
USER        PID %CPU %MEM    VSZ  RSS TT  STAT STARTED    TIME COMMAND
smokeping 17248  0.0  0.0  12364 1320  -  S    12:44PM 0:00.00 /usr/bin/fping -C 9 -q -B1 -r1 -i10 -p6000 A.A.A.A B.B.B.B C.C.C.C
smokeping 45265  0.0  0.0 210384    0  -  IWs  -       0:00.00 (perl)
smokeping 45266  0.0  0.1 210384 9284  -  S     2:20PM 0:07.75 /usr/local/bin/smokeping [FPing] (perl)

Fri Aug  4 12:44:48 UTC 2017
USER        PID %CPU %MEM    VSZ  RSS TT  STAT STARTED    TIME COMMAND
smokeping 17248  0.0  0.0  12364 1320  -  S    12:44PM 0:00.00 /usr/bin/fping -C 9 -q -B1 -r1 -i10 -p6000 A.A.A.A B.B.B.B C.C.C.C
smokeping 45265  0.0  0.0 210384    0  -  IWs  -       0:00.00 (perl)
smokeping 45266  0.0  0.1 210384 9284  -  S     2:20PM 0:07.75 /usr/local/bin/smokeping [FPing] (perl)
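
A minimal sketch of why a command-line comparison stops working once the COMMAND column collapses to `(perl)`, as in the last two samples above. The `cmdline_matches` helper is hypothetical and much simpler than the real matching in rc.subr; it only assumes that the check compares the ps COMMAND column against the interpreter and script path.

```shell
#!/bin/sh
# Hypothetical, simplified matcher. This is NOT the code from rc.subr.
# It accepts a process only if its ps COMMAND column is, or starts with,
# "<interpreter> <script>".
cmdline_matches() {
    _interp=$1
    _script=$2
    _argv=$3
    case $_argv in
        "$_interp $_script"|"$_interp $_script "*) return 0 ;;
        *) return 1 ;;
    esac
}

# COMMAND column while the process is resident (shortened from the ps output).
cmdline_matches /usr/local/bin/perl /usr/local/bin/smokeping \
    "/usr/local/bin/perl /usr/local/bin/smokeping --master-url=http://XXX:8080/" \
    && echo "resident: match"

# COMMAND column after the process was swapped out.
cmdline_matches /usr/local/bin/perl /usr/local/bin/smokeping "(perl)" \
    || echo "swapped out: no match"
```

With only `(perl)` left to compare, a check of this kind has nothing to match the script path against, even though the PID is unchanged.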

Is losing the command-line arguments an intrinsic property of swapped-out processes? Being swapped out is the thing that changed for this process at some point. Also, it seems that when I managed to swap the process back in, it recovered its process name:

[root@XXXX ~]% ps up 45265                   
USER        PID %CPU %MEM    VSZ RSS TT  STAT STARTED    TIME COMMAND
smokeping 45265  0.0  0.0 210384   0  -  IWs  -       0:00.00 (perl)

[root@XXXX ~]% kill -HUP 45265

[root@XXXX ~]% ps up 45265
USER        PID %CPU %MEM    VSZ  RSS TT  STAT STARTED    TIME COMMAND
smokeping 45265  0.0  0.1 210384 8160  -  DLs   3Aug17 0:00.02 /usr/local/bin/perl /usr/local/bin/smokeping --master-url=http://XXX:8080/ --cache-dir=/usr/local/var/smokeping/ --shared-secre

That would mean that we must trust the PID in /usr/local/var/smokeping/pid and not perform any additional checks on the command line or interpreter. Another option would be to stop using smokeping's built-in daemonizing, run it under daemon(8) instead, and use daemon(8)'s PID for management.
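
The first option could look roughly like the sketch below, which treats the service as running whenever the PID stored in the pidfile belongs to a live process. `pid_alive_from_file` is a hypothetical helper name, not something provided by rc.subr, and the sketch assumes it is run as root or as the smokeping user so that `kill -0` is permitted.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the pidfile holds the PID of a live process.
# No comparison against the command line or interpreter is made.
pid_alive_from_file() {
    _pidfile=$1
    [ -r "$_pidfile" ] || return 1
    # Accept a pidfile with or without a trailing newline.
    read -r _pid < "$_pidfile" || [ -n "$_pid" ] || return 1
    # Reject anything that is not a plain decimal number.
    case $_pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Signal 0 only tests for existence; nothing is delivered to the process.
    kill -0 "$_pid" 2>/dev/null
}

# Usage, with the pidfile path mentioned above:
if pid_alive_from_file /usr/local/var/smokeping/pid; then
    echo "smokeping is running"
else
    echo "smokeping is not running"
fi
```

The trade-off is that a stale pidfile whose PID has been reused by an unrelated process would be reported as running, which is the case the command-line check is meant to catch.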
Comment 1 Rodrigo Osorio freebsd_committer freebsd_triage 2017-10-19 22:16:31 UTC
Hi,

Maybe I'm wrong, but 45266 is the smokeping daemon, right?
And I don't see any changes over time. If you are still
running into this issue, run a ps command that shows the ppid
to check that we are not tracking a fork.
In the meantime, I have a running smokeping and I perform regular checks
on ps and the pid.

Cheers
Comment 2 Kajetan Staszkiewicz 2017-10-23 14:33:40 UTC
I'm tracking the master process. Since it is not a bug in smokeping itself, I opened another ticket, bug 222147, for rc.subr, because this situation can happen to any daemon that gets swapped out.

I don't understand what you mean by "And I don't see any changes over the time." Look at each ps output: the main process changes its name at one point.
Comment 3 Kajetan Staszkiewicz 2017-12-20 14:50:36 UTC
Created attachment 188995
rc script using daemon(8) as a wrapper
Comment 4 Kajetan Staszkiewicz 2017-12-20 14:55:44 UTC
I'm attaching a modified rc script which, instead of using Smokeping's native daemonization, uses daemon(8). daemon(8) itself runs with a shorter command line and thus should not be prone to the issue described in PR #222147.

The script is not well tested (only in slave mode), and I know that the reload part does not really work. If you believe this is the way to go, I can develop it further.
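
For readers without access to the attachment, the general shape of such a wrapper might look like the fragment below. This is not the attached script, only a sketch of the approach: the pidfile path and the `smokeping_daemon_user` knob are assumptions, and the slave-mode options (`--master-url` and friends) and any reload handling are left out.

```shell
#!/bin/sh
#
# PROVIDE: smokeping
# REQUIRE: LOGIN
# KEYWORD: shutdown

. /etc/rc.subr

name="smokeping"
rcvar="smokeping_enable"

load_rc_config "$name"
: ${smokeping_enable:="NO"}
: ${smokeping_daemon_user:="smokeping"}            # hypothetical knob
: ${smokeping_pidfile:="/var/run/smokeping.pid"}   # assumed path

# rc.subr tracks the daemon(8) supervisor, not the perl process.
pidfile="${smokeping_pidfile}"
command="/usr/sbin/daemon"
# -f: redirect stdio to /dev/null; -P: write the supervisor's PID;
# -u: run the child as this user; --nodaemon: keep smokeping in the foreground.
command_args="-f -P ${pidfile} -u ${smokeping_daemon_user} /usr/local/bin/smokeping --nodaemon"

run_rc_command "$1"
```

The idea is that `service smokeping status` then checks the short-lived command line of the supervisor rather than that of the perl interpreter.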