Bug 204741 - [patch] [feature-request] syslogd(8) should be able to protect itself from OOM killer
Summary: [patch] [feature-request] syslogd(8) should be able to protect itself from OO...
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: bin (show other bugs)
Version: 10.2-STABLE
Hardware: Any Any
: --- Affects Some People
Assignee: Marcelo Araujo
URL:
Keywords: patch
Depends on:
Blocks:
 
Reported: 2015-11-22 13:49 UTC by Eugene Grosbein
Modified: 2016-02-24 03:35 UTC (History)
2 users (show)

See Also:


Attachments
syslogd-protect.diff (2.30 KB, patch)
2015-11-22 13:49 UTC, Eugene Grosbein
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Eugene Grosbein 2015-11-22 13:49:54 UTC
Created attachment 163419 [details]
syslogd-protect.diff

If the system exhausts available memory and swap, it starts to kill processes and report killed processes with syslog facility. syslogd(8) should not be killed or else system administrator has not complete information about the case. Practice shows that syslogd may be killed like any other process.

Let's teach syslogd to protect itself from OOM killer. The following patch introduces new command line option "-O" and system administrator is allowed to have a line in its /etc/rc.conf:

syslogd_flags="-O"

In this case, syslogd became protected.
Comment 1 Marcelo Araujo freebsd_committer freebsd_triage 2016-01-04 04:12:09 UTC
I would like to take a look on it.
Comment 2 Allan Jude freebsd_committer freebsd_triage 2016-01-04 05:24:52 UTC
I brought this up on IRC a few weeks ago. I think it might make sense for it to be protected by default, with an optional switch to turn it off? Or maybe rc.subr should grow the ability to just have xxx_protect="YES" for any service?
Comment 3 Marcelo Araujo freebsd_committer freebsd_triage 2016-01-04 05:36:26 UTC
(In reply to Allan Jude from comment #2)

I agree that it might be protected by default and I don't believe it is necessary to have a switch to ON/OFF it. IMHO, no one would like to see syslogd get killed by OOM.

I will come up soon, with something based on this patch.

Any new input, feel free to let me know.

Best,
Comment 4 Eugene Grosbein 2016-01-04 06:32:46 UTC
(In reply to Allan Jude from comment #2)

IMHO, it is nearly impossible to correctly implement such protection for arbitrary service by means of rc.subr, without modifications of service source code.

We do have protect(1) utility but it only can protect single process it starts or all current/future children of started process and for daemons both cases are wrong:

- protection of single process is meaningless because it forks to become daemon and that ceases protection;
- protection of all future children is dangerous for daemons capable of running arbitrary subprocesses like syslogd can with "vertical bar" syntax of syslog.conf(5) as those future children may be "runaway" memory hogs.
Comment 5 Eugene Grosbein 2016-01-04 06:43:46 UTC
(In reply to Marcelo Araujo from comment #3)

> I don't believe it is necessary to have a switch to ON/OFF it. IMHO, no one would like to see syslogd get killed by OOM.

I can imagine some kind of embedded routing system with very limited RAM. And its administrator may prefer OOM killer to kill all of sshd/telnetd/inetd/syslogd processes but not routing daemon in case of memory problems. I mean, it may prefer unprotected syslogd in preference of other needs.
Comment 6 Marcelo Araujo freebsd_committer freebsd_triage 2016-01-22 02:46:44 UTC
I have re-worked the patch and make the protection enabled by default with an option to disable it.

It is in review by other developers.

Best.
Comment 7 Marcelo Araujo freebsd_committer freebsd_triage 2016-02-03 17:47:08 UTC
Please, refer to a new patch: https://reviews.freebsd.org/D5176
Comment 8 Eugene Grosbein 2016-02-04 08:30:46 UTC
(In reply to Marcelo Araujo from comment #7)

I've just checked the patcj https://reviews.freebsd.org/file/data/gy33khn4sdehavsph7oj/PHID-FILE-baw4djr3pgfqkftp7y5a/D5176.diff

It does NOT solve the problem. For syslogd_oomprotect="YES" it just starts it with "protect syslogd" and, as I've already noted, syslogd forks to become a deamon and this ceases protection. That can be easily verified with commands:

# killall syslogd
# protect syslogd
# ps -axwwo pid,flags,flags2,command | grep syslogd
10614 10000000 00000000 /usr/sbin/syslogd

Second column shows "flags" attributes for a process and rightmost sixth digit is zero that means that P_PROTECTED flag is not set and syslogd is not protected, so the problem is not solved.

syslogd_oomprotect="ALL" starts it with "protect -i syslogd":

# killall syslogd
# protect syslogd
# ps -axwwo pid,flags,flags2,command | grep syslogd
10635 10100000 00000001 /usr/sbin/syslogd -ssvv

This time we see P_PROTECTED is really set but next column shows "flags2" attributes with a flag P2_INHERIT_PROTECTED being set also. That is wrong too, as syslogd may start arbitrary subprocesses that should not be protected.

With my original patch:

# killall syslogd
# syslogd -O
# ps -axwwo pid,flags,flags2,command | grep syslogd
10678 10100000 00000000 /usr/sbin/syslogd -O

Now the problem is solved correctly: P_PROTECTED is set and P2_INHERIT_PROTECTED is not. That's because my patch protects syslogd after it forked. Currently that cannot be accomplished by means of rc.subr only, without daemon source code modifications.
Comment 9 Eugene Grosbein 2016-02-04 08:48:53 UTC
(In reply to Marcelo Araujo from comment #7)

By the way, we already have several stock daemons that protect themselves:

# ps -axwwo pid,flags,flags2,command | awk '$2 !~ /0.....$/'
  PID        F       F2 COMMAND
 1459 10100000 00000000 /usr/sbin/watchdogd
 1721 10100100 00000000 /usr/sbin/sshd
 1755 10100000 00000000 /usr/sbin/cron -s
 1813 10100000 00000000 /usr/sbin/inetd -wWl -R 200000
10678 10100000 00000000 /usr/sbin/syslogd -O
Comment 10 Allan Jude freebsd_committer freebsd_triage 2016-02-04 15:36:59 UTC
I would suggest we commit both. Give syslog the in-daemon protection, and offer the rc.subr based protection for other daemons.
Comment 11 Allan Jude freebsd_committer freebsd_triage 2016-02-04 15:38:02 UTC
For 11, we might also consider making a new flag for the inheritance, that only allows the protection to be inherited 'once'. So syslogd forks the daemon, which inherits the protection, but not the flag to make its children inherit the proteciton
Comment 12 Eugene Grosbein 2016-02-04 16:11:39 UTC
(In reply to Allan Jude from comment #11)

Some processes fork twice to become new session leaders. Some daemons may fork once and their children should be protected but not grandchildren that supposed to exec arbitrariy commands.

Decrementing fork counter and "protect -n $count /path/to/daemon" would cover all such cases better.
Comment 13 Marcelo Araujo freebsd_committer freebsd_triage 2016-02-04 16:13:34 UTC
How would be possible decrements an arbitrary fork counter?
Comment 14 Marcelo Araujo freebsd_committer freebsd_triage 2016-02-04 16:16:35 UTC
(In reply to Allan Jude from comment #10)
Yes, it can be one solution, but protect(1) maybe has an issue. I'm still investigating it. But also there is another way to set the protect after launch the processes but seemed not clean to me to have it in that way on rc.subr, but it will work.
Comment 15 Allan Jude freebsd_committer freebsd_triage 2016-02-04 16:20:15 UTC
(In reply to Marcelo Araujo from comment #14)
It might be possible to get the pid of the service once it has actually started, and 'protect -p' it, but that will likely take a bit more work.
Comment 16 Marcelo Araujo freebsd_committer freebsd_triage 2016-02-04 16:23:18 UTC
(In reply to Allan Jude from comment #15)
Exactly! And that works fine. At first time a did make a code using that approach, but after reading the man page, I realize that protect(1) should support it just using protect <processes>. I'm trying to figure out if there is any other way to do it a bit more clean, or changing protect(1) or in the latest case, using this approach launch and get the PID.
Comment 17 Eugene Grosbein 2016-02-04 16:30:25 UTC
(In reply to Allan Jude from comment #15)

rc.subr has check_pidfile function that can get PID to pass to protect(1) provided correctly set $daemonname_pidfile variable. Still looks a bit ugly.
Comment 18 Eugene Grosbein 2016-02-04 16:36:58 UTC
(In reply to Allan Jude from comment #15)

And there is check_process function in rc.subr to find all pids if pidfile is unknown:

if [ -n "${pidfile}" ] && rc_pid=$(check_pidfile $pidfile $command); then
else rc_pid=$(check_process $command)
Comment 19 Marcelo Araujo freebsd_committer freebsd_triage 2016-02-05 06:01:30 UTC
I have updated the patch and now I apply protect(1) over the PID.
So feel free to test and give me any feedback or concern you guys have with this approach.
Comment 20 Eugene Grosbein 2016-02-05 08:03:41 UTC
Call to check_process should not be non-conditional. It should not be called if _oomprotect is not set. And it should not be called if _pidfile is set and more lightweight check_pidfile is enough.
Comment 21 Eugene Grosbein 2016-02-05 11:16:31 UTC
(In reply to Marcelo Araujo from comment #19)

https://reviews.freebsd.org/file/data/socvncgqw5yqjriemraj/PHID-FILE-hxt4nuhq22mbzd4ehm5k/D5176.vson.id13048.whitespaceignore-most.diff looks almost fine. But it does not utilize lightweight check_pidfile and always runs check_process and that would be slower.
Comment 22 commit-hook freebsd_committer freebsd_triage 2016-02-24 01:33:02 UTC
A commit references this bug:

Author: araujo
Date: Wed Feb 24 01:32:13 UTC 2016
New revision: 295949
URL: https://svnweb.freebsd.org/changeset/base/295949

Log:
  - Add a global option where we can protect processes when swap space
    is exhausted.

  How to use:

  Basically we need to add on rc.conf an another option like:

      If we want to protect only the main processes.
      syslogd_oomprotect="YES"

      If we want to protect all future children of the specified processes.
      syslogd_oomprotect="ALL"

  PR:		204741 (based on)
  Submitted by:	eugen@grosbein.net
  Reviewed by:	jhb, allanjude, rpokala and bapt
  MFC after:	4 weeks
  Relnotes:	Yes
  Sponsored by:	gandi.net
  Differential Revision:	https://reviews.freebsd.org/D5176

Changes:
  head/etc/defaults/rc.conf
  head/etc/rc.subr
  head/share/man/man8/rc.subr.8
Comment 23 Marcelo Araujo freebsd_committer freebsd_triage 2016-02-24 03:35:28 UTC
I didn't added the protection direct on syslogd, however I added a global option.
In all discussion, seemed ok have the global option for all daemons.

Thanks for the patch.

Best,