Created attachment 239046 [details] Diff to add shutdown_delay to daemon utility Currently if daemon is supervising a process and it receives SIGTERM it sends SIGTERM to the supervised process and then immediately exits. This has a couple potential issues. 1. If the daemon process is writing the stdout/stderr of the process to a log file or syslog it can miss messages that are generated after the supervised process receives SIGTERM. 2. If the daemon process is maintaining a PID file for the supervised process it can get removed before the supervised process actually exits. This can cause issues with processes that expect the PID file to be there while the previously supervised process is running. This patch adds a -d <shutdown_delay> option to daemon. If daemon receives SIGTERM it will send SIGTERM to the supervised process and then continue collecting and outputting stdout/stderr. After shutdown_delay seconds if the process hasn't already exited it will send SIGKILL to the process and then wait for it to exit. This will potentially cause the daemon process to hang around indefinitely if the supervised process is wedged somewhere deep in the kernel with signals blocked, but this way it will properly keep the PID file around while the supervised process is running.
I am working on daemon improvements and this feature is planned to be added once transition to kqueue is finished.
I think it would make sense to have a shutdown delay, whenever there's a restart delay. Which under supervision is at least one second so i don't think we need a new parameter. What'd y'all think?
Nathan's motivation for the shutdown delay is to avoid race conditions caused by asynchronous signal processing. After https://reviews.freebsd.org/rG8935a3993219be76c7ea03e9ad4509657d08af6c was merged none of the described situations are possible. 1. Upon receiving SIGTERM daemon will propagate it to the child, then wait for child to exit and then read everything until EOF from the pipe. 2. Same with the pid file: Child's pidfile cleanup happens only after child reported SIGCHILD, all the data was read from the pipe. At this point I do not see a compelling usecase for shutdown delay feature.
I think the only thing the kqueue version doesn't do that my patch does is send SIGKILL to the supervised process after the delay. I see there is a TODO in the kqueue version to do something like that, but it isn't implemented.
We can add sleep before sending SITERM to the child, but why is this needed?
Sorry, disregard my previous comment. SIGKILL after delay is planned. I will use EVFILT_TIMER for this.
I'm going to close this as Ihor's changes should cover everything and switching to kqueue is better than the current signal handling.