Multiple people have reported to the SaltStack project that salt hangs indefinitely when trying to restart dovecot.[1] The underlying issue can be trivially reproduced directly with Python: python2.7 -c 'import subprocess; subprocess.Popen(["/usr/sbin/service", "dovecot", "onerestart"], stderr=subprocess.PIPE).communicate()' While python is waiting, you can use ps to see that the child process is a zombie. Unless and until dovecot cleans up their file descriptors,[2] it might be wise to update the rc script to use /usr/sbin/daemon and pass -F to dovecot. [1] https://github.com/saltstack/salt/issues/44848 [2] https://github.com/dovecot/core/blob/master/src/master/main.c#L610
Please, correct me if I'm wrong, but using daemon(8) would prevent dovecot from doing graceful restarts and receiving reload HUPs. I agree it's a problem, but I wouldn't want to limit the functionality of dovecot for everybody to work around bad behaviour with py-salt. If this is fixed upstream, I'm very happy to merge that change right away.
OK, ler pointed out that pretty much everything aside from my name in my previous comment is incorrect. It'll be about 2 weeks until I can work on this, unless you can put together a dovecot.in patch in the meantime.
Created attachment 193070 [details] Launch dovecot with daemon(8) My practical experience with rc and daemon(8) is fairly limited, but I believe this is the least invasive change that resolves the issue (plus a minor documentation fix). It could use the eyes of experience to confirm. I tested both routine operation and fscd compatibility. I also checked dovecot's src/master/main.c to make sure the -F flag doesn't have any sneaky side effects. Thanks
It looks like your patch is against the dovecot version in the quarterly branch. Do you know whether this bug persists in Dovecot 2.3.1, in head?
I was mostly working off an install from my usual portsnap poudriere builds and just used the svn tree to get the diff and do a final test. Apparently I had switched that one to a quarterly branch at some point and forgot. I re-checked on svn head and got the same behavior with and without the fix. Also, here's where the upstream code is still failing to close stderr: https://github.com/dovecot/core/blob/master/src/master/main.c#L860 I should probably send them a pull request as well. Thanks
Created attachment 193071 [details] Alternate: source patch to close stderr Or this source patch also gets it done. Either way, really.
Over to new maintainer
Peter: Can you send a Pull-Request, and see what Timo and company say, please?
https://github.com/dovecot/core/pull/82 (May 5). Someone suggested that this would break integration with daemontools, but I'm pretty sure daemontools requires that processes not self-daemonize anyway.
Can something be integrated to the package prior to full upstream patching (if that happens at all)?
Give me a patch against 2.3.3 (not in ports yet) and I'll see what I can do. I'm waiting for the corresponding Pigeonhole release before updating both.
Created attachment 197698 [details] Updatead daemontools patch for 2.3.3 Updated patch for 2.3.3. I updated Makefile version to 2.3.3, did a makesum, generated package and tested completely.
committed the source patch. Thanks!
A commit references this bug: Author: ler Date: Mon Oct 1 23:18:31 UTC 2018 New revision: 481076 URL: https://svnweb.freebsd.org/changeset/ports/481076 Log: mail/dovecot upgrade to 2.3.3, mail/dovecot-pigeonhole upgrade to 0.5.3. dovecot changelog: * doveconf hides more secrets now in the default output. * ssl_dh setting is no longer enforced at startup. If it's not set and non-ECC DH key exchange happens, error is logged and client is disconnected. + Added log_debug=<filter> setting. + Added log_core_filter=<log filter> setting. + quota-clone: Write to dict asynchronously + --enable-hardening attempts to use retpoline Spectre 2 mitigations + lmtp proxy: Support source_ip passdb extra field. + doveadm stats dump: Support more fields and output stddev by default. + push-notification: Add SSL support for OX backend. - NUL bytes in mail headers can cause truncated replies when fetched. - director: Conflicting host up/down state changes may in some rare situations ended up in a loop of two directors constantly overwriting each others' changes. - director: Fix hang/crash when multiple doveadm commands are being handled concurrently. - director: Fix assert-crash if doveadm disconnects too early - virtual plugin: Some searches used 100% CPU for many seconds - dsync assert-crashed with acl plugin in some situations. - mail_attachment_detection_options=add-flags-on-save assert-crashed with some specific Sieve scripts. - Mail snippet generation crashed with mails containing invalid Content-Type:multipart header. - Log prefix ordering was different for some log lines. - quota: With noenforcing option current quota usage wasn't updated. - auth: Kerberos authentication against Samba assert-crashed. - stats clients were unnecessarily chatty with the stats server. - imapc: Fixed various assert-crashes when reconnecting to server. - lmtp, submission: Fix potential crash if client disconnects while handling a command. - quota: Fixed compiling with glibc-2.26 / support libtirpc. - fts-solr: Empty search values resulted in 400 Bad Request errors - fts-solr: default_ns parameter couldn't be used - submission server crashed if relay server returned over 7 lines in a reply (e.g. to EHLO) pigeonhole changelog: - Fix assertion panic occurring when managesieve service fails to open INBOX while saving a Sieve script. This was caused by a lack of cleanup after failure. - Fix specific messages causing an assert panic with actions that compose a reply (e.g. vacation). With some rather weird input from the original message, the header folding algorithm (as used for composing the References header for the reply) got confused, causing the panic. - IMAP FILTER=SIEVE capability: Fix FILTER SIEVE SCRIPT command parsing. After finishing reading the Sieve script, the command parsing sometimes didn't continue with the search arguments. This is a time- critical bug that likely only occurs when the Sieve script is sent in the next TCP frame. Add a patch to close stderr in src/master/main.c, should fix PR 228003. PR: 228003 Submitted by: psagers@ignorare.net Changes: head/mail/dovecot/Makefile head/mail/dovecot/distinfo head/mail/dovecot/files/patch-src_master_main.c head/mail/dovecot/pkg-plist head/mail/dovecot-pigeonhole/Makefile head/mail/dovecot-pigeonhole/distinfo
This has resurfaced after https://svnweb.freebsd.org/ports?view=revision&revision=512138 was applied, removing the patch that solved it. I think that this patch is needed, as upstream didn't change anything.