Created attachment 252923 [details] patch that moves the bogus "Time moved forward" warning to debug Dovecot complains about time moving forward, which seems to be due to a broken mechanism (on FreeBSD) used to measure timeouts. This warning spams the maillog up to several hundred times per day. There's an ongoing thread about this issue in the freebsd forums: https://forums.freebsd.org/threads/dovecot-time-moved-forwards.82886 In post #33 RypPn points out the offending line in main.c and in post #34 msplsh suggests instead of completely removing/commenting out the line, it would be more sensible to move it from 'warning' to 'debug'. This is what this patch does: change the log facility to 'debug' to mute that bogus message for standard configurations, but keep it in 'debug' for anyone who might want to debug that issue in the future. I tested the patch as a local patch in poudriere and it builds fine on 13.3-RELEASE with the quarterly and latest branch.
Doesn't work for me, instead log fills with... master: Debug: Time moved forwards by 0.214156 seconds - adjusting timeouts. Earlier solution to comment out the offending block is superior.
(In reply to John R from comment #1) You are right, I'm really sorry. It seems I catched a brief period where this message wasn't triggered yesterday when I tested this, but the messages reappeared about two hours later... It seems dovecot just sends all "i_*" messages to the logfile; actual debugging is handled via e_debug events. Just changing from i_debug to e_debug doesn't work (I assume because event handlers have to be implemented differently from whatever an "i_*" is), but that's a bit out of my league... I also reverted to just commenting out lines 625-626 in main.c - I can attach the patch for that as well, unless someone wants to take a look how to properly move that message to debug?
Nothing to apologise for, this bug has been rumbling on since 2021 according to the forum and upstream class it as benign so I guess they won't fix, so the inelegant solution of hacking out the notice may be the only solution forthcoming.
On a side note, I have reverted the patch and am going to test with openntpd to see if the issue is really with dovecot or maybe the ntp daemon.
Created attachment 252941 [details] patch to completely mute the bogus message IIRC there already was a statement in an upstream discussion about not 'fixing' that because the messages are only informational. So yes, we might have to settle for the crude hack to get rid of the log spamming... So I attached a patch that just comments out the lines generating the message. For me it compiles, works as a local-patch as well as when added to the existing 'files/patch-src_master_main.c' in the port (again, tested for quarterly and latest on my 13.3-RELEASE buildhost).
Comments from Timo Sirainen (main dovecot author): Timo Sirainen via dovecot 4:38 AM (3 hours ago) to Larry, Dovecot The way Dovecot works is: - It finds the next timeout, sees that it happens in e.g. 5 milliseconds. - Then it calls kqueue() to wait for I/O for max 5 milliseconds - Then it notices that it actually returned more than 105 milliseconds later, and then logs a warning about it. So kqueue() apparently isn't very accurate in its timeout handling. With some googling I found https://lists.freebsd.org/pipermail/freebsd-arch/2012-March/012416.html which suggests this could happen at least if kern.hz is set to 20 or less. Could that be the case? I guess we could increase IOLOOP_TIME_MOVED_FORWARDS_MIN_USECS higher than 100 ms, but that might start hiding problems. Nowadays some people use rather short timeouts in e.g. some HTTP requests (auth, push-notifications). It could be difficult to understand why 100ms timeout happens only at 200ms without this warning message. Although if it happens only rarely, I guess it's not much of a problem. Anyway, would be good to understand first why this happens in FreeBSD before growing the warning time. Also, this is kind of a problem when it does happen. Since Dovecot thinks the time moved e.g. 100ms forward, it adjusts all timeouts to happen 100ms backwards. If this wasn't a true time jump, then these timeouts now happen 100ms earlier. So e.g. a HTTP request with <100ms timeout can actually trigger an immediate timeout. Hiding the log message makes debugging this also more difficult. So I don't think it's a good solution to simply hide it or change it to debug level, as it may mask real problems. Timo Sirainen via dovecot 5:05 AM (3 hours ago) to Larry, Dovecot On 21. Aug 2024, at 12.35, Timo Sirainen <timo@sirainen.com> wrote: > > The way Dovecot works is: > - It finds the next timeout, sees that it happens in e.g. 5 milliseconds. > - Then it calls kqueue() to wait for I/O for max 5 milliseconds > - Then it notices that it actually returned more than 105 milliseconds later, and then logs a warning about it. > > So kqueue() apparently isn't very accurate in its timeout handling. Actually another guess: Some people were saying it happens mainly on idle hours. Maybe kqueue() is accurate with low timeout values, but not accurate on high timeout values? So if Dovecot asked kqueue() to wait for <100ms, it would be very accurate. But if it asks to wait for 10000ms, kqueue() would think it's okay to return after 10100ms. If that's the case, this check could be changed to allow higher time jumps only on higher timeout waits.
Created attachment 253253 [details] Timo's patch That's Timo's patch to silence those warnings. I put that patch into mail/dovecot/files.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/ports/commit/?id=eae4d9b6062a7f15a533a00c173d7b66f91ec675 commit eae4d9b6062a7f15a533a00c173d7b66f91ec675 Author: Larry Rosenman <ler@FreeBSD.org> AuthorDate: 2024-09-01 18:04:00 +0000 Commit: Larry Rosenman <ler@FreeBSD.org> CommitDate: 2024-09-01 18:06:13 +0000 mail/dovecot: add patch from upstream to silence "time moved forwards" Patch from Timo Sirainen (via dovecot list) PR: 280929 Reported by: trashcan@ellael.org mail/dovecot/Makefile | 1 + mail/dovecot/files/patch-src-lib-ioloop.c (new) | 25 +++++++++++++++++++++++++ 2 files changed, 26 insertions(+)
(In reply to commit-hook from comment #8) Thanks. Now, let's keep our fingers crossed, that this is a fix for everyone, not only for me.
All quiet here, since I updated around seven hours ago, I haven't seen the "Time moved forward" message once.
(In reply to commit-hook from comment #8) I've been struggling with this issue for a long time. Thanks!
Also no "Time moved forward" log entries here on 2 servers for ~4 hours with that patch. THANK YOU to all involved in *finally* (and properly) fixing this! Any chance this patch/commit will also be merged to the quarterly branch (2024Q3)?
A commit in branch 2024Q3 references this bug: URL: https://cgit.FreeBSD.org/ports/commit/?id=1de999acff01a9c7d916c63522c6a447f4f6e487 commit 1de999acff01a9c7d916c63522c6a447f4f6e487 Author: Larry Rosenman <ler@FreeBSD.org> AuthorDate: 2024-09-01 18:04:00 +0000 Commit: Larry Rosenman <ler@FreeBSD.org> CommitDate: 2024-09-02 13:00:10 +0000 mail/dovecot: add patch from upstream to silence "time moved forwards" Patch from Timo Sirainen (via dovecot list) PR: 280929 Reported by: trashcan@ellael.org (cherry picked from commit eae4d9b6062a7f15a533a00c173d7b66f91ec675) mail/dovecot/Makefile | 1 + mail/dovecot/files/patch-src-lib-ioloop.c (new) | 25 +++++++++++++++++++++++++ 2 files changed, 26 insertions(+)