Bug 280929 - mail/dovecot move bogus warning "Time moved forwards" to debug
Summary: mail/dovecot move bogus warning "Time moved forwards" to debug
Status: In Progress
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s) (show other bugs)
Version: Latest
Hardware: Any Any
: --- Affects Some People
Assignee: Larry Rosenman
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-08-19 16:07 UTC by Sebastian
Modified: 2024-09-02 13:00 UTC (History)
5 users (show)

See Also:
bugzilla: maintainer-feedback? (ler)


Attachments
patch that moves the bogus "Time moved forward" warning to debug (293 bytes, patch)
2024-08-19 16:07 UTC, Sebastian
no flags Details | Diff
patch to completely mute the bogus message (377 bytes, patch)
2024-08-20 08:15 UTC, Sebastian
no flags Details | Diff
Timo's patch (1.06 KB, patch)
2024-09-01 17:53 UTC, Michael Grimm
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Sebastian 2024-08-19 16:07:29 UTC
Created attachment 252923 [details]
patch that moves the bogus "Time moved forward" warning to debug

Dovecot complains about time moving forward, which seems to be due to a broken mechanism (on FreeBSD) used to measure timeouts. This warning spams the maillog up to several hundred times per day.

There's an ongoing thread about this issue in the freebsd forums:
https://forums.freebsd.org/threads/dovecot-time-moved-forwards.82886

In post #33 RypPn points out the offending line in main.c and in post #34 msplsh suggests instead of completely removing/commenting out the line, it would be more sensible to move it from 'warning' to 'debug'.
This is what this patch does: change the log facility to 'debug' to mute that bogus message for standard configurations, but keep it in 'debug' for anyone who might want to debug that issue in the future.

I tested the patch as a local patch in poudriere and it builds fine on 13.3-RELEASE with the quarterly and latest branch.
Comment 1 John R 2024-08-19 21:44:07 UTC
Doesn't work for me, instead log fills with...

master: Debug: Time moved forwards by 0.214156 seconds - adjusting timeouts.

Earlier solution to comment out the offending block is superior.
Comment 2 Sebastian 2024-08-20 07:16:48 UTC
(In reply to John R from comment #1)

You are right, I'm really sorry. It seems I catched a brief period where this message wasn't triggered yesterday when I tested this, but the messages reappeared about two hours later...

It seems dovecot just sends all "i_*" messages to the logfile; actual debugging is handled via e_debug events. Just changing from i_debug to e_debug doesn't work (I assume because event handlers have to be implemented differently from whatever an "i_*" is), but that's a bit out of my league...

I also reverted to just commenting out lines 625-626 in main.c - I can attach the patch for that as well, unless someone wants to take a look how to properly move that message to debug?
Comment 3 John R 2024-08-20 08:05:52 UTC
Nothing to apologise for, this bug has  been rumbling on since 2021 according to the forum and upstream class it as benign so I guess they won't fix, so the inelegant solution of hacking out the notice may be the only solution forthcoming.
Comment 4 John R 2024-08-20 08:14:38 UTC
On a side note, I have reverted the patch and am going to test with openntpd to see if the issue is really with dovecot or maybe the ntp daemon.
Comment 5 Sebastian 2024-08-20 08:15:46 UTC
Created attachment 252941 [details]
patch to completely mute the bogus message

IIRC there already was a statement in an upstream discussion about not 'fixing' that because the messages are only informational. So yes, we might have to settle for the crude hack to get rid of the log spamming...

So I attached a patch that just comments out the lines generating the message.
For me it compiles, works as a local-patch as well as when added to the existing 'files/patch-src_master_main.c' in the port (again, tested for quarterly and latest on my 13.3-RELEASE buildhost).
Comment 6 Larry Rosenman freebsd_committer freebsd_triage 2024-08-21 13:21:53 UTC
Comments from Timo Sirainen (main dovecot author):

Timo Sirainen via dovecot
4:38 AM (3 hours ago)
to Larry, Dovecot

The way Dovecot works is:
 - It finds the next timeout, sees that it happens in e.g. 5 milliseconds.
 - Then it calls kqueue() to wait for I/O for max 5 milliseconds
 - Then it notices that it actually returned more than 105 milliseconds later, and then logs a warning about it.

So kqueue() apparently isn't very accurate in its timeout handling.

With some googling I found https://lists.freebsd.org/pipermail/freebsd-arch/2012-March/012416.html which suggests this could happen at least if kern.hz is set to 20 or less. Could that be the case?

I guess we could increase IOLOOP_TIME_MOVED_FORWARDS_MIN_USECS higher than 100 ms, but that might start hiding problems. Nowadays some people use rather short timeouts in e.g. some HTTP requests (auth, push-notifications). It could be difficult to understand why 100ms timeout happens only at 200ms without this warning message. Although if it happens only rarely, I guess it's not much of a problem.

Anyway, would be good to understand first why this happens in FreeBSD before growing the warning time.

Also, this is kind of a problem when it does happen. Since Dovecot thinks the time moved e.g. 100ms forward, it adjusts all timeouts to happen 100ms backwards. If this wasn't a true time jump, then these timeouts now happen 100ms earlier. So e.g. a HTTP request with <100ms timeout can actually trigger an immediate timeout. Hiding the log message makes debugging this also more difficult. So I don't think it's a good solution to simply hide it or change it to debug level, as it may mask real problems.


Timo Sirainen via dovecot
5:05 AM (3 hours ago)
to Larry, Dovecot

On 21. Aug 2024, at 12.35, Timo Sirainen <timo@sirainen.com> wrote:
>
> The way Dovecot works is:
> - It finds the next timeout, sees that it happens in e.g. 5 milliseconds.
> - Then it calls kqueue() to wait for I/O for max 5 milliseconds
> - Then it notices that it actually returned more than 105 milliseconds later, and then logs a warning about it.
>
> So kqueue() apparently isn't very accurate in its timeout handling.

Actually another guess: Some people were saying it happens mainly on idle hours. Maybe kqueue() is accurate with low timeout values, but not accurate on high timeout values? So if Dovecot asked kqueue() to wait for <100ms, it would be very accurate. But if it asks to wait for 10000ms, kqueue() would think it's okay to return after 10100ms. If that's the case, this check could be changed to allow higher time jumps only on higher timeout waits.
Comment 7 Michael Grimm 2024-09-01 17:53:01 UTC
Created attachment 253253 [details]
Timo's patch

That's Timo's patch to silence those warnings.
I put that patch into mail/dovecot/files.
Comment 8 commit-hook freebsd_committer freebsd_triage 2024-09-01 18:07:09 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=eae4d9b6062a7f15a533a00c173d7b66f91ec675

commit eae4d9b6062a7f15a533a00c173d7b66f91ec675
Author:     Larry Rosenman <ler@FreeBSD.org>
AuthorDate: 2024-09-01 18:04:00 +0000
Commit:     Larry Rosenman <ler@FreeBSD.org>
CommitDate: 2024-09-01 18:06:13 +0000

    mail/dovecot: add patch from upstream to silence "time moved forwards"

    Patch from Timo Sirainen (via dovecot list)

    PR:     280929
    Reported by: trashcan@ellael.org

 mail/dovecot/Makefile                           |  1 +
 mail/dovecot/files/patch-src-lib-ioloop.c (new) | 25 +++++++++++++++++++++++++
 2 files changed, 26 insertions(+)
Comment 9 Michael Grimm 2024-09-01 18:19:52 UTC
(In reply to commit-hook from comment #8)

Thanks.

Now, let's keep our fingers crossed, that this is a fix for everyone, not only for me.
Comment 10 John R 2024-09-02 00:52:39 UTC
All quiet here, since I updated around seven hours ago, I haven't seen the "Time moved forward" message once.
Comment 11 Kota Shirasuka 2024-09-02 10:44:46 UTC
(In reply to commit-hook from comment #8)

I've been struggling with this issue for a long time.

Thanks!
Comment 12 Sebastian 2024-09-02 10:58:24 UTC
Also no "Time moved forward" log entries here on 2 servers for ~4 hours with that patch. 

THANK YOU to all involved in *finally* (and properly) fixing this!


Any chance this patch/commit will also be merged to the quarterly branch (2024Q3)?
Comment 13 commit-hook freebsd_committer freebsd_triage 2024-09-02 13:00:33 UTC
A commit in branch 2024Q3 references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=1de999acff01a9c7d916c63522c6a447f4f6e487

commit 1de999acff01a9c7d916c63522c6a447f4f6e487
Author:     Larry Rosenman <ler@FreeBSD.org>
AuthorDate: 2024-09-01 18:04:00 +0000
Commit:     Larry Rosenman <ler@FreeBSD.org>
CommitDate: 2024-09-02 13:00:10 +0000

    mail/dovecot: add patch from upstream to silence "time moved forwards"

    Patch from Timo Sirainen (via dovecot list)

    PR:     280929
    Reported by: trashcan@ellael.org

    (cherry picked from commit eae4d9b6062a7f15a533a00c173d7b66f91ec675)

 mail/dovecot/Makefile                           |  1 +
 mail/dovecot/files/patch-src-lib-ioloop.c (new) | 25 +++++++++++++++++++++++++
 2 files changed, 26 insertions(+)