Bug 260126 - syslog(3) spins the CPU with 4180 byte messages
Summary: syslog(3) spins the CPU with 4180 byte messages
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: bin (show other bugs)
Version: 13.0-STABLE
Hardware: Any Any
: --- Affects Only Me
Assignee: Alan Somers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-11-29 21:40 UTC by Alan Somers
Modified: 2021-12-31 18:14 UTC (History)
2 users (show)

See Also:
asomers: mfc-stable13?
asomers: mfc-stable12-


Attachments
syslog.c.diff (823 bytes, patch)
2021-12-01 21:45 UTC, Dmitry Wagin
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Alan Somers freebsd_committer freebsd_triage 2021-11-29 21:40:33 UTC
After commit 2886c93d1bca231260ebc01d4205743ca781f3c7 , it is possible for syslog(3) to spin the CPU without making any progress, because send() returns ENOBUFS for messages over about 4180 bytes.  dtrace shows that allthough the sockbuf size is set to 8192 bytes, it's sb_hiwat value is only 4096 at the time of the bug.  I don't yet know why.

The bug can be produced by installing rsyslogd from ports and running the following command.  I have not yet been able to reproduce it using syslogd.

daemon -f -s notice -l user  jot -s "" -r -p 0 5000 0 9
Comment 1 Alan Somers freebsd_committer freebsd_triage 2021-11-30 00:13:23 UTC
So the problem seems to be that:

* The default receive buffer size for a unix domain socket is 4 kB
* connectlog() sets SO_SNDBUF to 8 kB
* BUT, unix domain datagram sockets don't really have a send buffer.  Unlike inet sockets, they stay on one host, so everything goes straight to the receive buffer.  See uipc_send(); it writes into so2->so_rcv, not so->so_snd.
* syslogd sets SO_RCVBUF to 80 kB, which is why nobody noticed the problem there.
* rsyslogd uses the default SO_RCVBUF of 4 kB
* syslog(3) chops up messages that are longer than MAXLINE.  But now MAXLINE is 8 kB, and the SO_RCVBUF is only 4 kB.  So the chopped up messages are too large for the buffer.
* syslog(3) loops while (errno == ENOBUFS), which is not appropriate if the message is too big to ever fit.  BUT, it fails fast if the socket is privileged.  Many daemons are, so their operators won't notice this bug, but my daemon is not.
Comment 2 Dmitry Wagin 2021-11-30 00:35:44 UTC
(In reply to Alan Somers from comment #1)
Need to fix rsyslogd, SO_RCVBUF=4kB looks wrong.
Comment 3 Alan Somers freebsd_committer freebsd_triage 2021-11-30 05:12:59 UTC
There's nothing wrong with SO_RCVBUF=4kB.  That's the default, and rsyslogd never changes it.  I think the solution is to:
1) Change syslog(3)'s MAXLINE to 4kB, so it's the same as the default SO_RCVBUF
2) Maybe change the loop around ENOBUFS in vsyslog1, but no change is obviously correct.  Perhaps fail fast if cnt > 1024, or perhaps query the socket's true sockbuf size, but I'm not sure how to do that.
Comment 4 Dmitry Wagin 2021-11-30 17:29:07 UTC
(In reply to Alan Somers from comment #3)

Many modern programs generate large log messages, reducing the max message size to 4kb is a bad idea.
Infinite loop with ENOBUFS should be limited by the number of repetitions.
Comment 5 commit-hook freebsd_committer freebsd_triage 2021-11-30 18:13:26 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=943c446629e33739d3f72795069a5c944e8f329e

commit 943c446629e33739d3f72795069a5c944e8f329e
Author:     Alan Somers <asomers@FreeBSD.org>
AuthorDate: 2021-11-30 17:06:25 +0000
Commit:     Alan Somers <asomers@FreeBSD.org>
CommitDate: 2021-11-30 18:11:43 +0000

    Revert "libc: Some enhancements to syslog(3)"

    This reverts commit 2886c93d1bca231260ebc01d4205743ca781f3c7.
    The original commit has two problems:

    * It sets SO_SNDBUF to be as large as MAXLINE.  But for unix domain
      sockets, the send buffer is bypassed.  Packets go directly to the
      peer's receive buffer, so setting and querying SO_SNDBUF is
      ineffective.  To ensure that the socket can accept messages of a
      certain size, it would be necessary to add a SO_PEERRCVBUF socket
      option that could query the connected peer's receive buffer size.

    * It sets MAXLINE to 8 kB, which is larger than the default sockbuf size
      of 4 kB.  That's ok for the builtin syslogd, which sets its recvbuf
      to 80 kB, but not ok for alternative sysloggers, like rsyslogd, which
      use the default size.

    As a consequence, writing messages of more than 4 kB with syslog() as a
    non-root user while running rsyslogd would cause the logging application
    to spin indefinitely within syslog().

    PR:             260126
    MFC:            2 weeks
    Sponsored by:   Axcient
    Reviewed by:    markj
    Differential Revision: https://reviews.freebsd.org/D33199

 lib/libc/gen/syslog.c | 15 +--------------
 1 file changed, 1 insertion(+), 14 deletions(-)
Comment 6 Alan Somers freebsd_committer freebsd_triage 2021-11-30 18:15:26 UTC
Committed to main.  I'll MFC to stable/13 in two weeks.  But the offending commit was never MFCed to stable/12, so there's no need to MFC there.
Comment 7 Dmitry Wagin 2021-11-30 18:33:42 UTC
(In reply to Alan Somers from comment #1)

Setting SO_SNDBUF changes limit net.local.dgram.maxdgram=2048.
Comment 8 Mark Johnston freebsd_committer freebsd_triage 2021-12-01 19:10:54 UTC
It looks like syslog(3)'s ENOBUFS handling comes from PR 72366.  glibc doesn't appear to do anything special to handle ENOBUFS (i.e., upon an error the message is just dropped, I think), and OpenBSD has a special sendsyslog system call which supports messages up to 8192 bytes and will put messages in a global ring buffer upon ENOBUFS from sosend().  (Though, that ring buffer only gets drained when something tries to log again, it looks like messages could linger there indefinitely.)  It also looks like OpenBSD quite recently raised the default recv dgram socket buffer size to 16KB.

I can see two complementary approaches, then:
1. change syslog(3) to truncate the message to some minimum size before retrying upon ENOBUFS
2. raise socket buffer defaults to accommodate a maximum line length of 8192.

I think the latter is sufficient and will hopefully be uncontroversial:

index 0ec1ba677173..e0e08470d36a 100644
--- a/sys/kern/uipc_usrreq.c
+++ b/sys/kern/uipc_usrreq.c
@@ -157,7 +157,7 @@ static struct task  unp_defer_task;
 static u_long  unpst_sendspace = PIPSIZ;
 static u_long  unpst_recvspace = PIPSIZ;
 static u_long  unpdg_sendspace = 2*1024;       /* really max datagram size */
-static u_long  unpdg_recvspace = 4*1024;
+static u_long  unpdg_recvspace = 16*1024;
 static u_long  unpsp_sendspace = PIPSIZ;       /* really max datagram size */
 static u_long  unpsp_recvspace = PIPSIZ;
Comment 9 Mark Johnston freebsd_committer freebsd_triage 2021-12-01 19:14:26 UTC
... I should point out that raising the limit to 8192 is not sufficient, since the kernel includes a sockaddr in each dgram message over a local socket, and that counts towards socket buffer space.  So really it just needs to be 8192 + epsilon, but 16384 seems pretty reasonable.
Comment 10 Alan Somers freebsd_committer freebsd_triage 2021-12-01 20:58:02 UTC
Raising the default buffer size seems sensible to me.
Comment 11 Dmitry Wagin 2021-12-01 21:45:50 UTC
Created attachment 229835 [details]
syslog.c.diff

limit the number of attempts
Comment 12 Mark Johnston freebsd_committer freebsd_triage 2021-12-01 22:30:38 UTC
(In reply to Dmitry Wagin from comment #11)
I think it is better to deliver a truncated a message than it is to give up.
Comment 13 Dmitry Wagin 2021-12-01 22:57:27 UTC
(In reply to Mark Johnston from comment #12)

Initially, this cycle solved the problem of local overloads. This problem has not gone anywhere.

When should the function start truncating messages?
Comment 14 Mark Johnston freebsd_committer freebsd_triage 2021-12-10 21:28:18 UTC
(In reply to Dmitry Wagin from comment #13)
Sorry, I missed the question.

RFC 5424 defines a minimum message size.  My notion was that after N ~ 10 failed attempts we can truncate to that size and/or log a short message indicating that the message was dropped.

Anyway I posted a patch to increase the default, so after we can restore your original change.
Comment 15 commit-hook freebsd_committer freebsd_triage 2021-12-17 18:12:24 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=d157f2627b238ba38d1cad36f1aa165c42ce91ee

commit d157f2627b238ba38d1cad36f1aa165c42ce91ee
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2021-12-17 15:59:29 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-12-17 18:09:49 +0000

    unix: Increase the default datagram recv buffer size

    syslog(3) was recently change to support larger messages, up to 8KB.
    Our syslogd handles this fine, as it adjusts /dev/log's recv buffer to a
    large size.  rsyslog, however, uses the system default of 4KB.  This
    leads to problems since our syslog(3) retries indefinitely when a send()
    returns ENOBUFS, but if the message is large enough this will never
    succeed.

    Increase the default recv buffer size for datagram sockets to support
    8KB syslog messages without requiring the logging daemon to adjust its
    buffers.

    PR:             260126
    Reviewed by:    asomers
    MFC after:      2 weeks
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D33380

 sys/kern/uipc_usrreq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Comment 16 commit-hook freebsd_committer freebsd_triage 2021-12-31 14:26:36 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=7a06849669ac9fe936a89c908e464afbfbf85bf8

commit 7a06849669ac9fe936a89c908e464afbfbf85bf8
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2021-12-17 15:59:29 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-12-31 14:25:54 +0000

    unix: Increase the default datagram recv buffer size

    syslog(3) was recently change to support larger messages, up to 8KB.
    Our syslogd handles this fine, as it adjusts /dev/log's recv buffer to a
    large size.  rsyslog, however, uses the system default of 4KB.  This
    leads to problems since our syslog(3) retries indefinitely when a send()
    returns ENOBUFS, but if the message is large enough this will never
    succeed.

    Increase the default recv buffer size for datagram sockets to support
    8KB syslog messages without requiring the logging daemon to adjust its
    buffers.

    PR:             260126
    Reviewed by:    asomers
    Sponsored by:   The FreeBSD Foundation

    (cherry picked from commit d157f2627b238ba38d1cad36f1aa165c42ce91ee)

 sys/kern/uipc_usrreq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Comment 17 commit-hook freebsd_committer freebsd_triage 2021-12-31 18:10:17 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=881059955a5d83fc173f40f266d93ddd91255291

commit 881059955a5d83fc173f40f266d93ddd91255291
Author:     Dmitry Wagin <dmitry.wagin@ya.ru>
AuthorDate: 2021-03-23 16:01:15 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-12-31 17:57:01 +0000

    libc: Some enhancements to syslog(3)

    This is a re-application of commit
    2d82b47a5b4ef18550565dd55628d51f54d0af2e, which was reverted since it
    broke with syslog daemons that don't adjust the /dev/log recv buffer
    size.  Now that the default is large enough to accomodate 8KB messages,
    restore support for large messages.

    PR:             260126

 lib/libc/gen/syslog.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)
Comment 18 Mark Johnston freebsd_committer freebsd_triage 2021-12-31 18:14:46 UTC
The revert never made it to stable/13.