After commit 2886c93d1bca231260ebc01d4205743ca781f3c7 , it is possible for syslog(3) to spin the CPU without making any progress, because send() returns ENOBUFS for messages over about 4180 bytes. dtrace shows that allthough the sockbuf size is set to 8192 bytes, it's sb_hiwat value is only 4096 at the time of the bug. I don't yet know why. The bug can be produced by installing rsyslogd from ports and running the following command. I have not yet been able to reproduce it using syslogd. daemon -f -s notice -l user jot -s "" -r -p 0 5000 0 9
So the problem seems to be that: * The default receive buffer size for a unix domain socket is 4 kB * connectlog() sets SO_SNDBUF to 8 kB * BUT, unix domain datagram sockets don't really have a send buffer. Unlike inet sockets, they stay on one host, so everything goes straight to the receive buffer. See uipc_send(); it writes into so2->so_rcv, not so->so_snd. * syslogd sets SO_RCVBUF to 80 kB, which is why nobody noticed the problem there. * rsyslogd uses the default SO_RCVBUF of 4 kB * syslog(3) chops up messages that are longer than MAXLINE. But now MAXLINE is 8 kB, and the SO_RCVBUF is only 4 kB. So the chopped up messages are too large for the buffer. * syslog(3) loops while (errno == ENOBUFS), which is not appropriate if the message is too big to ever fit. BUT, it fails fast if the socket is privileged. Many daemons are, so their operators won't notice this bug, but my daemon is not.
(In reply to Alan Somers from comment #1) Need to fix rsyslogd, SO_RCVBUF=4kB looks wrong.
There's nothing wrong with SO_RCVBUF=4kB. That's the default, and rsyslogd never changes it. I think the solution is to: 1) Change syslog(3)'s MAXLINE to 4kB, so it's the same as the default SO_RCVBUF 2) Maybe change the loop around ENOBUFS in vsyslog1, but no change is obviously correct. Perhaps fail fast if cnt > 1024, or perhaps query the socket's true sockbuf size, but I'm not sure how to do that.
(In reply to Alan Somers from comment #3) Many modern programs generate large log messages, reducing the max message size to 4kb is a bad idea. Infinite loop with ENOBUFS should be limited by the number of repetitions.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=943c446629e33739d3f72795069a5c944e8f329e commit 943c446629e33739d3f72795069a5c944e8f329e Author: Alan Somers <asomers@FreeBSD.org> AuthorDate: 2021-11-30 17:06:25 +0000 Commit: Alan Somers <asomers@FreeBSD.org> CommitDate: 2021-11-30 18:11:43 +0000 Revert "libc: Some enhancements to syslog(3)" This reverts commit 2886c93d1bca231260ebc01d4205743ca781f3c7. The original commit has two problems: * It sets SO_SNDBUF to be as large as MAXLINE. But for unix domain sockets, the send buffer is bypassed. Packets go directly to the peer's receive buffer, so setting and querying SO_SNDBUF is ineffective. To ensure that the socket can accept messages of a certain size, it would be necessary to add a SO_PEERRCVBUF socket option that could query the connected peer's receive buffer size. * It sets MAXLINE to 8 kB, which is larger than the default sockbuf size of 4 kB. That's ok for the builtin syslogd, which sets its recvbuf to 80 kB, but not ok for alternative sysloggers, like rsyslogd, which use the default size. As a consequence, writing messages of more than 4 kB with syslog() as a non-root user while running rsyslogd would cause the logging application to spin indefinitely within syslog(). PR: 260126 MFC: 2 weeks Sponsored by: Axcient Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D33199 lib/libc/gen/syslog.c | 15 +-------------- 1 file changed, 1 insertion(+), 14 deletions(-)
Committed to main. I'll MFC to stable/13 in two weeks. But the offending commit was never MFCed to stable/12, so there's no need to MFC there.
(In reply to Alan Somers from comment #1) Setting SO_SNDBUF changes limit net.local.dgram.maxdgram=2048.
It looks like syslog(3)'s ENOBUFS handling comes from PR 72366. glibc doesn't appear to do anything special to handle ENOBUFS (i.e., upon an error the message is just dropped, I think), and OpenBSD has a special sendsyslog system call which supports messages up to 8192 bytes and will put messages in a global ring buffer upon ENOBUFS from sosend(). (Though, that ring buffer only gets drained when something tries to log again, it looks like messages could linger there indefinitely.) It also looks like OpenBSD quite recently raised the default recv dgram socket buffer size to 16KB. I can see two complementary approaches, then: 1. change syslog(3) to truncate the message to some minimum size before retrying upon ENOBUFS 2. raise socket buffer defaults to accommodate a maximum line length of 8192. I think the latter is sufficient and will hopefully be uncontroversial: index 0ec1ba677173..e0e08470d36a 100644 --- a/sys/kern/uipc_usrreq.c +++ b/sys/kern/uipc_usrreq.c @@ -157,7 +157,7 @@ static struct task unp_defer_task; static u_long unpst_sendspace = PIPSIZ; static u_long unpst_recvspace = PIPSIZ; static u_long unpdg_sendspace = 2*1024; /* really max datagram size */ -static u_long unpdg_recvspace = 4*1024; +static u_long unpdg_recvspace = 16*1024; static u_long unpsp_sendspace = PIPSIZ; /* really max datagram size */ static u_long unpsp_recvspace = PIPSIZ;
... I should point out that raising the limit to 8192 is not sufficient, since the kernel includes a sockaddr in each dgram message over a local socket, and that counts towards socket buffer space. So really it just needs to be 8192 + epsilon, but 16384 seems pretty reasonable.
Raising the default buffer size seems sensible to me.
Created attachment 229835 [details] syslog.c.diff limit the number of attempts
(In reply to Dmitry Wagin from comment #11) I think it is better to deliver a truncated a message than it is to give up.
(In reply to Mark Johnston from comment #12) Initially, this cycle solved the problem of local overloads. This problem has not gone anywhere. When should the function start truncating messages?
(In reply to Dmitry Wagin from comment #13) Sorry, I missed the question. RFC 5424 defines a minimum message size. My notion was that after N ~ 10 failed attempts we can truncate to that size and/or log a short message indicating that the message was dropped. Anyway I posted a patch to increase the default, so after we can restore your original change.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=d157f2627b238ba38d1cad36f1aa165c42ce91ee commit d157f2627b238ba38d1cad36f1aa165c42ce91ee Author: Mark Johnston <markj@FreeBSD.org> AuthorDate: 2021-12-17 15:59:29 +0000 Commit: Mark Johnston <markj@FreeBSD.org> CommitDate: 2021-12-17 18:09:49 +0000 unix: Increase the default datagram recv buffer size syslog(3) was recently change to support larger messages, up to 8KB. Our syslogd handles this fine, as it adjusts /dev/log's recv buffer to a large size. rsyslog, however, uses the system default of 4KB. This leads to problems since our syslog(3) retries indefinitely when a send() returns ENOBUFS, but if the message is large enough this will never succeed. Increase the default recv buffer size for datagram sockets to support 8KB syslog messages without requiring the logging daemon to adjust its buffers. PR: 260126 Reviewed by: asomers MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D33380 sys/kern/uipc_usrreq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=7a06849669ac9fe936a89c908e464afbfbf85bf8 commit 7a06849669ac9fe936a89c908e464afbfbf85bf8 Author: Mark Johnston <markj@FreeBSD.org> AuthorDate: 2021-12-17 15:59:29 +0000 Commit: Mark Johnston <markj@FreeBSD.org> CommitDate: 2021-12-31 14:25:54 +0000 unix: Increase the default datagram recv buffer size syslog(3) was recently change to support larger messages, up to 8KB. Our syslogd handles this fine, as it adjusts /dev/log's recv buffer to a large size. rsyslog, however, uses the system default of 4KB. This leads to problems since our syslog(3) retries indefinitely when a send() returns ENOBUFS, but if the message is large enough this will never succeed. Increase the default recv buffer size for datagram sockets to support 8KB syslog messages without requiring the logging daemon to adjust its buffers. PR: 260126 Reviewed by: asomers Sponsored by: The FreeBSD Foundation (cherry picked from commit d157f2627b238ba38d1cad36f1aa165c42ce91ee) sys/kern/uipc_usrreq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=881059955a5d83fc173f40f266d93ddd91255291 commit 881059955a5d83fc173f40f266d93ddd91255291 Author: Dmitry Wagin <dmitry.wagin@ya.ru> AuthorDate: 2021-03-23 16:01:15 +0000 Commit: Mark Johnston <markj@FreeBSD.org> CommitDate: 2021-12-31 17:57:01 +0000 libc: Some enhancements to syslog(3) This is a re-application of commit 2d82b47a5b4ef18550565dd55628d51f54d0af2e, which was reverted since it broke with syslog daemons that don't adjust the /dev/log recv buffer size. Now that the default is large enough to accomodate 8KB messages, restore support for large messages. PR: 260126 lib/libc/gen/syslog.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-)
The revert never made it to stable/13.