Created attachment 254879 [details] Simple Go executable that shows an adapted internal Go sendfile wrapper exercising the apparent bug I'm not currently able to dive deeper, but running a very basic Go HTTP webserver in the Linuxulator on 14.1-RELEASE-p5 gave me corrupted downloads. I dove into the standard Go libraries and found that ultimately, sendfile is used for such data transfers. My test code is a simplified version of the Go code that shows the main issue: the linux_sendfile syscall returns EAGAIN as errno and -1 as result, BUT there's actually data being sent. This is only the case when the TCP socket is marked as non-blocking. At EAGAIN the Go library code retries until success, which in practice means that much more data is sent than expected and pieces of data are repeated. Every time I execute the test, the output is different. My attached test code exercises this: % go build main.go % ./main & % nc -v localhost 4040 > testfile I'm stopping the loop at EAGAIN directly, but normally it would retry a number of times, every time a bit of data actually being sent, but the sender not knowing about it. The Linux sendfile man page indicates that sendfile can return early without having sent all data, but this isn't an error. So either linux_sendfile should return the amount of bytes sent and no error, OR not actually send any data and return EAGAIN and -1.
On main I compiled your program with: $ env GOOS=linux go build main.go $ brandelf -t Linux main then ran it with some dummy data: $ date > file2 $ ./main & $ nc -v localhost 4040 > testfile Entering loop, remain = 29 syscall returned n=29, err=%!s(<nil>) Sendfile syscall success, written: 29, remain: 0 Wrote 29 bytes, handled: %!s(bool=true)Connection to localhost 4040 port [tcp/*] succeeded! $ cat testfile Mon Nov 4 13:54:57 EST 2024 $ So something is different. Maybe something to do with the size of "file2"? Or the problem is in 14.1 but not FreeBSD-main. Or I don't quite understand the expected behaviour.
I used a test file of 1MB size; every time the results were different depending on the available buffers, e.g. +/- 140KByte was sent in the first sendfile call that returned EAGAIN. The file should be big enough to be able to fill buffers I think.
(In reply to pieter from comment #2) I see now, thanks. > So either linux_sendfile should return the amount of bytes sent and no error, OR not actually send any data and return EAGAIN and -1. Got it. I believe this patch will fix it, though I haven't tested it yet: https://reviews.freebsd.org/D47447
I tested the patch, but it's not 100% working yet. My test program first sends 147456 bytes (of 1MB), so it calls sendfile again with the remainder of 901120 bytes. Then sendfile returns with 0 bytes sent and errno not set: syscall returned n=147456, err=%!s(<nil>) Sendfile syscall success, written: 147456, remain: 901120 Entering loop, remain = 901120 syscall returned n=0, err=%!s(<nil>) I see the same kind of behavior in the standard Go net/http library, where a HTTP file download is cut off and the client retries to get the rest.
(In reply to pieter from comment #4) I updated and tested the patch and fixed this, thanks. Your test program still exits sometimes since it doesn't handle EAGAIN by retrying or blocking, but I believe the kernel's behaviour is now correct.
I had removed the exit call to further test indeed and also tested the initial patch with the Go http server library. Patched and built a new kernel with the updated patch, but haven't had time yet to test. Will do so asap and comment here. Thanks for the effort so far, it's highly appreciated!
(In reply to Mark Johnston from comment #5) I have now tested the current improved patch and can confirm that it resolved the issue for me. Thanks a lot! I'm not able to perform regression tests or test other aspects of the Linux sendfile implementation, but at least the test case and Go HTTP library now function as expected.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=a43b745aaf4f5bbc96875d2ab3ec9bea8024eda4 commit a43b745aaf4f5bbc96875d2ab3ec9bea8024eda4 Author: Mark Johnston <markj@FreeBSD.org> AuthorDate: 2024-11-13 14:15:47 +0000 Commit: Mark Johnston <markj@FreeBSD.org> CommitDate: 2024-11-13 14:15:47 +0000 linux sendfile: Fix handling of non-blocking sockets FreeBSD sendfile() may perform a partial transfer and return EAGAIN if the socket is non-blocking. Linux sendfile() expects no error in this case, so squash EAGAIN. PR: 282495 Tested by: pieter@krikkit.xyz MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D47447 sys/compat/linux/linux_socket.c | 7 +++++++ 1 file changed, 7 insertions(+)
Thank you for testing.
A commit in branch stable/14 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=91a80544737486dbe5bbae28c05c2a7a654cb61f commit 91a80544737486dbe5bbae28c05c2a7a654cb61f Author: Mark Johnston <markj@FreeBSD.org> AuthorDate: 2024-11-13 14:15:47 +0000 Commit: Mark Johnston <markj@FreeBSD.org> CommitDate: 2024-11-20 21:41:08 +0000 linux sendfile: Fix handling of non-blocking sockets FreeBSD sendfile() may perform a partial transfer and return EAGAIN if the socket is non-blocking. Linux sendfile() expects no error in this case, so squash EAGAIN. PR: 282495 Tested by: pieter@krikkit.xyz MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D47447 (cherry picked from commit a43b745aaf4f5bbc96875d2ab3ec9bea8024eda4) sys/compat/linux/linux_socket.c | 7 +++++++ 1 file changed, 7 insertions(+)
Thanks for the report and for testing.