Created attachment 235121 [details] Extraced "checking read/send-side pipe system..." C program Hello, recent changes in CURRENT causes the Dante socks client to fail. Usually this could be fixed simply by recompiling and reinstalling net/dante. This time it failed because the configure script stops at "checking read/send-side pipe system..." Investigation shows that the C program executed loops forever. Trying to bisect CURRENT stops for me at 716fd348e01c5f2ba125f878a634a753436c2994 because the boot zpool was upgraded meanwhile and an older CURRENT version can not boot from such a pool. 716fd348e01c5f2ba125f878a634a753436c2994 does expose the new (loop) behavior as well, so the cause for the endless loop must be earlier. Adding some additional fprintf's showed that the C program does not exit the wile(1)-loop in sendtest() at the first call of sendtest(). Ralf
Can you double-check the code? I think a clowing curly bracket is missing in the function sendtest...
Created attachment 235123 [details] Extraced "checking read/send-side pipe system..." C program Restored missing "}" line.
(In reply to Martin Waschbüsch from comment #1) While removing some of my added fprintf's I accidentally removed one line too much. Restored from original source. Sorry.
Gleb, this looks like it's probably some fallout from the recent PF_LOCAL/DGRAM changes?
This is the same regression that was fixed for send(2) in d2b3a0ed31ef6270c712d0779a95bb222df88909. Now I have the same problem with write(2). Thinking on how to fix that without introducing socket layer knowledge to file level.
(In reply to Gleb Smirnoff from comment #5) Can't it live in soo_write()?
Like fixup uio_resid in case of a transient error?
I have an idea...
Created attachment 235364 [details] Suggested patch Can you please try this out? Mark, what's your opinion?
(In reply to Gleb Smirnoff from comment #9) I see. Since we already have a layering violation there I guess it's not a major sin to extend it a bit. I'd move the PR_ATOMIC+resid check into a subroutine, to be shared with kern_sendit(). In dofilewrite(), I think the code is too concise; IMHO the following is clearer: error = fo_write(...); if (error != 0 && fp->f_type != DTYPE_SOCKET) { ... } Overall LGTM though.
Ended up with a bigger change https://reviews.freebsd.org/D35863
(In reply to Gleb Smirnoff from comment #9, comment #11) Applied the patches from review D35863 to main-n256821-102f31bf36ed and tried Dantes make configure on that. Configure terminates again, so the triggered bug is fixed with those patches applied. Thank you. Ralf
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=7a2c93b86ef75390a60a4b4d6e3911b36221dfbe commit 7a2c93b86ef75390a60a4b4d6e3911b36221dfbe Author: Gleb Smirnoff <glebius@FreeBSD.org> AuthorDate: 2022-12-14 18:02:44 +0000 Commit: Gleb Smirnoff <glebius@FreeBSD.org> CommitDate: 2022-12-14 18:02:44 +0000 sockets: provide sousrsend() that does socket specific error handling Sockets have special handling for EPIPE on a write, that was spread out into several places. Treating transient errors is also special - if protocol is atomic, than we should ignore any changes to uio_resid, a transient error means the write had completely failed (see d2b3a0ed31e). - Provide sousrsend() that expects a valid uio, and leave sosend() for kernel consumers only. Do all special error handling right here. - In dofilewrite() don't do special handling of error for DTYPE_SOCKET. - For send(2), write(2) and aio_write(2) call into sousrsend() and remove error handling for kern_sendit(), soo_write() and soaio_process_job(). PR: 265087 Reported by: rz-rpi03 at h-ka.de Reviewed by: markj Differential revision: https://reviews.freebsd.org/D35863 sys/kern/sys_generic.c | 10 +++++++--- sys/kern/sys_socket.c | 17 +++-------------- sys/kern/uipc_socket.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++++ sys/kern/uipc_syscalls.c | 16 +--------------- sys/sys/socketvar.h | 2 ++ 5 files changed, 62 insertions(+), 32 deletions(-)
Committed to main branch. Sorry for the delay.