Field | Value |
---|---|
Summary: | linux(4): linux_recvfrom(2) fails: linux_recvfrom -1 errno -22 Invalid argument |
Product: | Base System |
Component: | kern |
Status: | Closed FIXED |
Severity: | Affects Some People |
Priority: | --- |
Version: | 12.2-RELEASE |
Hardware: | amd64 |
OS: | Any |
Reporter: | Jason Mader <jasonmader> |
Assignee: | Dmitry Chagin <dchagin> |
CC: | dchagin, emaste, trasz |
Keywords: | needs-qa |
Flags: | koobs: maintainer-feedback? (trasz); koobs: mfc-stable13?; koobs: mfc-stable12?; koobs: mfc-stable11- |
Bug Depends on: | |
Bug Blocks: | 247219 |
Attachments: | |
Description
Jason Mader
2021-10-23 14:11:57 UTC
CC trasz@, but I expect we'll need more detail to have a chance of making progress here.

(In reply to Ed Maste from comment #1)
Of course. Let me know what I can do to provide more detail. ktrace was the only thing I could think of so far to see why the binaries weren't working.

@Jason Can you detail the relevant daemon and utility programs, along with steps to reproduce and their upstream source repository links (if available)?

(In reply to Kubilay Kocak from comment #3)
These are the Reprise software license manager and utility program for Linux x86_64, so the sources aren't available and will only work with a file for a specific system.

I first noticed the problem when trying to exit the license server with `rlmutil rlmdown RLM -q`:

Read error from network (-105)
Timeout on read() (comm: -13)
Operation now in progress (errno: 115)

This software does work on FreeBSD 11.2, set up very similarly, with the license manager process running in a jail. One thing that is different is that on FreeBSD 11.2 I am using an IP alias in the jail, but now I am using epair and bridge (because of [Bug 258949] /32 netmask doesn't work with an alias in FreeBSD 13.0). Of note, there are 4 other Linux x86_64 license managers working properly in the same jail.
This is how it used to behave in 11.2:

59822 rlm CALL linux_select(0x4000,0x7323e8,0,0,0x7fffffffcdd0)
59822 rlm RET linux_select 1
59822 rlm CALL linux_recvfrom(0x4,0x731704,0x6,0x4000,0x72c94c,0x7fffffffcdcc)
59822 rlm GIO fd 4 read 6 bytes
      0x0000 0100 8e00 008f |......|
59822 rlm RET linux_recvfrom 6
59822 rlm CALL linux_select(0x4000,0x72c148,0,0,0x7fffffffcdd0)
59822 rlm RET linux_select 1
59822 rlm CALL linux_recvfrom(0x4,0x73170a,0x8e,0x4000,0x72c94c,0x7fffffffcdcc)
59822 rlm GIO fd 4 read 142 bytes
59822 rlm RET linux_recvfrom 142/0x8e

For reference, the recvfrom() prototype and the arguments of the first call above:

ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags,
    struct sockaddr *src_addr, socklen_t *addrlen);

0x4 0x731704 0x6 0x4000 0x72c94c 0x7fffffffcdcc

This is the first recvfrom error on 13.0 that matches the above 6-byte read; it looks like it is just getting a size for the next message:

35514 rlm CALL linux_select(0x4000,0x85e6d8,0,0,0x7fffffffddc0)
35514 rlm RET linux_select 1
35514 rlm CALL linux_recvfrom(0x4,0x85d914,0x6,0x4000,0x861c7c,0x7fffffffddf0)
35514 rlm GIO fd 4 read 6 bytes
      0x0000 0100 b900 fdb7 |......|
35514 rlm RET linux_recvfrom -1 errno -22 Invalid argument

If there is any way to find out more detail on why this linux_recvfrom() fails, please let me know and I'll provide that information.

I've been testing releases, and have found that these Linux binaries worked as expected in FreeBSD 11.4 and 12.0, but not in FreeBSD 12.1 and later.

(In reply to Jason Mader from comment #6)
Sorry, I made a mistake: this works in FreeBSD 12.1 as well; my problem begins in FreeBSD 12.2. linux_socket.c also changed significantly between 12.1 and 12.2.

I reverted linux_recvfrom() in FreeBSD 12.3-BETA3 to the FreeBSD 12.1 version (adding the dependent functions linux_sa_put() and the prior version of bsd_to_linux_sockaddr()), and the Linux binaries work in FreeBSD 12.3-BETA3. I'm not yet sure where exactly the problem is, though.

Created attachment 229379 [details]
revert linux_recvfrom() in linux_socket.c
After adding some debugging statements into linux_recvfrom(), I found that the error happens here,
error = kern_recvit(td, args->s, &msg, UIO_SYSSPACE, NULL);
if (error != 0)
goto out;
The value in error that is returned is: 54
I'm attaching a diff that reverts FreeBSD 12.3-BETA3 linux_socket.c to 12.1 and works for the Linux binaries, though I don't yet understand what the critical difference is to linux_recvfrom().
(In reply to Jason Mader from comment #9)
In the FreeBSD 12.2+ linux_recvfrom(), the problem seems to be at:

error = bsd_to_linux_sockaddr(sa, &lsa, msg.msg_namelen);

The old bsd_to_linux_sockaddr((struct sockaddr *)PTRIN(args->from)) returns 0, but the new bsd_to_linux_sockaddr(sa, &lsa, msg.msg_namelen) returns 22.

(In reply to Jason Mader from comment #10)
When linux_recvfrom() calls kern_recvit(), the value of msg.msg_namelen is 28; after the call it is 0. The kern_recvit() source didn't change, but bsd_to_linux_sockaddr() did. Prior to FreeBSD 12.2, bsd_to_linux_sockaddr() didn't check the value of msg.msg_namelen (passed as len). Now it does:

if (len < 2 || len > UCHAR_MAX)
	return (EINVAL);

I am currently working around this with:

--- linux_socket.c
+++ linux_socket.c
@@ -926,10 +926,10 @@
 		goto out;
 	if (PTRIN(args->from) != NULL) {
-		error = bsd_to_linux_sockaddr(sa, &lsa, msg.msg_namelen);
+		error = bsd_to_linux_sockaddr(sa, &lsa, fromlen);
 		if (error == 0)
 			error = copyout(lsa, PTRIN(args->from),
-			    msg.msg_namelen);
+			    fromlen);
 		free(lsa, M_SONAME);
 	}

Thanks for investigating this! Would it be possible for you to print out the values of both 'msg.msg_namelen' and 'fromlen' when this happens?
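For anyone following along, the effect of the new length check can be reproduced outside the kernel. The following is only a user-space sketch: `sockaddr_len_check()` is a hypothetical name, and it models nothing but the early return quoted above, not the address conversion that the real bsd_to_linux_sockaddr() performs.

```c
#include <errno.h>
#include <limits.h>
#include <sys/socket.h>

/*
 * Hypothetical stand-alone model of the length check quoted above.
 * It reproduces only the early return; the real
 * bsd_to_linux_sockaddr() goes on to convert the address.
 */
int
sockaddr_len_check(socklen_t len)
{
	/* A length of 0 (no source address was produced) is rejected here. */
	if (len < 2 || len > UCHAR_MAX)
		return (EINVAL);
	return (0);
}
```

With the length of 0 that kern_recvit() leaves in msg.msg_namelen, this check yields EINVAL, which is consistent with the "errno -22 Invalid argument" in the ktrace output. With the caller-supplied 28 it passes, which would explain why substituting fromlen hides the failure.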
(In reply to Edward Tomasz Napierala from comment #12)
I changed linux_recvfrom() in linux_socket.c from

if (PTRIN(args->from) != NULL) {
	error = linux_copyout_sockaddr(sa, PTRIN(args->from), msg.msg_namelen);

to

if (PTRIN(args->from) != NULL) {
	printf("msg_namelen: %d, fromlen: %d\n", msg.msg_namelen, fromlen);
	error = linux_copyout_sockaddr(sa, PTRIN(args->from), fromlen);
}

and got:

linux: jid 1 pid 77110 (rlmutil): unsupported socket(AF_NETLINK, 3, NETLINK_ROUTE)
msg_namelen: 0, fromlen: 28
msg_namelen: 0, fromlen: 28
msg_namelen: 0, fromlen: 28
msg_namelen: 0, fromlen: 28
msg_namelen: 0, fromlen: 28
msg_namelen: 0, fromlen: 28
msg_namelen: 0, fromlen: 28
msg_namelen: 0, fromlen: 28

None of the other clients connecting to their servers take the "PTRIN(args->from) != NULL" branch, though, so there is no output for them, which is why they all work without the workaround.

(In reply to Jason Mader from comment #11)
Hi, could you please show how the socket is created, i.e. find the linux_socket(..) line? I'm interested in the domain, type and proto values, as not all protocols have the PR_ADDR flag.

(In reply to Dmitry Chagin from comment #14)
Hopefully this is what you're looking for:

69131 rlm CALL linux_socket(0x1,0x80801,0)
69131 rlm RET linux_socket 3
69131 rlm CALL linux_socket(0x1,0x80801,0)
69131 rlm RET linux_socket 3
69131 rlm CALL linux_socket(0xa,0x1,0)
69131 rlm RET linux_socket -1 errno -93 Protocol not supported
69131 rlm CALL linux_socket(0x2,0x1,0)
69131 rlm RET linux_socket 3
69131 rlm CALL linux_socket(0xa,0x1,0)
69131 rlm RET linux_socket -1 errno -93 Protocol not supported
69131 rlm CALL linux_socket(0x2,0x1,0)
69131 rlm RET linux_socket 4

(In reply to Jason Mader from comment #15)
Yes, the last call to linux_socket, i.e.

69131 rlm CALL linux_socket(0x2,0x1,0)
69131 rlm RET linux_socket 4

together with

35514 rlm CALL linux_recvfrom(0x4,0x85d914,0x6,0x4000,0x861c7c,0x7fffffffddf0)
35514 rlm GIO fd 4 read 6 bytes
      0x0000 0100 b900 fdb7 |......|
35514 rlm RET linux_recvfrom -1 errno -22 Invalid argument

means that a connection-oriented protocol is used, so msg.msg_namelen is set to 0. I'll prepare a simple patch soon.

Created attachment 232824 [details]
prevent copying of uninitialized source address
try this one please
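The behaviour described in this thread (a connection-oriented socket yields no source address, so the reported length differs from what the caller passed in) can be observed from user space. What follows is a rough sketch, not the attached patch: `probe_connected_recvfrom()` and `should_copy_source_address()` are hypothetical names, and an AF_UNIX stream socketpair stands in for the TCP socket used by rlm, on the assumption that it behaves the same way in this respect.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Result of one recvfrom() on a connected stream socket. */
struct recv_probe {
	ssize_t   nbytes;       /* bytes returned by recvfrom(), or -1 */
	socklen_t fromlen_in;   /* address buffer size passed in */
	socklen_t fromlen_out;  /* length the kernel reported back */
};

/*
 * Hypothetical helper: send a 6-byte message over a connected
 * socketpair and read it back with recvfrom(), supplying an address
 * buffer the way the rlm binary does.
 */
struct recv_probe
probe_connected_recvfrom(void)
{
	static const unsigned char hdr[6] =
	    { 0x01, 0x00, 0xb9, 0x00, 0xfd, 0xb7 };
	struct recv_probe r = { -1, 0, 0 };
	struct sockaddr_storage from;
	unsigned char buf[6];
	int sv[2];

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
		return (r);

	memset(&from, 0, sizeof(from));
	r.fromlen_in = r.fromlen_out = sizeof(from);

	if (write(sv[0], hdr, sizeof(hdr)) == (ssize_t)sizeof(hdr))
		r.nbytes = recvfrom(sv[1], buf, sizeof(buf), 0,
		    (struct sockaddr *)&from, &r.fromlen_out);

	close(sv[0]);
	close(sv[1]);
	return (r);
}

/*
 * Hypothetical guard in the spirit of the patch title: only copy a
 * source address out when the caller asked for one and one exists.
 */
int
should_copy_source_address(const void *from, socklen_t namelen)
{
	return (from != NULL && namelen != 0);
}
```

On the systems I would expect this to run on, fromlen_out typically comes back as 0 even though a full-size buffer was offered, so a guard of this shape skips the address conversion instead of failing the whole recvfrom() call.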
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=bb46e9b5107fd8763742f7e55b66ea2e574f5815

commit bb46e9b5107fd8763742f7e55b66ea2e574f5815
Author: Dmitry Chagin <dchagin@FreeBSD.org>
AuthorDate: 2022-04-11 20:29:45 +0000
Commit: Dmitry Chagin <dchagin@FreeBSD.org>
CommitDate: 2022-04-11 20:29:45 +0000

    linux(4): Prevent an attempt to copy an uninitialized source address.

    PR: 259380
    MFC after: 3 days

 sys/compat/linux/linux_socket.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=d33fba34ab195d7d8bb5a8daabdad39ec7d9f4c4

commit d33fba34ab195d7d8bb5a8daabdad39ec7d9f4c4
Author: Dmitry Chagin <dchagin@FreeBSD.org>
AuthorDate: 2022-04-11 20:29:45 +0000
Commit: Dmitry Chagin <dchagin@FreeBSD.org>
CommitDate: 2022-06-17 19:33:50 +0000

    linux(4): Prevent an attempt to copy an uninitialized source address.

    PR: 259380
    MFC after: 3 days

    (cherry picked from commit bb46e9b5107fd8763742f7e55b66ea2e574f5815)

 sys/compat/linux/linux_socket.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)