Bug 259380

Summary: linux(4): linux_recvfrom(2) fails: linux_recvfrom -1 errno -22 Invalid argument
Product: Base System
Reporter: Jason Mader <jasonmader>
Component: kern
Assignee: Dmitry Chagin <dchagin>
Status: Closed FIXED
Severity: Affects Some People
CC: dchagin, emaste, trasz
Priority: ---
Keywords: needs-qa
Version: 12.2-RELEASE
Flags: koobs: maintainer-feedback? (trasz)
koobs: mfc-stable13?
koobs: mfc-stable12?
koobs: mfc-stable11-
Hardware: amd64
OS: Any
Bug Depends on:
Bug Blocks: 247219
Attachments (description, flags):
revert linux_recvfrom() in linux_socket.c (flags: none)
prevent copying of uninitialized source address (flags: none)

Description Jason Mader 2021-10-23 14:11:57 UTC
In a FreeBSD 13.0 jail with Linux compatibility, a Linux daemon and its companion utility program do not communicate properly. They used to work in FreeBSD 11.2. Here is a ktrace of the daemon receiving part of a message; the same failure occurs on every 6-byte read. I am guessing the problem is "linux_recvfrom -1 errno -22 Invalid argument".


 92539 rlm      CALL  linux_select(0x4000,0x85e6d8,0,0,0x7fffffffddd0)
 92539 rlm      RET   linux_select 1
 92539 rlm      CALL  linux_recvfrom(0x4,0x85d914,0x6,0x4000,0x861c7c,0x7fffffffde00)
 92539 rlm      GIO   fd 4 read 6 bytes
       ",2,5,0"
 92539 rlm      RET   linux_recvfrom -1 errno -22 Invalid argument
 92539 rlm      CALL  linux_time(0x7fffffffdf38)
Comment 1 Ed Maste freebsd_committer freebsd_triage 2021-10-25 14:53:28 UTC
CC trasz@, but I expect we'll need more detail to have a chance of making progress here.
Comment 2 Jason Mader 2021-10-26 06:38:16 UTC
(In reply to Ed Maste from comment #1)
Of course. Let me know what I can do to provide more detail. ktrace was the only thing I could think of so far to see why the binaries weren't working.
Comment 3 Kubilay Kocak freebsd_committer freebsd_triage 2021-10-26 23:43:17 UTC
@Jason Can you detail the relevant daemon and utility programs, along with steps to reproduce and their upstream source repository links (if available)?
Comment 4 Jason Mader 2021-10-27 16:00:07 UTC
(In reply to Kubilay Kocak from comment #3)
These are the Reprise software license manager and utility program for Linux x86_64, so the sources aren't available, and they will only work with a file for a specific system. I first noticed the problem when trying to shut down the license server with `rlmutil rlmdown RLM -q`:

Read error from network (-105)
Timeout on read() (comm: -13)Operation now in progress (errno: 115)

This software does work on FreeBSD 11.2 with a very similar setup, with the license manager process running in a jail. One difference is that on FreeBSD 11.2 I am using an IP alias in the jail, but now I am using epair and bridge. (Because [Bug 258949] /32 netmask doesn't work with an alias in FreeBSD 13.0) Of note, there are 4 other Linux x86_64 license managers working properly in the same jail.
Comment 5 Jason Mader 2021-11-03 20:33:04 UTC
This is how it used to behave in 11.2:

 59822 rlm      CALL  linux_select(0x4000,0x7323e8,0,0,0x7fffffffcdd0)
 59822 rlm      RET   linux_select 1
 59822 rlm      CALL  linux_recvfrom(0x4,0x731704,0x6,0x4000,0x72c94c,0x7fffffffcdcc)
 59822 rlm      GIO   fd 4 read 6 bytes
       0x0000 0100 8e00 008f                                                                                       |......|

 59822 rlm      RET   linux_recvfrom 6
 59822 rlm      CALL  linux_select(0x4000,0x72c148,0,0,0x7fffffffcdd0)
 59822 rlm      RET   linux_select 1
 59822 rlm      CALL  linux_recvfrom(0x4,0x73170a,0x8e,0x4000,0x72c94c,0x7fffffffcdcc)
 59822 rlm      GIO   fd 4 read 142 bytes
 59822 rlm      RET   linux_recvfrom 142/0x8e

ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags, struct sockaddr *src_addr, socklen_t *addrlen);
                        0x4   0x731704         0x6     0x4000                   0x72c94c      0x7fffffffcdcc

This is the first recvfrom error on 13.0 that matches the above 6-byte read, which looks like it is just fetching the size of the next message:

 35514 rlm      CALL  linux_select(0x4000,0x85e6d8,0,0,0x7fffffffddc0)
 35514 rlm      RET   linux_select 1
 35514 rlm      CALL  linux_recvfrom(0x4,0x85d914,0x6,0x4000,0x861c7c,0x7fffffffddf0)
 35514 rlm      GIO   fd 4 read 6 bytes
       0x0000 0100 b900 fdb7                                                                                       |......|

 35514 rlm      RET   linux_recvfrom -1 errno -22 Invalid argument

If there is any way to find out more detail on why this linux_recvfrom() fails, please let me know and I'll provide that information.
Comment 6 Jason Mader 2021-11-07 16:49:55 UTC
I’ve been testing releases, and have found that these Linux binaries worked as expected in FreeBSD 11.4 and 12.0; but not FreeBSD 12.1 and later.
Comment 7 Jason Mader 2021-11-07 17:19:20 UTC
(In reply to Jason Mader from comment #6)
Sorry, I made a mistake, this works in FreeBSD 12.1 as well; my problem begins in FreeBSD 12.2.

linux_socket.c also changed significantly between 12.1 and 12.2.
Comment 8 Jason Mader 2021-11-08 19:44:14 UTC
I reverted linux_recvfrom() in FreeBSD 12.3-BETA3 to the FreeBSD 12.1 version (adding the dependent functions linux_sa_put() and the prior version of bsd_to_linux_sockaddr()), and the Linux binaries work in FreeBSD 12.3-BETA3. I'm not yet sure exactly where the problem is, though.
Comment 9 Jason Mader 2021-11-09 10:55:40 UTC
Created attachment 229379 [details]
revert linux_recvfrom() in linux_socket.c

After adding some debugging statements into linux_recvfrom(), I found that the error happens here,

	error = kern_recvit(td, args->s, &msg, UIO_SYSSPACE, NULL);
	if (error != 0)
		goto out;

The value in error that is returned is: 54

I'm attaching a diff that reverts FreeBSD 12.3-BETA3 linux_socket.c to 12.1 and works for the Linux binaries, though I don't yet understand what the critical difference is to linux_recvfrom().
Comment 10 Jason Mader 2021-11-11 16:14:26 UTC
(In reply to Jason Mader from comment #9)
In the FreeBSD 12.2+ linux_recvfrom(), the problem seems to be at:

	error = bsd_to_linux_sockaddr(sa, &lsa, msg.msg_namelen);

The old bsd_to_linux_sockaddr((struct sockaddr *)PTRIN(args->from)) is returning 0, but the new bsd_to_linux_sockaddr(sa, &lsa, msg.msg_namelen) is returning 22.
Comment 11 Jason Mader 2021-11-12 00:46:33 UTC
(In reply to Jason Mader from comment #10)
When linux_recvfrom() calls kern_recvit() the value of msg.msg_namelen is 28, and after the call it is 0.

kern_recvit() source didn't change, but bsd_to_linux_sockaddr() did. Prior to FreeBSD 12.2, bsd_to_linux_sockaddr() didn't check the value of msg.msg_namelen (as len). Now it does,

	if (len < 2 || len > UCHAR_MAX)
		return (EINVAL);

I am currently working around this with,

--- linux_socket.c
+++ linux_socket.c
@@ -926,10 +926,10 @@
 		goto out;

 	if (PTRIN(args->from) != NULL) {
-		error = bsd_to_linux_sockaddr(sa, &lsa, msg.msg_namelen);
+		error = bsd_to_linux_sockaddr(sa, &lsa, fromlen);
 		if (error == 0)
 			error = copyout(lsa, PTRIN(args->from),
-			    msg.msg_namelen);
+			    fromlen);
 		free(lsa, M_SONAME);
 	}
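
The failure described in this comment can be reproduced in isolation. The following is a minimal userspace sketch, not the kernel code: check_sockaddr_len() is a hypothetical helper that contains only the length check quoted above, and the values 0 and 28 are the msg.msg_namelen and fromlen values observed in this bug.

```c
#include <errno.h>
#include <limits.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical stand-in for the length validation quoted above from the
 * FreeBSD 12.2+ bsd_to_linux_sockaddr().  It rejects any address length
 * shorter than 2 bytes or longer than UCHAR_MAX.
 */
static int
check_sockaddr_len(socklen_t len)
{
	if (len < 2 || len > UCHAR_MAX)
		return (EINVAL);
	return (0);
}
```

Calling check_sockaddr_len(0) models the failing path, where kern_recvit() has reset msg.msg_namelen to 0, and yields EINVAL. Calling check_sockaddr_len(28) models the workaround that passes fromlen instead, and yields 0.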
Comment 12 Edward Tomasz Napierala freebsd_committer freebsd_triage 2021-11-16 13:07:40 UTC
Thanks for investigating this!  Would it be possible for you to print out the value for both 'msg.msg_namelen' and 'fromlen' when this happens?
Comment 13 Jason Mader 2021-11-16 17:30:15 UTC
(In reply to Edward Tomasz Napierala from comment #12)
I changed linux_socket.c linux_recvfrom() from,

        if (PTRIN(args->from) != NULL) {
                error = linux_copyout_sockaddr(sa, PTRIN(args->from), msg.msg_namelen);

to,
        if (PTRIN(args->from) != NULL) {
                printf("msg_namelen: %d, fromlen: %d\n", msg.msg_namelen, fromlen);
                error = linux_copyout_sockaddr(sa, PTRIN(args->from), fromlen);
        }

And got,

linux: jid 1 pid 77110 (rlmutil): unsupported socket(AF_NETLINK, 3, NETLINK_ROUTE)
msg_namelen: 0, fromlen: 28
msg_namelen: 0, fromlen: 28
msg_namelen: 0, fromlen: 28
msg_namelen: 0, fromlen: 28
msg_namelen: 0, fromlen: 28
msg_namelen: 0, fromlen: 28
msg_namelen: 0, fromlen: 28
msg_namelen: 0, fromlen: 28

None of the other clients connecting to their servers pass a non-NULL address, i.e. "(PTRIN(args->from) != NULL)" is never true for them, so they produce no output here. That is also why they all work without the workaround.
Comment 14 Dmitry Chagin freebsd_committer freebsd_triage 2022-03-30 16:30:08 UTC
(In reply to Jason Mader from comment #11)

Hi, could you please show how the socket is created, i.e. find the linux_socket(..) line in the trace?
I'm interested in the domain, type and proto values, as not all protocols have the PR_ADDR flag.
Comment 15 Jason Mader 2022-03-30 16:48:03 UTC
(In reply to Dmitry Chagin from comment #14)
Hopefully this is what you're looking for,
 69131 rlm      CALL  linux_socket(0x1,0x80801,0)
 69131 rlm      RET   linux_socket 3
 69131 rlm      CALL  linux_socket(0x1,0x80801,0)
 69131 rlm      RET   linux_socket 3
 69131 rlm      CALL  linux_socket(0xa,0x1,0)
 69131 rlm      RET   linux_socket -1 errno -93 Protocol not supported
 69131 rlm      CALL  linux_socket(0x2,0x1,0)
 69131 rlm      RET   linux_socket 3
 69131 rlm      CALL  linux_socket(0xa,0x1,0)
 69131 rlm      RET   linux_socket -1 errno -93 Protocol not supported
 69131 rlm      CALL  linux_socket(0x2,0x1,0)
 69131 rlm      RET   linux_socket 4
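
For readers decoding the raw arguments in this trace, here is a small sketch. It assumes the Linux x86_64 ABI values for the domain and type arguments, hard-coded because the FreeBSD headers define different numbers. The names decode_linux_socket_type() and linux_domain_name() are hypothetical helpers written for this illustration.

```c
#include <stdbool.h>

/* Linux x86_64 ABI values (assumed here; not taken from FreeBSD headers). */
enum {
	LINUX_AF_UNIX = 1,
	LINUX_AF_INET = 2,
	LINUX_AF_INET6 = 10,
	LINUX_SOCK_STREAM = 1,
	LINUX_SOCK_TYPE_MASK = 0xf,
	LINUX_SOCK_NONBLOCK = 0x800,
	LINUX_SOCK_CLOEXEC = 0x80000
};

struct linux_socket_type {
	int base;	/* socket type with the flag bits masked off */
	bool nonblock;	/* SOCK_NONBLOCK was requested */
	bool cloexec;	/* SOCK_CLOEXEC was requested */
};

/* Split the second linux_socket() argument into base type and flags. */
static struct linux_socket_type
decode_linux_socket_type(int type)
{
	struct linux_socket_type t;

	t.base = type & LINUX_SOCK_TYPE_MASK;
	t.nonblock = (type & LINUX_SOCK_NONBLOCK) != 0;
	t.cloexec = (type & LINUX_SOCK_CLOEXEC) != 0;
	return (t);
}

/* Map the first linux_socket() argument to a readable name. */
static const char *
linux_domain_name(int domain)
{
	switch (domain) {
	case LINUX_AF_UNIX:
		return ("AF_UNIX");
	case LINUX_AF_INET:
		return ("AF_INET");
	case LINUX_AF_INET6:
		return ("AF_INET6");
	default:
		return ("unknown");
	}
}
```

Under these assumptions, linux_socket(0x1,0x80801,0) is an AF_UNIX stream socket with the non-blocking and close-on-exec flags set, linux_socket(0xa,0x1,0) is an AF_INET6 stream socket, and linux_socket(0x2,0x1,0) is an AF_INET stream socket.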
Comment 16 Dmitry Chagin freebsd_committer freebsd_triage 2022-03-30 16:57:09 UTC
(In reply to Jason Mader from comment #15)
yes, last call to linux_socket, ie

 69131 rlm      CALL  linux_socket(0x2,0x1,0)
 69131 rlm      RET   linux_socket 4

and 

 35514 rlm      CALL  linux_recvfrom(0x4,0x85d914,0x6,0x4000,0x861c7c,0x7fffffffddf0)
 35514 rlm      GIO   fd 4 read 6 bytes
       0x0000 0100 b900 fdb7                                                                                       |......|

 35514 rlm      RET   linux_recvfrom -1 errno -22 Invalid argument

mean that a connection-oriented protocol is used, so msg.msg_namelen is set to 0.
I'll prepare a simple patch soon.
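
The behaviour described here can be observed from userspace with a short sketch. It is an illustration only: it uses an AF_UNIX socketpair(2) as a stand-in for the connected AF_INET stream socket in the trace, on the assumption that both are connection-oriented, and recvfrom_connected_stream() is a hypothetical helper name. It is not the patch attached to this bug.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Send 6 bytes over a connected stream socket pair and read them back
 * with recvfrom(2), passing a non-NULL source address buffer just as the
 * rlm daemon does.  Returns the byte count from recvfrom(2), or -1 on
 * error.  On success, *addrlen_out holds the address length the kernel
 * wrote back.
 */
static ssize_t
recvfrom_connected_stream(socklen_t *addrlen_out)
{
	static const char payload[6] = { 0, 0, 1, 0, 0, 0 };
	struct sockaddr_storage from;
	socklen_t fromlen = sizeof(from);
	char buf[sizeof(payload)];
	ssize_t n;
	int sv[2];

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	if (send(sv[0], payload, sizeof(payload), 0) !=
	    (ssize_t)sizeof(payload)) {
		close(sv[0]);
		close(sv[1]);
		return (-1);
	}
	memset(&from, 0, sizeof(from));
	n = recvfrom(sv[1], buf, sizeof(buf), 0,
	    (struct sockaddr *)&from, &fromlen);
	close(sv[0]);
	close(sv[1]);
	if (n >= 0)
		*addrlen_out = fromlen;
	return (n);
}
```

A caller can print *addrlen_out after a successful call. On a connection-oriented socket the kernel is not expected to return a source address, so the value is likely to be 0 rather than the buffer size that was passed in, though that is worth confirming on the system under test. Any code that converts or copies the address out therefore has to handle a zero length instead of treating it as an error.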
Comment 17 Dmitry Chagin freebsd_committer freebsd_triage 2022-03-30 17:18:53 UTC
Created attachment 232824 [details]
prevent copying of uninitialized source address

try this one please
Comment 18 commit-hook freebsd_committer freebsd_triage 2022-04-11 20:30:39 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=bb46e9b5107fd8763742f7e55b66ea2e574f5815

commit bb46e9b5107fd8763742f7e55b66ea2e574f5815
Author:     Dmitry Chagin <dchagin@FreeBSD.org>
AuthorDate: 2022-04-11 20:29:45 +0000
Commit:     Dmitry Chagin <dchagin@FreeBSD.org>
CommitDate: 2022-04-11 20:29:45 +0000

    linux(4): Prevent an attempt to copy an uninitialized source address.

    PR:                     259380
    MFC after:              3 days

 sys/compat/linux/linux_socket.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)
Comment 19 commit-hook freebsd_committer freebsd_triage 2022-06-17 19:42:18 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=d33fba34ab195d7d8bb5a8daabdad39ec7d9f4c4

commit d33fba34ab195d7d8bb5a8daabdad39ec7d9f4c4
Author:     Dmitry Chagin <dchagin@FreeBSD.org>
AuthorDate: 2022-04-11 20:29:45 +0000
Commit:     Dmitry Chagin <dchagin@FreeBSD.org>
CommitDate: 2022-06-17 19:33:50 +0000

    linux(4): Prevent an attempt to copy an uninitialized source address.

    PR:                     259380
    MFC after:              3 days

    (cherry picked from commit bb46e9b5107fd8763742f7e55b66ea2e574f5815)

 sys/compat/linux/linux_socket.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)