Bug 191586

Summary: FreeBSD doesn't validate negative edgecases in bind(2)/connect(2)/listen(2) like POSIX requires
Product: Base System Reporter: Enji Cooper <ngie>
Component: standardsAssignee: freebsd-standards (Nobody) <standards>
Status: Closed FIXED    
Severity: Affects Some People CC: emaste, kevlo, lambert.tr, markj, ngie
Priority: --- Flags: bugmeister: mfc-stable10?
bugmeister: mfc-stable9?
bugmeister: mfc-stable8?
Version: CURRENT   
Hardware: Any   
OS: Any   
See Also: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=191382

Description Enji Cooper freebsd_committer freebsd_triage 2014-07-03 19:03:35 UTC
POSIX specifies certain requirements for validating sa_family with bind(2) and connect(2):

From bind(2):

[EAFNOSUPPORT]
    The specified address is not a valid address for the address family of the specified socket.

...

[EINVAL]
    The address_len argument is not a valid length for the address family.

From connect [2]:

[EINVAL]
    The address_len argument is not a valid length for the address family; or invalid address family in the sockaddr structure.

From listen [3]:

[EDESTADDRREQ]
    The socket is not bound to a local address, and the protocol does not support listening on an unbound socket.

FreeBSD 10/stable does not appear to be conforming to POSIX:
- bind(2) doesn't appear to be validating the sa_family member at all (it doesn't set errno to either EAFNOSUPPORT or EINVAL).
- connect(2)'s not handling EINVAL properly, but the issue is being caught with EAFNOSUPPORT .
- listen(2)'s not handling EDESTADDRREQ.

I haven't tested out a kernel built INET, INET6, or SCTP support compiled out to ensure that these cases would be caught properly.

1. http://pubs.opengroup.org/onlinepubs/009695399/functions/bind.html
2. http://pubs.opengroup.org/onlinepubs/009695399/functions/connect.html
3. http://pubs.opengroup.org/onlinepubs/009695399/functions/listen.html

% uname -a
FreeBSD freebsd-10-x64.localdomain 10.0-STABLE FreeBSD 10.0-STABLE #2 r267849+810c962(stable/10): Wed Jun 25 14:23:41 PDT 2014     root@freebsd-10-x64.localdomain:/usr/obj/usr/src/sys/GENERIC-WITHOUT-WITNESS  amd64
% sysctl -Na | grep -q sctp && echo "has SCTP"
has SCTP
% sysctl kern.features. | grep 'inet'
kern.features.inet: 1
kern.features.inet6: 1
% cat /root/test_bogus_sa_family.c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <err.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
        struct sockaddr_in sai;
        int sock;

        sock = socket(AF_INET, SOCK_STREAM, 0);

        memset(&sai, 0, sizeof(sai));
        memset(&sai, 0, sizeof(sai));

        sai.sin_family = AF_MAX + 10;
        sai.sin_addr.s_addr = INADDR_ANY;
        sai.sin_port = 42000;

        if (bind(sock, (struct sockaddr*)&sai, sizeof(sai)) == 0) {
                warnx("bind did not fail with invalid sin_family as expected");
                if (listen(sock, 1) == 0) {
                        warnx("and is now listening for connections on an invalid address family!? (netstat output follows)");
                        system("netstat -a | grep LISTEN");
                } else
                        warn("but did fail when listen was called");
        } else if (errno != EAFNOSUPPORT)
                warn("bind did not fail with EAFNOSUPPORT");
        else
                printf("bind test passed");

        close(sock);

        sock = socket(AF_INET, SOCK_STREAM, 0);

        if (connect(sock, (struct sockaddr*)&sai, sizeof(sai)) == 0)
                warnx("connect did not fail with invalid sin_family as expected");
        else if (errno != EINVAL)
                warn("connect did not fail with EINVAL");
        else
                printf("connect test passed");

        close(sock);

        return (0);
}
% /root/test_bogus_sa_family
test_bogus_sa_family: bind did not fail with invalid sin_family as expected
test_bogus_sa_family: and is now listening for connections on an invalid address family!? (netstat output follows)
tcp4       0      0 *.4260                 *.*                    LISTEN
tcp4       0      0 localhost.smtp         *.*                    LISTEN
tcp4       0      0 *.ssh                  *.*                    LISTEN
tcp6       0      0 *.ssh                  *.*                    LISTEN
tcp6       0      0 *.nfsd                 *.*                    LISTEN
tcp4       0      0 *.nfsd                 *.*                    LISTEN
tcp4       0      0 *.mecomm               *.*                    LISTEN
tcp6       0      0 *.mecomm               *.*                    LISTEN
tcp4       0      0 *.sunrpc               *.*                    LISTEN
tcp6       0      0 *.sunrpc               *.*                    LISTEN
test_bogus_sa_family: connect did not fail with EINVAL: Address family not supported by protocol family
Comment 1 Enji Cooper freebsd_committer freebsd_triage 2014-07-03 19:04:11 UTC
(In reply to yaneurabeya from comment #0)
> POSIX specifies certain requirements for validating sa_family with bind(2)
> and connect(2):

...

and connect(2) -> connect(2), and listen(2).
Comment 2 Terry Lambert 2014-07-03 22:01:39 UTC
Some valid statements, some invalid.  You would have to configure the VSX4 tests correctly to expect the results that you'd get, but some of these are optional implement, while still being conformant.

For the interfaces in question, the relevant documents are:
http://pubs.opengroup.org/onlinepubs/009695399/functions/bind.html
http://pubs.opengroup.org/onlinepubs/009695399/functions/connect.html
http://pubs.opengroup.org/onlinepubs/009695399/functions/listen.html


Comments are by API:

bind(2):

The EAFNOSUPPORT is non-optional; however, the test code is bogus, in that it has to specify an existing address family, such as AF_INET, rather than a potentially loaded/pluggable address family.  Because this is a negative assertion test, it needs to hit on something that's actually guaranteed to be there, such as AF_INET r AF_UNIX.

The EINVAL in bind(2) is an optional error return: "The bind() function may fail if..."; this indicates that bounds checking of the length is not a requirement.  The rationale for this is to allow a larger-than-needed buffer to be used for a sockaddr and take it to a sockaddr_in as a void value, if needed, such that different values can be used (this is derived from the SVID III definition for the TLI implementation of separation of naming, from which the POSIX tests are originally derived).  You could (potentially) make a case for bounds checking for a known address family (not pluggable) on the basis of the decode of the sin_family/sin_addr tuple, but the standard does not require it.


connect(2):

The EINVAL is similarly an optional error return: "The connect() function may fail if"... same base rationale.

listen(2):

The EDESTADDRREQ is similarly non-optional; again, however, the test is bogus for its use of an out of range/undefined address family definition.

---

NB:

I believe Andrew and Neil would be open to giving a select group of FreeBSD developers access to the actual test suite, perhaps later this year, assuming additional discussion and closed access to the test suite to a select group.

It should very much be noted that in any conflict between the test suite and the standard, the test suite is assumed to be more correct than the actual standard, unless you file a TSD (Test Suite Deficiency) report, and The Open Group and the Austin Group agree that a test suite change is warranted by the rationale for the report.
Comment 3 Kevin Lo freebsd_committer freebsd_triage 2014-07-04 16:19:24 UTC
(In reply to Terry Lambert from comment #2)
> Some valid statements, some invalid.  You would have to configure the VSX4
> tests correctly to expect the results that you'd get, but some of these are
> optional implement, while still being conformant.
> 
> For the interfaces in question, the relevant documents are:
> http://pubs.opengroup.org/onlinepubs/009695399/functions/bind.html
> http://pubs.opengroup.org/onlinepubs/009695399/functions/connect.html
> http://pubs.opengroup.org/onlinepubs/009695399/functions/listen.html
> 
> 
> Comments are by API:
> 
> bind(2):
> 
> The EAFNOSUPPORT is non-optional; however, the test code is bogus, in that
> it has to specify an existing address family, such as AF_INET, rather than a
> potentially loaded/pluggable address family.  Because this is a negative
> assertion test, it needs to hit on something that's actually guaranteed to
> be there, such as AF_INET r AF_UNIX.

[snip]

Here's the diff that enables check family for bind(2), ok?

Index: sys/netinet/in_pcb.c
===================================================================
--- sys/netinet/in_pcb.c        (revision 268261)
+++ sys/netinet/in_pcb.c        (working copy)
@@ -528,14 +528,11 @@ in_pcbbind_setup(struct inpcb *inp, struct sockadd
                sin = (struct sockaddr_in *)nam;
                if (nam->sa_len != sizeof (*sin))
                        return (EINVAL);
-#ifdef notdef
                /*
-                * We should check the family, but old programs
-                * incorrectly fail to initialize it.
+                * Family check.
                 */
                if (sin->sin_family != AF_INET)
                        return (EAFNOSUPPORT);
-#endif
                error = prison_local_ip4(cred, &sin->sin_addr);
                if (error)
                        return (error);
Comment 4 Terry Lambert 2014-07-04 17:04:56 UTC
Since this isn't a "MAY", this is a "SHALL", I still *totally agree* with you that it's a missing mandatory check, and can't be ignored, but this is not necessarily the proper place for it, without the side effect testing.

--

It's reasonable, but it would need some better testing.  My initial objection to a "just make it all work" was only based on the unit test not testing what it claimed it was testing.

--

Extra unit testing would be needed because doing the test in in_pcbbind_setup() like this also impacts other callers besides in_pcbbind():

    in_pcbbind()
    in_pcbconnect_setup(()

and, as a consequence, their callers:

    tcp_connect() <- this is where you want it to hit
    udp_output()
    udp6_bind()
    div_bind() <- specifically references that it will be passed a valid address

...and any other third party connection oriented protocols layered on top of IP.

I also think that POSIX is not specifically concerned with AF_INET, so doing the check at this specific location might be simultaneously too narrow in scope.

It's arguable that the check should actually be happening in a higher level protocol family function, or even based on protocol family metadata at one higher level than that, where the dispatch to the protocol family via AF_INET as the value got it into the protocol family code in the first place.

In other words, the upper layer bind/connect code prior to the protocol dispatch, based on a "check address family first" bit in the protocol family descriptor.

A valid argument against that approach might be that this would add general code path overhead for protocols that don't care/don't want the check by adding a single bit check and an interior dereference to get the flag bit and compare family, but it'd let you throw the right error for "AF_MAX + 10" EAFNOSUPPORT, which the code doesn't do, before or after the patch, in some circumstances.

FFR:

One of the reasons Mac OS X ignored optional ("MAY RETURN") error codes where they were not already implemented when we did the POSIX conformance certification was that overhead.

A second reason we didn't add some simple checks we could have added was due to the fact that the behaviour was tied between APIs for the VSX test suite, i.e. you couldn't pick one "MAY" "here" and not pick it "there", and it can take 3-6 months to argue a test suite deficiency, and it's often more important to be compliant than right.

NB:

Sorry to be something of a downer on this; I was tech load on the Mac OS X POSIX conformance for Apple for the entire project, and there's a really narrow tight rope you have to walk between making as few changes as possible, and as many as absolutely necessary.

I'm in the middle of moving my house right now, but I'm pretty sure I can be more genuinely helpful with actual code changes some time after that.
Comment 5 Kevin Lo freebsd_committer freebsd_triage 2014-07-06 15:20:30 UTC
(In reply to Terry Lambert from comment #4)
> Since this isn't a "MAY", this is a "SHALL", I still *totally agree* with
> you that it's a missing mandatory check, and can't be ignored, but this is
> not necessarily the proper place for it, without the side effect testing.
> 
> --
> 
> It's reasonable, but it would need some better testing.  My initial
> objection to a "just make it all work" was only based on the unit test not
> testing what it claimed it was testing.
> 
> --
> 
> Extra unit testing would be needed because doing the test in
> in_pcbbind_setup() like this also impacts other callers besides in_pcbbind():
> 
>     in_pcbbind()
>     in_pcbconnect_setup(()
> 
> and, as a consequence, their callers:
> 
>     tcp_connect() <- this is where you want it to hit
>     udp_output()
>     udp6_bind()
>     div_bind() <- specifically references that it will be passed a valid
> address
> 
> ...and any other third party connection oriented protocols layered on top of
> IP.
> 
> I also think that POSIX is not specifically concerned with AF_INET, so doing
> the check at this specific location might be simultaneously too narrow in
> scope.
> 
> It's arguable that the check should actually be happening in a higher level
> protocol family function, or even based on protocol family metadata at one
> higher level than that, where the dispatch to the protocol family via
> AF_INET as the value got it into the protocol family code in the first place.
> 
> In other words, the upper layer bind/connect code prior to the protocol
> dispatch, based on a "check address family first" bit in the protocol family
> descriptor.
> 
> A valid argument against that approach might be that this would add general
> code path overhead for protocols that don't care/don't want the check by
> adding a single bit check and an interior dereference to get the flag bit
> and compare family, but it'd let you throw the right error for "AF_MAX + 10"
> EAFNOSUPPORT, which the code doesn't do, before or after the patch, in some
> circumstances.

In case you missed it, the indentical test in the in6_pcbbind() function 
is performed.  The same check in in_pcbconnect_setup() and in6_pcbladdr()
are also actived.  As comments say if old programs incorrectly fail to 
initialize it, they will return EAFNOSUPPORT.
Comment 6 Terry Lambert 2014-07-07 06:33:10 UTC
I didn't miss it.

The test is still testing AF_MAX + 10, and assuming the existence of a particular set of AF's that are not mandated by POSIX, in order to provide the negative assertion.

I believe the actual VSX4 tests use AF_UNIX, per the "shall define" description for <sys/socket.h>.  Given the #ifdef's to allow condition deletion of AF_INET/AF_INET6 support, this makes more sense anyway.

I also believe the specific failure case was added to the error messages because of the unique nature of the Linux TCP implementation regarding the "simultaneous connect" case in section 3.4 of RFC 793, which Linux handles poorly.

I still believe that the test should be done in upper level code, and yes, I am aware that that would mean adding fields to the protocol family structure to allw it to be done in a protocol independent fashion.

Look at it this way: if I add XNS, RTSP, or some other protocol support, it's going to fail this negative assertion test when it shouldn't, since the API is supposed to be protocol agnostic.  Does that seem right to you?
Comment 7 Kevin Lo freebsd_committer freebsd_triage 2014-07-07 07:28:02 UTC
Nah, it doesn't make sense to me either.

According to TCP/IP Illustrated, Vol 2, section 22.7, figure 22.22:

72      if (nam) {
73              sin = mtod(nam, struct sockaddr_in *);
74              if (nam->m_len != sizeof(*sin)) 
75                      return (EINVAL);
76 #ifdef notdef
77              /*
78               * We should check the family, but old programs
79               * incorrectly fail to initialize it.
80               */
81              if (sin->sin_family != AF_INET)
82                      return (EAFNOSUPPORT);
83 #endif
84              lport = sin->sin_port;  /* might be 0 */

76 - 83
The test for the correct address family is commented out, yet the identical
test in the in_pcbconnect function is performed. We expect either both 
to be in or both to be out.

Do you think both functions shouldn't perform the check?  Thanks.
Comment 8 Terry Lambert 2014-07-07 17:11:06 UTC
That's hard for me to answer without the VSX4 test sources in front of me.

I will say that Mac OS X passes the tests, and that Mac OS X comments the test out, but without a lot of looking at upper level code, I'm still pretty sure they don't do the meta check I described at a higher level.  Vincent Lubet (still networking manager at Apple) and more likely, Laurent Dumont (currently at Akamai) would be able to tell you for sure without groveling through code.

The current Apple code with it "#if 0"'ed is visible here, FWIW:

http://opensource.apple.com/source/xnu/xnu-2422.90.20/bsd/netinet/in_pcb.c

There was no TSD (Test Suite Deficiency) or PIN (Permanent Interpretation) in this specific area during the Mac OS X UNIX certification process, so no test failure.


I suspect that this is an area where the standard says what the people on the Austin Group who are representing Linux via IBM want it to say, and that the actual tests simply don't test it at all because it would conflict with existing implementations grandfathered under SVR3/SVR4 derivation rules.
Comment 9 Kevin Lo freebsd_committer freebsd_triage 2014-07-12 16:39:01 UTC
Hi Terry,

As you suggested, I asked Vincent Lubet how Mac OS X validates EAFNOSUPPORT
in bind(2).  Here is his response. 

"xnu version of bind() does not check the address family for AF_INET sockets 
 for compatibility with older program -- that's a piece of code we inherited 
 from FreeBSD!

 I do not have access to the POSIX test suite code but as Mac OS X was 
 granted conformance I have to assume the POSIX test suite for bind()
 does not test for bogus address family for AF_INET sockets. May be they
 only test for AF_UNIX."

Since Mac OS X was granted POSIX conformance, it makes more sense to me
to validate EAFNOSUPPORT in bind(2) for AF_UNIX only.
I think connect(2) should also return EAFNOSUPPORT for AF_UNIX on wrong
address family.

Here is the proposed patch, thanks

Index: sys/kern/uipc_usrreq.c
===================================================================
--- sys/kern/uipc_usrreq.c	(revision 268570)
+++ sys/kern/uipc_usrreq.c	(working copy)
@@ -467,6 +467,9 @@ uipc_bindat(int fd, struct socket *so, struct sock
 	cap_rights_t rights;
 	char *buf;
 
+	if (nam->sa_family != AF_UNIX)
+		return (EAFNOSUPPORT);
+
 	unp = sotounpcb(so);
 	KASSERT(unp != NULL, ("uipc_bind: unp == NULL"));
 
@@ -1278,6 +1281,9 @@ unp_connectat(int fd, struct socket *so, struct so
 	cap_rights_t rights;
 	int error, len;
 
+	if (nam->sa_family != AF_UNIX)
+		return (EAFNOSUPPORT);
+
 	UNP_LINK_WLOCK_ASSERT();
 
 	unp = sotounpcb(so);
Comment 10 Terry Lambert 2014-07-14 05:08:08 UTC
This seems like a great way of handling it.

It might be reasonable to not in the commit message about what's going on (not sure if a bugzilla link would be a reasonable thing, but if so, that's probably enough).
Comment 11 Kevin Lo freebsd_committer freebsd_triage 2014-07-14 06:04:47 UTC
Committed as r268601.  Thank you Terry.
Comment 12 Ed Maste freebsd_committer freebsd_triage 2018-10-10 14:00:16 UTC
r268601 is Make bind(2) and connect(2) return EAFNOSUPPORT for AF_UNIX on wrong address family.
https://reviews.freebsd.org/rS268601

Is there more to do or is this change sufficient?
Comment 13 Enji Cooper freebsd_committer freebsd_triage 2018-10-12 17:50:37 UTC
(In reply to Ed Maste from comment #12)

That is a good question. I will put this down as a reminder to look into in the next week or so.
Comment 14 Mark Johnston freebsd_committer freebsd_triage 2021-06-15 19:34:49 UTC
I don't think there is anything left to do here, so I'll preemptively close the PR.  Please feel free to re-open if necessary.

On top of the commited patch, we recently added more validation for sa_family and sa_len in various protocols.  This was because missing sa_len checks allowed programs to trigger out-of-bounds accesses in the kernel when examining protocol-specific sockaddr fields.  As a part of that, I moved checks out of the inpcb code and into specific procotol implementations.

In comment 4 we have:
-----
It's arguable that the check should actually be happening in a higher level protocol family function, or even based on protocol family metadata at one higher level than that, where the dispatch to the protocol family via AF_INET as the value got it into the protocol family code in the first place.

In other words, the upper layer bind/connect code prior to the protocol dispatch, based on a "check address family first" bit in the protocol family descriptor.
-----
which I agree with, but as I understand would be an enhancement rather than a bug fix.

We still deviate from POSIX in at least a couple of ways.  First, protocol implementations for connect(2) return ENOAFSUPPORT rather than EINVAL if the address family doesn't match what they expect.  I believe that is consistent with MacOS and Linux.  Second, the TCP and UDP implementations permit bind(2) on AF_INET sockets when sa_family == AF_UNSPEC and sin_addr == INADDR_ANY.  In other words, the check referenced in comment 7 is now enabled (albeit at a different layer), with this one exception required to avoid breaking some popular software, notably ttcp.