Bug 260116 - [sctp] POLLOUT/EVFILT_WRITE is always true for poll/kqueue
Summary: [sctp] POLLOUT/EVFILT_WRITE is always true for poll/kqueue
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 13.0-RELEASE
Hardware: Any Any
: --- Affects Some People
Assignee: Michael Tuexen
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-11-29 13:35 UTC by Albin Hellqvist
Modified: 2024-01-11 13:04 UTC (History)
3 users (show)

See Also:
tuexen: mfc-stable13+


Attachments
Test files for reproducing the bug (4.31 KB, application/x-zip-compressed)
2021-11-29 13:37 UTC, Albin Hellqvist
no flags Details
Proposed patch (6.67 KB, patch)
2022-01-25 11:09 UTC, Björn Svensson
no flags Details | Diff
Proposed patch v2 (4.82 KB, patch)
2022-02-08 17:47 UTC, Björn Svensson
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Albin Hellqvist 2021-11-29 13:35:11 UTC
It seems like POLLOUT/EVFILT_WRITE always is true for SCTP sockets even if the socket send buffer is full.

Besides this, it should also account for SO_SNDLOWAT but it is possible that we will get this for "free" since it looks like the functionality already is there.

Based on some experiments (which might be faulty), SCTP sockets doesn't use struct sockbuf in the same way as TCP uses it. A lot of the fields in that struct are 0 for SCTP sockets since it also has its own structs.

This seems to affect filt_sowrite() in (https://github.com/freebsd/freebsd-src/blob/main/sys/kern/uipc_socket.c). More specifically, this line:
kn->kn_data = sbspace(&so->so_snd);

I have tested to replace that line with:
kn->kn_data = (so->so_snd.sb_hiwat - so->so_snd.sb_acc);

Which seems to do the trick, but I expect this to have side effects.
Comment 1 Albin Hellqvist 2021-11-29 13:37:32 UTC
Created attachment 229791 [details]
Test files for reproducing the bug

I added some tests if it helps. They can be compiled using:
g++ -o client client.cpp utils.cpp
g++ -o server server.cpp utils.cpp

I have tested it in VMs with the SCTP module enabled. I have also added alias addresses to the loopback interface so that I could run both the server and client over the loopback interface.

These alias addresess were added:
127.0.0.2
127.0.0.3
127.0.0.4
127.0.0.5
127.0.0.6
127.0.0.7
::2
::3
::4
::5
::6
::7

The test can be started using:
./server --poll --ipv4 --sctp --test-pollout
./client --poll --ipv4 --sctp --test-pollout

Or with kqueue:
./client --kqueue --ipv4 --sctp --test-pollout

IPv6 as well as TCP can be tested as well to compare the behavior.
Comment 2 Björn Svensson 2022-01-24 11:31:14 UTC
After some investigations we have seen that the introduction of
"sendfile(2) system call' required some changes in the struct sockbuf.
The field sb_cc was split into sb_ccc and sb_acc in commit: 0f9d0a73a495

For SCTP a macro was introduced to handle this change via commit: 4e88d37a2a73

The change of using sb_ccc instead of sb_cc unfortunately seemed to trigger problems with SCTP sockets and select() so it was changed to use sb_acc instead via commit: 975c975bf0f1


The problem we now see is that when POLLOUT/EVFILT_WRITE are handled in
sopoll_generic() and filt_sowrite() they call sbspace():

https://github.com/freebsd/freebsd-src/blob/main/sys/kern/uipc_socket.c#L3952
https://github.com/freebsd/freebsd-src/blob/main/sys/kern/uipc_socket.c#L3630
https://github.com/freebsd/freebsd-src/blob/main/sys/sys/socketvar.h#L325

but sbspace() still uses sb->sb_ccc when calculating the space.

https://github.com/freebsd/freebsd-src/blob/main/sys/sys/sockbuf.h#L239

This results in that the socket always seems writeable even when the sendbuffer is full.

Some ideas to solve this would be to:

- Change sbspace() to use sb_acc when sb_ccc is zero.
  * This might be a bit hacky
  * This might affect other users of the function.

- Update both sb_acc and sb_ccc in SCTP
  Since SCTP already updates sb_acc to fit into the framework we can 
  double book the value to sb_ccc as well. The sb_acc field should be the
  owner of the information, but when changed it's copied to sb_ccc.
  * This only affects the SCTP code

Any thoughts or other ideas?

I will create a patchset for updating both sb_ccc and sb_acc and test this idea.
Comment 3 Björn Svensson 2022-01-25 11:09:13 UTC
Created attachment 231307 [details]
Proposed patch

This proposed patch adds a macro for unifying the sb_acc and sb_ccc counters after the value has been updated.
The macro can be changed for upstream users (like a userspace sctp stack) in a similar way as for sb_cc, i.e. it can be disabled when there is no usage of sb_ccc and sb_acc.
See patch for more details.
Comment 4 Michael Tuexen freebsd_committer freebsd_triage 2022-01-25 11:37:34 UTC
(In reply to Björn Svensson from comment #3)
Can you state what you are trying to fix? Are you trying to fix sendfile() for SCTP or the kqueue write stuff or something else?
Comment 5 Michael Tuexen freebsd_committer freebsd_triage 2022-01-25 11:37:43 UTC
(In reply to Björn Svensson from comment #3)
Can you state what you are trying to fix? Are you trying to fix sendfile() for SCTP or the kqueue write stuff or something else?
Comment 6 Björn Svensson 2022-01-25 12:22:47 UTC
(In reply to Michael Tuexen from comment #5)
I'm trying to fix the problem that a sctp socket is not indicating that the sendbuffer is full when polling the socket, or via kernel event notifications.

The finding is that both counters sb_ccc and sb_acc is needed to be updated in a socketbuffer since they have different users in the framework/system.

sb_ccc is checked when polling the socket, but only sb_acc is updated.
There is a comment at the sb_cc define that it needs to represent sb_acc since otherwise select() is not working.

So the idea was to update both counters when the buffer is changed.
Comment 7 Björn Svensson 2022-02-08 17:47:03 UTC
Created attachment 231650 [details]
Proposed patch v2

Updated the proposed patch; removed changes that was found to not be required.

The patch (still) adds a new macro which makes sure that a sockets sendbuffer counters sb_ccc and sb_acc are both updated for 1-to-1 style sockets.

Both counters are needed when the socket API is used with poll()/kevent() since the sb_ccc counter is used to determine if the sendbuffer is full, and the sb_acc counter in other cases.

This will correct the problem that an 1-to-1 style SCTP socket always is indicated to be writeable.
Comment 8 Björn Svensson 2022-02-08 18:30:50 UTC
I have noted that the non-blocking section (https://datatracker.ietf.org/doc/html/rfc6458#section-3.2) mention that an application should use caution when writing to a 1-to-many style socket and using select()/poll(). The handling of the sendbuffer is implementation specific.

For 1-to-1 style sockets I can't see such warning in https://datatracker.ietf.org/doc/html/rfc6458#section-4, but maybe I have missed something.
Looking at the git history also indicates that this issue should have existed for many years without any mention of it by users on mailinglists, which makes me wonder.

Is the assumption that we should be able to use poll() on a 1-to-1 style socket to get sendbuffer full indications maybe wrong?
Comment 9 Michael Tuexen freebsd_committer freebsd_triage 2022-02-08 18:39:07 UTC
(In reply to Björn Svensson from comment #8)
I think for 1-to-1 style sockets it should work and it worked (I think) until the socket layer in FreeBSD was changed. I'll look into the patch soon.
Comment 10 commit-hook freebsd_committer freebsd_triage 2022-05-15 20:45:09 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=edc5b6ea881d7e196fee8df7ebcd372f8f5b4469

commit edc5b6ea881d7e196fee8df7ebcd372f8f5b4469
Author:     Michael Tuexen <tuexen@FreeBSD.org>
AuthorDate: 2022-05-14 10:38:43 +0000
Commit:     Michael Tuexen <tuexen@FreeBSD.org>
CommitDate: 2022-05-14 10:38:43 +0000

    sctp: use sb_avail() when accessing sb_acc for reading

    This is a cleanup to simplify a patch for PR 260116.

    PR:             260116
    MFC after:      3 days

 sys/netinet/sctp_indata.c |  2 +-
 sys/netinet/sctp_os_bsd.h |  2 ++
 sys/netinet/sctp_output.c |  2 +-
 sys/netinet/sctp_pcb.c    |  2 +-
 sys/netinet/sctp_usrreq.c |  7 +++----
 sys/netinet/sctputil.c    | 30 +++++++++++++++---------------
 6 files changed, 23 insertions(+), 22 deletions(-)
Comment 11 commit-hook freebsd_committer freebsd_triage 2023-02-01 22:57:39 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=a1742f61f4b79c84eb36ab12d83e577f714551d8

commit a1742f61f4b79c84eb36ab12d83e577f714551d8
Author:     Michael Tuexen <tuexen@FreeBSD.org>
AuthorDate: 2022-05-14 10:38:43 +0000
Commit:     Michael Tuexen <tuexen@FreeBSD.org>
CommitDate: 2023-02-01 22:56:33 +0000

    sctp: use sb_avail() when accessing sb_acc for reading

    This is a cleanup to simplify a patch for PR 260116.

    PR:             260116

    (cherry picked from commit edc5b6ea881d7e196fee8df7ebcd372f8f5b4469)

 sys/netinet/sctp_indata.c |  2 +-
 sys/netinet/sctp_os_bsd.h |  2 ++
 sys/netinet/sctp_output.c |  2 +-
 sys/netinet/sctp_pcb.c    |  2 +-
 sys/netinet/sctp_usrreq.c |  7 +++----
 sys/netinet/sctputil.c    | 30 +++++++++++++++---------------
 6 files changed, 23 insertions(+), 22 deletions(-)
Comment 12 commit-hook freebsd_committer freebsd_triage 2023-07-28 22:03:54 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=b279e84a47ddb59e55b5a3cec31c51cd41bf0dc3

commit b279e84a47ddb59e55b5a3cec31c51cd41bf0dc3
Author:     Michael Tuexen <tuexen@FreeBSD.org>
AuthorDate: 2023-07-28 12:36:11 +0000
Commit:     Michael Tuexen <tuexen@FreeBSD.org>
CommitDate: 2023-07-28 12:36:11 +0000

    sctp: improve consistency

    This is simplifying a patch to address PR 260116.

    PR:             260116
    MFC after:      1 week

 sys/netinet/sctputil.c | 8 ++------
 sys/netinet/sctputil.h | 6 +-----
 2 files changed, 3 insertions(+), 11 deletions(-)
Comment 13 commit-hook freebsd_committer freebsd_triage 2023-07-28 22:17:57 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=c620788150d274c09a070ab486602c98407d73b0

commit c620788150d274c09a070ab486602c98407d73b0
Author:     Michael Tuexen <tuexen@FreeBSD.org>
AuthorDate: 2023-07-28 13:16:23 +0000
Commit:     Michael Tuexen <tuexen@FreeBSD.org>
CommitDate: 2023-07-28 13:16:23 +0000

    sctp: keep sb_acc and sb_ccc in sync

    PR:             260116
    MFC after:      1 week

 sys/netinet/sctp_os_bsd.h | 18 ++++++++++++------
 sys/netinet/sctp_output.c |  2 +-
 sys/netinet/sctp_pcb.c    |  2 +-
 sys/netinet/sctp_var.h    |  4 ++--
 sys/netinet/sctputil.c    |  4 ++--
 sys/netinet/sctputil.h    |  4 ++--
 6 files changed, 20 insertions(+), 14 deletions(-)
Comment 14 Björn Svensson 2023-08-10 08:39:22 UTC
I have tested your last changes by cherry picking them to stable/13 and it works great! Thanks for your fix!
Comment 15 Ed Maste freebsd_committer freebsd_triage 2023-12-16 16:49:22 UTC
It looks like these changes should be cherry-picked into stable/13 still (and this PR ought to remain open until then).
Comment 16 Michael Tuexen freebsd_committer freebsd_triage 2023-12-16 19:05:28 UTC
(In reply to Ed Maste from comment #15)
That was the plan...
Comment 17 Albin Hellqvist 2023-12-17 12:25:29 UTC
(In reply to Michael Tuexen from comment #16)
Oh, sorry, my bad!
Comment 18 Michael Tuexen freebsd_committer freebsd_triage 2023-12-17 12:35:44 UTC
(In reply to Albin from comment #17)
No problem. You actually reminded me to MFC the change.
Comment 19 commit-hook freebsd_committer freebsd_triage 2024-01-11 12:19:57 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=6c9b92e741f0f878deff8ad3553239155c517a7a

commit 6c9b92e741f0f878deff8ad3553239155c517a7a
Author:     Michael Tuexen <tuexen@FreeBSD.org>
AuthorDate: 2023-07-28 12:36:11 +0000
Commit:     Michael Tuexen <tuexen@FreeBSD.org>
CommitDate: 2024-01-11 12:18:37 +0000

    sctp: improve consistency

    This is simplifying a patch to address PR 260116.

    PR:             260116

    (cherry picked from commit b279e84a47ddb59e55b5a3cec31c51cd41bf0dc3)

 sys/netinet/sctputil.c | 8 ++------
 sys/netinet/sctputil.h | 6 +-----
 2 files changed, 3 insertions(+), 11 deletions(-)
Comment 20 commit-hook freebsd_committer freebsd_triage 2024-01-11 12:21:59 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=f141ff9935fde0ab9790a7687605bf3e455d1300

commit f141ff9935fde0ab9790a7687605bf3e455d1300
Author:     Michael Tuexen <tuexen@FreeBSD.org>
AuthorDate: 2023-07-28 13:16:23 +0000
Commit:     Michael Tuexen <tuexen@FreeBSD.org>
CommitDate: 2024-01-11 12:20:06 +0000

    sctp: keep sb_acc and sb_ccc in sync

    PR:             260116

    (cherry picked from commit c620788150d274c09a070ab486602c98407d73b0)

 sys/netinet/sctp_os_bsd.h | 18 ++++++++++++------
 sys/netinet/sctp_output.c |  2 +-
 sys/netinet/sctp_pcb.c    |  2 +-
 sys/netinet/sctp_var.h    |  4 ++--
 sys/netinet/sctputil.c    |  4 ++--
 sys/netinet/sctputil.h    |  4 ++--
 6 files changed, 20 insertions(+), 14 deletions(-)