Bug 227259 - accept()/poll() and shutdown()/close() - not work as in FreeBSD10, may broke many apps
Summary: accept()/poll() and shutdown()/close() - not work as in FreeBSD10, may broke ...
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 11.1-STABLE
Hardware: Any Any
: --- Affects Many People
Assignee: Jonathan T. Looney
URL:
Keywords: needs-qa, patch, regression
: 225863 (view as bug list)
Depends on:
Blocks:
 
Reported: 2018-04-03 15:50 UTC by rozhuk.im
Modified: 2018-10-04 18:17 UTC (History)
11 users (show)

See Also:
jtl: mfc-stable11+


Attachments
acc_test.c (5.54 KB, text/plain)
2018-04-03 15:50 UTC, rozhuk.im
no flags Details
test tool (13.46 KB, text/plain)
2018-04-04 12:44 UTC, rozhuk.im
no flags Details
test tool (14.26 KB, text/plain)
2018-04-05 22:22 UTC, rozhuk.im
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description rozhuk.im 2018-04-03 15:50:16 UTC
Created attachment 192171 [details]
acc_test.c

We have some legacy app that create socket, create thread and in new thread call accept() with this socket.
Main thread wait for signals and do shutdown() + close() for socket, then wait until accept thread wakeup and exit, then app exit/create new socket+thread.

On FreeBSD 9, 10 - OK.
On FreeBSD 11 - thread does not wakeup on shutdown() and close().

FreeBSD 10:
cc acc_test.c -O0 -DDEBUG -DACCT_CLOSE -pthread -o acc_test - HANG
cc acc_test.c -O0 -DDEBUG -DACCT_CLOSE -DACCT_SELECT -pthread -o acc_test - HANG
cc acc_test.c -O0 -DDEBUG -DACCT_CLOSE -DACCT_SELECT -DACCT_NONBLOCK -pthread -o acc_test - HANG
cc acc_test.c -O0 -DDEBUG -DACCT_CLOSE -DACCT_POLL -pthread -o acc_test - OK
cc acc_test.c -O0 -DDEBUG -DACCT_CLOSE -DACCT_POLL -DACCT_NONBLOCK -pthread -o acc_test - OK
cc acc_test.c -O0 -DDEBUG -DACCT_SHUTDOWN -pthread -o acc_test - OK
cc acc_test.c -O0 -DDEBUG -DACCT_SHUTDOWN -DACCT_SELECT -pthread -o acc_test - HANG
cc acc_test.c -O0 -DDEBUG -DACCT_SHUTDOWN -DACCT_SELECT -DACCT_NONBLOCK -pthread -o acc_test - HANG
cc acc_test.c -O0 -DDEBUG -DACCT_SHUTDOWN -DACCT_POLL -pthread -o acc_test - OK
cc acc_test.c -O0 -DDEBUG -DACCT_SHUTDOWN -DACCT_POLL -DACCT_NONBLOCK -pthread -o acc_test - OK

FreeBSD 11:
cc acc_test.c -O0 -DDEBUG -DACCT_CLOSE -pthread -o acc_test - HANG
cc acc_test.c -O0 -DDEBUG -DACCT_CLOSE -DACCT_SELECT -pthread -o acc_test - HANG
cc acc_test.c -O0 -DDEBUG -DACCT_CLOSE -DACCT_SELECT -DACCT_NONBLOCK -pthread -o acc_test - HANG
cc acc_test.c -O0 -DDEBUG -DACCT_CLOSE -DACCT_POLL -pthread -o acc_test - OK
cc acc_test.c -O0 -DDEBUG -DACCT_CLOSE -DACCT_POLL -DACCT_NONBLOCK -pthread -o acc_test - OK
cc acc_test.c -O0 -DDEBUG -DACCT_SHUTDOWN -pthread -o acc_test - HANG
cc acc_test.c -O0 -DDEBUG -DACCT_SHUTDOWN -DACCT_SELECT -pthread -o acc_test - HANG
cc acc_test.c -O0 -DDEBUG -DACCT_SHUTDOWN -DACCT_SELECT -DACCT_NONBLOCK -pthread -o acc_test - HANG
cc acc_test.c -O0 -DDEBUG -DACCT_SHUTDOWN -DACCT_POLL -pthread -o acc_test - HANG
cc acc_test.c -O0 -DDEBUG -DACCT_SHUTDOWN -DACCT_POLL -DACCT_NONBLOCK -pthread -o acc_test - HANG


Summary
FreeBSD 10:
poll() + close()
poll() + shutdown()
accept() + shutdown()

FreeBSD 11:
poll() + close()
Comment 1 rozhuk.im 2018-04-04 12:44:53 UTC
Created attachment 192208 [details]
test tool

New test tool.

FreeBSD 10:
./acc_test | grep "chk OK"
0:	socket(AF_INET, block)	... lskt  accept  shutdown  chk OK, ret code: 53 - Software caused connection abort
1:	socket(AF_INET, block)	... rskt  accept  shutdown  chk OK, ret code: 53 - Software caused connection abort
4:	socket(AF_INET, block)	... lskt  poll  shutdown  chk OK
5:	socket(AF_INET, block)	... rskt  poll  shutdown  chk OK
12:	socket(AF_INET, block)	... lskt  poll  close  chk OK
13:	socket(AF_INET, block)	... rskt  poll  close  chk OK
16:	socket(AF_INET, block)	... lskt  shutdown  accept  chk OK, ret code: 53 - Software caused connection abort
20:	socket(AF_INET, block)	... lskt  shutdown  poll  chk OK
36:	socket(AF_INET, nblock)	... lskt  poll  shutdown  chk OK
37:	socket(AF_INET, nblock)	... rskt  poll  shutdown  chk OK
42:	socket(AF_INET, nblock)	... lskt  shutdown  poll  chk OK

FreeBSD 11:
./acc_test | grep "chk OK"
12:	socket(AF_INET, block)	... lskt  poll  close  chk OK
13:	socket(AF_INET, block)	... rskt  poll  close  chk OK
Comment 2 rozhuk.im 2018-04-04 12:56:34 UTC
I do not understand why shutdown() does not generates POLLHUP/EV_EOF (EV_ERROR then add shutdowned socket) for poll() and kqueue().
Comment 3 rozhuk.im 2018-04-04 13:08:32 UTC
Why close() does not wakes thread that sleep on accept()?
Comment 4 Gleb Smirnoff freebsd_committer 2018-04-05 21:33:33 UTC
Can you please confirm that behavior changed for FreeBSD 11, not 12? I would expect to have a regression in 12, since there was a big change to listening sockets there, but not in 11.
Comment 5 rozhuk.im 2018-04-05 21:38:28 UTC
FreeBSD rimwks 11.1-STABLE FreeBSD 11.1-STABLE  r331113M  amd64
I do not try 12.
You can run test from attach and see results.
Comment 6 rozhuk.im 2018-04-05 22:22:45 UTC
Created attachment 192260 [details]
test tool

I do few modifications to run on linux, also add epoll() to kqueue() place.

Linux ubuntux64 4.4.0-57-generic #78-Ubuntu SMP Fri Dec 9 23:50:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

./acc_test | grep "chk OK"
0:	socket(AF_INET, block)	... lskt  accept  shutdown  chk OK, ret code: 22 - Invalid argument
1:	socket(AF_INET, block)	... rskt  accept  shutdown  chk OK, ret code: 22 - Invalid argument
4:	socket(AF_INET, block)	... lskt  poll  shutdown  chk OK
5:	socket(AF_INET, block)	... rskt  poll  shutdown  chk OK
6:	socket(AF_INET, block)	... lskt  epoll  shutdown  chk OK
7:	socket(AF_INET, block)	... rskt  epoll  shutdown  chk OK
16:	socket(AF_INET, block)	... lskt  shutdown  accept  chk OK, ret code: 22 - Invalid argument
20:	socket(AF_INET, block)	... lskt  shutdown  poll  chk OK
22:	socket(AF_INET, block)	... lskt  shutdown  epoll  chk OK
36:	socket(AF_INET, nblock)	... lskt  poll  shutdown  chk OK
37:	socket(AF_INET, nblock)	... rskt  poll  shutdown  chk OK
38:	socket(AF_INET, nblock)	... lskt  epoll  shutdown  chk OK
39:	socket(AF_INET, nblock)	... rskt  epoll  shutdown  chk OK
42:	socket(AF_INET, nblock)	... lskt  shutdown  poll  chk OK
43:	socket(AF_INET, nblock)	... lskt  shutdown  epoll  chk OK

shutdown() work every where!
close() does not work.
Comment 7 rozhuk.im 2018-04-05 23:46:57 UTC
DragonFly  5.0-RELEASE DragonFly v5.0.2-RELEASE #4: Sun Dec  3 17:42:25 EST 2017

./acc_test | grep "chk OK"
0:	socket(AF_INET, block)	... lskt  accept  shutdown  chk OK, ret code: 53 - Software caused connection abort
1:	socket(AF_INET, block)	... rskt  accept  shutdown  chk OK, ret code: 53 - Software caused connection abort
4:	socket(AF_INET, block)	... lskt  poll  shutdown  chk OK
5:	socket(AF_INET, block)	... rskt  poll  shutdown  chk OK
16:	socket(AF_INET, block)	... lskt  shutdown  accept  chk OK, ret code: 53 - Software caused connection abort
20:	socket(AF_INET, block)	... lskt  shutdown  poll  chk OK
36:	socket(AF_INET, nblock)	... lskt  poll  shutdown  chk OK
37:	socket(AF_INET, nblock)	... rskt  poll  shutdown  chk OK
42:	socket(AF_INET, nblock)	... lskt  shutdown  poll  chk OK
Comment 8 emss 2018-04-06 13:36:03 UTC
Hello,

Could this issue be related to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225863 please ?

In last comment on https://freeswitch.org/jira/browse/FS-10580 M. Jerris states the following :
"so the way that is supposed to work is the close_socket(&listen_list.sock); line from the shutdown function would close the socket causing the accpet to stop blocking in the runtime function. For some reason that is not happening. This is going to be something with either freebsd or your network settings on the box causing that close to fail. I'm not sure how to troubleshoot that issue further, but add some logging around that accept in the runtime and the close in shutdown and look at open sockets and see what is happening at shutdown."
Comment 9 Jonathan T. Looney freebsd_committer 2018-04-06 14:10:15 UTC
My best guess is that this is due to a new check in soshutdown() added in stable/11:

sys/kern/uipc_socket.c:

2351	        if ((so->so_state &
2352	            (SS_ISCONNECTED | SS_ISCONNECTING | SS_ISDISCONNECTING)) == 0) {
2353	                /*
2354	                 * POSIX mandates us to return ENOTCONN when shutdown(2) is
2355	                 * invoked on a datagram sockets, however historically we would
2356	                 * actually tear socket down. This is known to be leveraged by
2357	                 * some applications to unblock process waiting in recvXXX(2)
2358	                 * by other process that it shares that socket with. Try to meet
2359	                 * both backward-compatibility and POSIX requirements by forcing
2360	                 * ENOTCONN but still asking protocol to perform pru_shutdown().
2361	                 */
2362	                if (so->so_type != SOCK_DGRAM)
2363	                        return (ENOTCONN);
2364	                soerror_enotconn = 1;
2365	        }

I don't think I ever really considered someone calling shutdown() on a listening socket. I would have expected them to use close(). My guess is that whoever added this didn't consider that case, either.

FWIW, this check was added in r285910.

I suppose the simple fix is to add a check for SOLISTENING(so) (head) or so->so_options & SO_ACCEPTCON (stable/11). However, we should consult with someone to see what POSIX requires in this case.
Comment 10 rozhuk.im 2018-04-07 16:28:07 UTC
IMHO:
 - shutdown() and close() should wakes up all thread on all descriptors that wait blocking i/o.
 - shutdown() should generate EOF for all syscals: read()/recv/write()/send()/accept()/poll()/kqueue()/etc...
 - close() should generate BADFD for all syscals: read()/recv/write()/send()/accept()/poll() and kqueue() (on close kqueue fd) for all descriptors types

See linux: if you call shutdown() then you know that everything wakes up, it is very useful.

Well, at lest return old behavior from FreeBSD 10.
Comment 11 emss 2018-05-10 18:54:47 UTC
Hi,

Any news on this subject, please ?

Regards

Éric Masson
Comment 12 Jonathan T. Looney freebsd_committer 2018-05-10 19:02:51 UTC
See D15019 and D15021. I think we're waiting for reviewers.
Comment 13 Kubilay Kocak freebsd_committer freebsd_triage 2018-05-11 10:18:46 UTC
Assign to committer resolving (authoring the in progress code)

Can re-assign if necessary.
Comment 14 emss 2018-05-11 11:12:43 UTC
Hi

11.2-PRERELEASE :
https://reviews.freebsd.org/D15021 seems to fix https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225863

Freeswitch now shuts down properly.

Regards

Éric Masson
Comment 15 Kubilay Kocak freebsd_committer freebsd_triage 2018-05-11 12:10:39 UTC
If bug 225863 and this issue are two symptoms of the same underlying cause, fixed by either of the two commits (reviews) referenced here, we should close one of the issue reports.

In this case this issue has been more fleshed out (has more activity/references), even though it was created later, so bug 225863 should be closed.


In the normal case, the later bug(s) are closed duplicates of the earliest reported bug.

@Éric Thank you for confirming the stable/11 code fixes your issue
Comment 16 emss 2018-05-11 18:43:18 UTC
*** Bug 225863 has been marked as a duplicate of this bug. ***
Comment 17 commit-hook freebsd_committer 2018-05-12 01:55:58 UTC
A commit references this bug:

Author: jtl
Date: Sat May 12 01:55:24 UTC 2018
New revision: 333511
URL: https://svnweb.freebsd.org/changeset/base/333511

Log:
  r285910 attempted to make shutdown() be POSIX compliant by returning
  ENOTCONN when shutdown() is called on unconnected sockets.  This change was
  slightly modified by r316874, which returns ENOTCONN in the case of an
  unconnected datagram socket, but still runs the shutdown code for the
  socket.  This specifically supports the case where the user-space code is
  using the shutdown() call to wakeup another thread blocked on the socket.

  In PR 227259, a user is reporting that they have code which is using
  shutdown() to wakup another thread blocked on a stream listen socket.  This
  code is failing, while it used to work on FreeBSD 10 and still works on
  Linux.

  It seems reasonable to add another exception to support something users are
  actually doing, which used to work on FreeBSD 10, and still works on Linux.
  And, it seems like it should be acceptable to POSIX, as we still return
  ENOTCONN.

  This is a direct commit to stable/11.  The listen socket code changed
  substantially in head, and the code change there will be substantially
  more complex.  In the meantime, it seems to make sense to commit this
  trivial fix to stable/11 given the fact that users appear to depend on
  this behavior, this appears to have been an unintended change in stable/11,
  and we did not announce the change.

  PR:		227259
  Reviewed by:	ed
  Approved by:	re (gjb)
  Sponsored by:	Netflix, Inc.
  Differential Revision:	https://reviews.freebsd.org/D15021
  Tested by:	Eric Masson (emss at free.fr)

Changes:
  stable/11/sys/kern/uipc_socket.c
Comment 18 commit-hook freebsd_committer 2018-10-03 17:40:44 UTC
A commit references this bug:

Author: glebius
Date: Wed Oct  3 17:40:05 UTC 2018
New revision: 339170
URL: https://svnweb.freebsd.org/changeset/base/339170

Log:
  In PR 227259, a user is reporting that they have code which is using
  shutdown() to wakeup another thread blocked on a stream listen socket.
  This code is failing, while it used to work on FreeBSD 10 and still
  works on Linux.

  It seems reasonable to add another exception to support something users are
  actually doing, which used to work on FreeBSD 10, and still works on Linux.
  And, it seems like it should be acceptable to POSIX, as we still return
  ENOTCONN.

  This patch is different to what had been committed to stable/11, since
  code around listening sockets is different. Patch in D15019 is written
  by jtl@, slightly modified by me.

  PR:		227259
  Obtained from:	jtl
  Approved by:	re (kib)
  Differential Revision:  D15019

Changes:
  head/sys/kern/uipc_socket.c