Bug 221700 - lang/python??: Add closefrom(2) support to subprocess on FreeBSD
Summary: lang/python??: Add closefrom(2) support to subprocess on FreeBSD
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s) (show other bugs)
Version: Latest
Hardware: Any Any
: Normal Affects Many People
Assignee: Kubilay Kocak
URL: https://bugs.python.org/issue38061
Keywords: performance
Depends on: 242274
Blocks: 221696
  Show dependency treegraph
 
Reported: 2017-08-21 19:54 UTC by Ed Maste
Modified: 2019-11-29 12:46 UTC (History)
9 users (show)

See Also:
koobs: maintainer-feedback+


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Ed Maste freebsd_committer freebsd_triage 2017-08-21 19:54:10 UTC
% python --version
Python 2.7.13
% pkg info python | grep Version
Version        : 2.7_3,2


% ktrace -i python -c 'import subprocess; subprocess.call("/usr/bin/true", close_fds=True);'
% kdump -s | egrep -c 'CALL.*close'
939647

excerpt from kdump:
  1958 python2.7 CALL  close(0x3)
  1958 python2.7 RET   close -1 errno 9 Bad file descriptor
  1958 python2.7 CALL  close(0x5)
  1958 python2.7 RET   close -1 errno 9 Bad file descriptor
  1958 python2.7 CALL  close(0x6)
  1958 python2.7 RET   close -1 errno 9 Bad file descriptor
  1958 python2.7 CALL  close(0x7)
  1958 python2.7 RET   close -1 errno 9 Bad file descriptor
  1958 python2.7 CALL  close(0x8)
  1958 python2.7 RET   close -1 errno 9 Bad file descriptor
  1958 python2.7 CALL  close(0x9)
  1958 python2.7 RET   close -1 errno 9 Bad file descriptor
...
  1958 python2.7 CALL  close(0xe5628)
  1958 python2.7 RET   close -1 errno 9 Bad file descriptor
  1958 python2.7 CALL  close(0xe5629)
  1958 python2.7 RET   close -1 errno 9 Bad file descriptor
  1958 python2.7 CALL  close(0xe562a)
  1958 python2.7 RET   close -1 errno 9 Bad file descriptor
  1958 python2.7 CALL  close(0xe562b)
  1958 python2.7 RET   close -1 errno 9 Bad file descriptor

This has both a performance impact and an impact on the ability to debug problems with python programs.
Comment 2 Ed Maste freebsd_committer freebsd_triage 2017-08-22 13:25:51 UTC
A minimum viable patch to address this:

diff --git a/Modules/_posixsubprocess.c b/Modules/_posixsubprocess.c
index 8c8777cfe3..d390ec8ee6 100644
--- a/Modules/_posixsubprocess.c
+++ b/Modules/_posixsubprocess.c
@@ -232,9 +232,13 @@ _close_fds_by_brute_force(long start_fd, PyObject *py_fds_to_keep)
         start_fd = keep_fd + 1;
     }
     if (start_fd <= end_fd) {
+#if defined(__FreeBSD__)
+        closefrom(start_fd);
+#else
         for (fd_num = start_fd; fd_num < end_fd; ++fd_num) {
             close(fd_num);
         }
+#endif
     }
 }
Comment 3 Conrad Meyer freebsd_committer freebsd_triage 2017-08-22 15:08:13 UTC
You might also consider closefrom() in posix_closerange() of Modules/posixmodule.c:

--- ./Modules/posixmodule.c	2015-05-23 09:09:20.000000000 -0700
+++ ./Modules/posixmodule.c	2015-07-17 15:00:38.784909475 -0700
@@ -6668,9 +6668,12 @@
     if (!PyArg_ParseTuple(args, "ii:closerange", &fd_from, &fd_to))
         return NULL;
     Py_BEGIN_ALLOW_THREADS
-    for (i = fd_from; i < fd_to; i++)
-        if (_PyVerify_fd(i))
-            close(i);
+    if (fd_to >= sysconf(_SC_OPEN_MAX)) {
+        closefrom(fd_from);
+    } else
+        for (i = fd_from; i < fd_to; i++)
+            if (_PyVerify_fd(i))
+                close(i);
     Py_END_ALLOW_THREADS
     Py_RETURN_NONE;
 }
Comment 4 Conrad Meyer freebsd_committer freebsd_triage 2017-08-22 15:26:34 UTC
(In Python3, posix_closerange() is renamed to os_closerange_impl().)
Comment 5 Kubilay Kocak freebsd_committer freebsd_triage 2017-08-25 04:07:32 UTC
Is the behaviour (not using closefrom(2)) consistent across all Python versions, including 3.x up to 3.6 and default?

Is closefrom(2) available on all (supported) FreeBSD versions?
Comment 6 Conrad Meyer freebsd_committer freebsd_triage 2017-08-25 04:11:57 UTC
(In reply to Kubilay Kocak from comment #5)
Yes; I don't believe any current version of Python uses closefrom().

closefrom() is available on all supported FreeBSD releases.

HISTORY
     The closefrom() function first appeared in FreeBSD 8.0.
Comment 7 Kubilay Kocak freebsd_committer freebsd_triage 2017-08-25 04:19:55 UTC
For explicitness, this is effectively "Use closefrom(2) on FreeBSD", correct?

We should create an issue upstream if one hasn't already been created, and any changes created should be appropriate for upstream inclusion, in particular closefrom(2) detection (using any existing mechanisms)

We can carry local ports patches until the changes are merged upstream and included in release versions
Comment 9 Kubilay Kocak freebsd_committer freebsd_triage 2018-05-11 04:43:15 UTC
This is an issue who's resolution requires an upstream patch and issue (bug report) such that it can be correctly and permanently resolved there in all relevant branches. Reassign to reporter accordingly to coordinate that.

Once a patch against upstream sources has been created and submitted, please create new separate issue blocking this one:

lang/python??: Backport use closefrom(2) on FreeBSD and assign it to python@

We can carry it in the respective lang/python?? ports until new releases land.
Comment 10 Ed Maste freebsd_committer freebsd_triage 2018-10-03 15:18:52 UTC
A twitter discussion referenced this PR and the question of async-signal-saftiness of closefrom: https://twitter.com/koobs/status/1046674765585367041

Note that closefrom is in sigaction(2)'s list of async-signal-safe functions, in the "Base Interfaces not specified as async-signal safe by POSIX, but planned to be:" section. These are interfaces that are async-signal-safe in FreeBSD, are not currently required to be so by POSIX, but are expected to be so in the future. That said, I am not sure why closefrom is in that list and not 'Extension Interfaces' as POSIX does not specify closefrom.
Comment 11 Ed Maste freebsd_committer freebsd_triage 2018-10-03 15:21:27 UTC
(In reply to Ed Maste from comment #10)
Oh, I was mistaken in the previous comment - closefrom is indeed in the "Extension Interfaces" section as it should be.
Comment 12 Kubilay Kocak freebsd_committer freebsd_triage 2018-12-21 11:04:51 UTC
See Also: base r194262 (original closefrom(2) syscall commit)
Comment 13 Conrad Meyer freebsd_committer freebsd_triage 2018-12-21 16:52:42 UTC
(In reply to Ed Maste from comment #10)
@Koobs (re: twitter), why would it not be async signal safe?

In general signal safety is more about interacting with userspace memory state of ordinary non-signal contexts given arbitrary signal interruption.  I.e., classic example is stdio FILE objects, which may be locked or have corrupt, half-modified state.  closefrom(2) is just a plain system call.

The original commit, r194262, just says:

> Note that this implementation of closefrom(2) does not make any effort to
> resolve userland races with open(2) in other threads.  As such, it is not
> multithread safe.

That is a comment on threading, not signals.  Signaled, single-threaded programs are still single-threaded; the signal context and ordinary thread context do not run concurrently.

And by definition, a user process cannot run code of any kind — signal-handler or ordinary — while that user process is in the kernel.  So closefrom(2) in signal handler cannot race with open(2) or other syscalls that create fds in single-threaded programs.

The scenario the comment refers to is something like this:

- The program has two threads and fds 0-5 allocated.
- Concurrently, thread A and B attempt to:
  A: open() a new fd
  B: closefrom(3)
- The behavior is unpredictable and depends on which thread acquires the fdesctable lock first.

[r194262]: https://svnweb.freebsd.org/base?view=revision&revision=194262
Comment 14 Conrad Meyer freebsd_committer freebsd_triage 2018-12-21 17:16:07 UTC
(In reply to Ed Maste from comment #1)
Note that upstream Python has closed all of these issues after fixing only Linux.  Also, I believe that the concerns about async signal safety come from a link to the *Linux* closefrom manual page.

Anyone have a python bugs account and want to try banging their head against that wall another time?
Comment 15 Ed Maste freebsd_committer freebsd_triage 2018-12-21 17:29:48 UTC
Unassign, I won't be able to work on this in the near future. I just want to avoid the situation we have now where python is perceived to have terrible performance on FreeBSD. Hopefully one of the FreeBSD python maintainers has a good relationship with upstream and can coordinate, otherwise I will pick this up when I can.
Comment 16 Kubilay Kocak freebsd_committer freebsd_triage 2018-12-22 00:22:20 UTC
For explicitness and reference to future readers, current state is:

1) Yep, we're now comfortable that our closefrom(2) is async-signal safe (thanks emaste, conrad, others)

2) Yep, we want to resolve this issue, it has high value. I'm happy to coordinate and usher code into ports/upstream.

3) We want to resolve this in a manner upstream will accept. I think this will be in the form of something like a configure check for closefrom(2) and conditionals in code with HAVE_CLOSEFROM. If other platforms have closefrom(2) but aren't signal safe (or there's no positive evidence they are), we could additionally if && defined(__FreeBSD__) to be conservative in the first instance. We need patches for each upstream supported Python version (currently 2.7, 3.6, 3.7, default (3.8))

4) I have a python bugs account, and have a good relationship with some core devs, who I'd like to get input/direction from (both in general and for patches we create).

5) At present we need a patch in a form similar to (3) that I can get upstream feedback on. With a patch reasonably close to (3) we can also consider carrying it locally in ports until upstream merges and releases future versions with support. I need (C) help with this if the diffs in comments are only examples and not 'complete/ready/finished'.

Open Questions:

Are there any places other than _posixsubprocess.c:_close_fds_by_brute_force and posix_closerange() that might benefit from this?
Comment 17 Conrad Meyer freebsd_committer freebsd_triage 2018-12-22 00:30:00 UTC
(In reply to Kubilay Kocak from comment #16)
Re (3), I think #ifdef __FreeBSD__ or maybe at least #ifndef __glibc__ or __linux__ are going to be a requirement from the Python community, given prior treatments of the issue.

> Are there any places other than _posixsubprocess.c:_close_fds_by_brute_force
> and posix_closerange() that might benefit from this?

I don't know the answer to that, but I would encourage tackling the low hanging fruit first without even looking for other places, to reduce scope creep, risk, and scale of change.
Comment 18 Kubilay Kocak freebsd_committer freebsd_triage 2018-12-22 00:44:15 UTC
(In reply to Conrad Meyer from comment #17)

Yep, just wanted the question there explicitly asked
Comment 19 Dima Pasechnik 2019-04-14 07:36:15 UTC
instead of checking for __FreeBSD__, there should be a call to AC_SEARCH_LIB() in Python's configure.ac, to look for `closefrom()`.
  
(Note that `closefrom()` exists on many BSD's including Darwin.
Comment 20 Jonas Palm 2019-09-05 09:28:55 UTC
Any news on this? Python 2.7 EOL is in about four months and upgrading to 3.6 is a pain as i.e. Ansible plays take a lot longer:

ansible-playbook-2.7:
57.987u 57.349s 0:42.26 272.8%  22+203k 9026+4552io 139pf+0w

ansible-playbook-3.6 doing exactly the same thing:
315.849u 422.997s 4:12.47 292.6%        7+368k 14328+4540io 273pf+0w
Comment 21 Jan Beich freebsd_committer freebsd_triage 2019-09-05 10:26:19 UTC
(In reply to Jonas Palm from comment #20)
Try "mount -t fdescfs none /dev/fd" per https://github.com/python/cpython/commit/4842efcf972e.
Comment 22 Conrad Meyer freebsd_committer freebsd_triage 2019-09-05 17:06:51 UTC
(In reply to Dima Pasechnik from comment #19)
Sure, except Python has previously rejected some implementations of closefrom() (Linux's?).  So the configure.ac check is not sufficient anyway.
Comment 23 Conrad Meyer freebsd_committer freebsd_triage 2019-09-05 17:08:46 UTC
(In reply to Jonas Palm from comment #20)
Here's a secret: nothing happens in 2020.  Python 2.7.X works as well as it ever did; Py3.x will continue to diverge as a language.  Tauthon is a 2.7 fork ("2.8") that aims to maintain 2 compatibility, while backporting some of 3.x.
Comment 24 Conrad Meyer freebsd_committer freebsd_triage 2019-09-05 17:20:54 UTC
(In reply to Jan Beich from comment #21)
Seems like they still haven't fixed _close_fds_by_brute_force(), though.  It is unfortunate we need a special linux filesystem to have working Python when it's absolutely not needed.

They try and keep some special list of fds open.  But it's very likely those special few are low numbers (since fds always open minimum possible value and most programs do not reach fd limit).

So special casing just the end of range would likely help significantly:

Just replacing this bit with closefrom():
https://github.com/python/cpython/blob/4842efcf972eee8acef0840ecca34a88945a99a7/Modules/_posixsubprocess.c#L157-L161

Or we could add a closerange() with similar semantics.  I'm not sure why we didn't.
Comment 25 Jonas Palm 2019-09-06 09:32:58 UTC
(In reply to Conrad Meyer from comment #23)
Probably not,  but I doubt that Ansible will support Tauthon or similar projects and will rather drop support for Python 2.7 in a couple of years.

With all support dropping for Python 2.7, I'd rather be ahead of the time and migrate to a supported version right now. All the software I'm using is supporting Python 3.6, so the only reason for not migrating is this problem.

(In reply to  Jan Beich from comment #21)
Thanks for the suggestion. If there isn't any progress on this bug I'll probably try this soon.
Comment 26 commit-hook freebsd_committer freebsd_triage 2019-11-29 10:56:02 UTC
A commit references this bug:

Author: koobs
Date: Fri Nov 29 10:55:03 UTC 2019
New revision: 518640
URL: https://svnweb.freebsd.org/changeset/ports/518640

Log:
  lang/python{27,35,36,37,38}: Add closefrom(2) support

  A single close(fd) syscall is cheap, but when MAXFDS (maximum file
  descriptor number) is high, the loop calling close(fd) on each file
  descriptor can take several milliseconds.

  The default value of subprocess.Popen "close_fds" parameter changed to True
  in Python 3. Compared to Python 2, close_fds=True can make Popen 10x
  slower: see bpo-37790 [1]

  The present workaround on FreeBSD to improve performance is to load and
  mount the fdescfs kernel module, but this is not enabled by default.

  This change adds minimum viable (and upstreamable) closefrom(2) syscall
  support to Python's subprocess and posix modules, improving performance
  significantly for loads that involve working with many processes, such as
  diffoscope, ansible, and many others.

  For additional optimizations, upstream recently (3.8) landed posix_spawn(2)
  support [3] and has stated that they will adopt close_range(2) after Linux
  merges it [4]. Linux/FreeBSD developers are already collaborating on
  ensuring compatible implementations, with FreeBSD's implementation pending
  in D21627. [5]

  Thank you emaste, cem, kevans for providing analysis, input,
  clarifications, comms/upstream support and patches.

  [1] https://bugs.python.org/issue37790
  [2] https://bugs.python.org/issue38061
  [3] https://bugs.python.org/issue35537
  [4] https://lwn.net/Articles/789023/
  [5] https://reviews.freebsd.org/D21627

  Additional References:

  https://bugs.python.org/issue8052
  https://bugs.python.org/issue11284
  https://bugs.python.org/issue13788
  https://bugs.python.org/issue1663329
  https://www.python.org/dev/peps/pep-0446/

  PR:		242274, 221700
  Submitted by:	kevans (emaste, cem)
  Approved by:	koobs (python (maintainer), santa)

Changes:
  head/lang/python27/Makefile
  head/lang/python27/files/patch-Modules_posixmodule.c
  head/lang/python35/Makefile
  head/lang/python35/files/patch-Modules___posixsubprocess.c
  head/lang/python35/files/patch-Modules_posixmodule.c
  head/lang/python36/Makefile
  head/lang/python36/files/patch-Modules___posixsubprocess.c
  head/lang/python36/files/patch-Modules_posixmodule.c
  head/lang/python37/Makefile
  head/lang/python37/files/patch-Modules___posixsubprocess.c
  head/lang/python37/files/patch-Modules_posixmodule.c
  head/lang/python38/Makefile
  head/lang/python38/files/patch-Modules___posixsubprocess.c
  head/lang/python38/files/patch-Modules_posixmodule.c
Comment 27 Kubilay Kocak freebsd_committer freebsd_triage 2019-11-29 10:59:19 UTC
Thank you everyone for your support and consistent encouragement and reminders to get this done. Thank you Kyle for the final QA'd patch and Ed/Conrad for the original implementations.
Comment 28 commit-hook freebsd_committer freebsd_triage 2019-11-29 12:46:09 UTC
A commit references this bug:

Author: jbeich
Date: Fri Nov 29 12:45:12 UTC 2019
New revision: 518646
URL: https://svnweb.freebsd.org/changeset/ports/518646

Log:
  gecko: drop close_fds workaround after r518640

  PR:		221700

Changes:
  head/mail/thunderbird/files/patch-bug1507655
  head/www/cliqz/files/patch-bug1507655
  head/www/firefox/files/patch-bug1507655
  head/www/firefox-esr/files/patch-bug1507655