Inheritance of file descriptors and handles on Windows (PEP 446)
Hi, Guido van Rossum and others asked me for details on how file descriptors and handles are inherited on Windows, for PEP 446. http://www.python.org/dev/peps/pep-0446/

I hacked Python 3.4 to add an os.get_cloexec() function (extracted from my implementation of PEP 433); here are some results.

The Python functions open(), os.open() and os.dup() create file descriptors with the HANDLE_FLAG_INHERIT flag set (cloexec=False), whereas os.pipe() creates 2 file descriptors with the HANDLE_FLAG_INHERIT flag unset (cloexec=True, see also issue #4708).

Even if the HANDLE_FLAG_INHERIT flag is set, all handles are closed if subprocess is used with close_fds=True (which is the default value of the parameter), and all file descriptors except 0, 1 and 2 are closed. If close_fds=False, handles with the HANDLE_FLAG_INHERIT flag set are inherited, but all file descriptors except 0, 1 and 2 are still closed. (I didn't check whether file descriptors 0, 1 and 2 are inherited, duplicated or new file descriptors.)

PEP 446 lets you control which handles are inherited by the child process when you use subprocess with close_fds=False. (The subprocess parameter should be called "close_handles" on Windows to avoid confusion.)

Said differently: the HANDLE_FLAG_INHERIT flag only has an effect on *handles*, as indicated in its name. On Windows, file *descriptors* are never inherited (they are always closed) in child processes. I don't think that it is possible to inherit file descriptors on Windows. By the way, using pass_fds on Windows raises an assertion error ("pass_fds not supported on Windows").

Another example in Python:
---
import subprocess, sys
code = """
import os, sys
fd = int(sys.argv[1])
f = os.fdopen(fd, "rb")
print(f.read())
"""
f = open(__file__, "rb")
fd = f.fileno()
subprocess.call([sys.executable, "-c", code, str(fd)], close_fds=False)
---
On Unix, the child process will write the script into stdout. On Windows, you just get an OSError(9, "Bad file descriptor") exception.
To fix this example on Windows, you have to:

* retrieve the handle of the file using msvcrt.get_osfhandle();
* pass the handle, instead of the file descriptor, to the child;
* create a file descriptor from the handle using msvcrt.open_osfhandle() in the child.

The fix would be simpler if Python provided the handle of a file object (e.g. in a method) and if open() supported opening a handle as it does with file descriptors on UNIX.

Victor
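A rough sketch of the fix described above, guarded so it also runs on POSIX. The msvcrt calls are the real Windows CRT wrappers; note that the os.set_inheritable()/os.set_handle_inheritable() calls come from the later Python 3.4 API that PEP 446 eventually added (at the time of this thread, descriptors and handles were still inheritable by default), and the temp file is just illustration:

```python
import os
import subprocess
import sys
import tempfile

# Child: on Windows, rebuild a C file descriptor from the inherited HANDLE
# with msvcrt.open_osfhandle(); on POSIX the fd number itself is inherited.
child_code = """
import os, sys
arg = int(sys.argv[1])
if sys.platform == "win32":
    import msvcrt
    fd = msvcrt.open_osfhandle(arg, os.O_RDONLY)
else:
    fd = arg
with os.fdopen(fd, "rb") as f:
    sys.stdout.write(f.read().decode())
"""

with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"hello inheritance")
    path = tmp.name

f = open(path, "rb")
if sys.platform == "win32":
    import msvcrt
    arg = msvcrt.get_osfhandle(f.fileno())   # the underlying HANDLE
    os.set_handle_inheritable(arg, True)     # needed on Python 3.4+
else:
    arg = f.fileno()
    os.set_inheritable(arg, True)            # needed on Python 3.4+

out = subprocess.run([sys.executable, "-c", child_code, str(arg)],
                     close_fds=False, capture_output=True, text=True)
f.close()
os.unlink(path)
print(out.stdout)
```

On POSIX the child simply reuses the inherited fd number; on Windows the handle value crosses the process boundary and a new fd is minted around it.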
On 23/07/2013 11:45pm, Victor Stinner wrote:
Said differently: the HANDLE_FLAG_INHERIT flag only has an effect on *handles*, as indicated in its name. On Windows, file *descriptors* are never inherited (are always closed) in child processes. I don't think that it is possible to inherit file descriptors on Windows.
Actually, you can inherit fds if you use os.spawnv() instead of subprocess.Popen(). -- Richard
On Tue, Jul 23, 2013 at 4:21 PM, Richard Oudkerk
On 23/07/2013 11:45pm, Victor Stinner wrote:
Said differently: the HANDLE_FLAG_INHERIT flag only has an effect on *handles*, as indicated in its name. On Windows, file *descriptors* are never inherited (are always closed) in child processes. I don't think that it is possible to inherit file descriptors on Windows.
Actually, you can inherit fds if you use os.spawnv() instead of subprocess.Popen().
Wow. Indeed you can -- I just tested this myself. How is this accomplished? I guess the CRT has a backchannel to talk to itself when it creates a process using spawn*? This is the only reason I can think of for the odd default in the CRT of opening file descriptors inheritable by default, which Victor discovered. (But it doesn't explain why os.pipe() creates uninheritable fds.) If it weren't for this I would definitely vote to change the default on Windows throughout the stdlib to create file descriptors whose handles aren't inheritable. (Perhaps with a different policy for stdin/stdout/stderr, which seem to be treated specially at the handle level.)

I'm about ready to give up hope that we'll ever have a decent way to deal with this. But I'm also ready to propose that all this is such a mess that we *should* change the default fd/handle inheritance to False, *across platforms*, and damn the torpedoes -- i.e. accept breaking all existing 3rd party UNIX code for subprocess creation that bypasses the subprocess module, as well as breaking uses of os.spawn*() on both platforms that depend on FD inheritance beyond stdin/stdout/stderr.

With the new, sane default, all we need instead of PEP 446 is a way to make an FD inheritable after it's been created, which can be a single os.make_inheritable(fd) call that you must apply to the fileno() of the stream or socket object you want inherited (or directly to an FD you created otherwise, e.g. with os.pipe()). On Windows, this should probably only work with os.spawn*(), since otherwise you need *handle* inheritance, not *FD* inheritance, and that's a non-portable concept anyway.

We can fix multiprocessing and anything else in the stdlib that this breaks, I presume. To reduce the need for 3rd party subprocess creation code, we should have better daemon creation code in the stdlib -- I wrote some damn robust code for this purpose in my previous job, but it never saw the light of day. -- --Guido van Rossum (python.org/~guido)
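For reference, the os.make_inheritable() call proposed here later landed in Python 3.4 as os.set_inheritable() when PEP 446 was ultimately accepted. A minimal POSIX sketch combining it with the os.spawnv() inheritance discussed above (Windows spawnv would rely on the CRT fd-passing instead):

```python
import os
import sys

# On Python 3.4+ (PEP 446) pipe fds are non-inheritable by default, so
# the read end must be marked inheritable explicitly before spawning.
r, w = os.pipe()
os.set_inheritable(r, True)

os.write(w, b"hello")
os.close(w)

# The child inherits fd `r` under the same number and can read from it.
child = "import os, sys; sys.stdout.write(os.read(%d, 5).decode())" % r
status = os.spawnv(os.P_WAIT, sys.executable,
                   [sys.executable, "-c", child])
os.close(r)
```

With P_WAIT, os.spawnv() returns the child's exit code (0 on success), and the child writes the five inherited bytes to its stdout.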
Wow. Indeed you can -- I just tested this myself. How is this accomplished? I guess the CRT has a backchannel to talk to itself when it creates a process using spawn*?
CreateProcess() takes a STARTUPINFO argument with undocumented fields cbReserved2, lpReserved2. They are used to pass an array of fds.
But I'm also ready to propose that all this is such a mess that we *should* change the default fd/handle inheritance to False, *across platforms*, and damn the torpedoes -- i.e. accept breaking all existing 3rd party UNIX code for subprocess creation that bypasses the subprocess module, as well as breaking uses of os.spawn*() on both platforms that depend on FD inheritance beyond stdin/stdout/stderr.
+1
We can fix multiprocessing and anything else in the stdlib that this breaks, I presume.
In the experimental branch of multiprocessing, child processes no longer inherit unnecessary handles. -- Richard
On Wed, Jul 24, 2013 at 11:13 AM, Richard Oudkerk
Wow. Indeed you can -- I just tested this myself. How is this accomplished? I guess the CRT has a backchannel to talk to itself when it creates a process using spawn*?
CreateProcess() takes a STARTUPINFO argument with undocumented fields cbReserved2, lpReserved2. They are used to pass an array of fds.
Does it also inherit sockets (which take up a different namespace than regular FDs in CRT, unlike UNIX)?
But I'm also ready to propose that all this is such a mess that we
*should* change the default fd/handle inheritance to False, *across platforms*, and damn the torpedoes -- i.e. accept breaking all existing 3rd party UNIX code for subprocess creation that bypasses the subprocess module, as well as breaking uses of os.spawn*() on both platforms that depend on FD inheritance beyond stdin/stdout/stderr.
+1
Thanks! This was a difficult conclusion to come to. "Damn the torpedoes" is occasionally a nice meme. :-(
We can fix multiprocessing and anything else in the stdlib that this breaks, I presume.
In the experimental branch of multiprocessing, child processes no longer inherit unnecessary handles.
And if we default individual handles to uninheritable, we can presumably fix the ones that multiprocessing creates with the express purpose of being inherited easily. If it even uses that -- I haven't read the source code; maybe it uses named pipes? :-) -- --Guido van Rossum (python.org/~guido)
On 24/07/2013 7:17pm, Guido van Rossum wrote:
Does it also inherit sockets (which take up a different namespace than regular FDs in CRT, unlike UNIX)?
Not reliably. Processes created with CreateProcess() seem to inherit socket handles just like normal handles on my computer, but on some other computers -- with the same Windows version! -- it appears not to work. See http://bugs.python.org/issue17399. I think WSADuplicateSocket() should be used instead.
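Python has in fact exposed WSADuplicateSocket() since 3.3 as socket.share() / socket.fromshare() (Windows-only). A hedged sketch, with a POSIX fallback so the snippet runs anywhere; duplicating into our own pid here is purely for demonstration:

```python
import os
import socket
import sys

lst = socket.socket()
lst.bind(("127.0.0.1", 0))
lst.listen(1)

if sys.platform == "win32":
    # share() wraps WSADuplicateSocket(); the resulting bytes blob would
    # normally be sent to the target process, which rebuilds the socket
    # with socket.fromshare().
    blob = lst.share(os.getpid())   # target pid; own pid only for demo
    dup = socket.fromshare(blob)
    dup.close()
else:
    print("socket.share() is Windows-only; use fd inheritance on POSIX")

lst.close()
```

This sidesteps handle inheritance entirely, which is why it avoids the flaky behavior described above.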
And if we default individual handles to uninheritable, we can presumably fix the ones that multiprocessing creates with the express purpose of being inherited easily. (If it even uses that -- I haven't read the source code, maybe it uses named pipes?
multiprocessing never really needs to create any inheritable handles: it can use DuplicateHandle() to transfer each handle directly to the child process. -- Richard
2013/7/24 Richard Oudkerk
Wow. Indeed you can -- I just tested this myself. How is this accomplished? I guess the CRT has a backchannel to talk to itself when it creates a process using spawn*?
CreateProcess() takes a STARTUPINFO argument with undocumented fields cbReserved2, lpReserved2. They are used to pass an array of fds.
So would it be possible to implement the pass_fds parameter of subprocess using spawnl() or the undocumented fields? And is it possible to close all handles except one (implement "pass_handles")? The idea would be to use something like subprocess.Popen(cmd, pass_fds=[pipe_rfd], close_fds=True) or subprocess.Popen(cmd, pass_handles=[pipe_rhandle], close_handles=True) instead of subprocess.Popen(cmd, close_fds=False).
In the experimental branch of multiprocessing, child processes no longer inherit unnecessary handles.
Where is this branch? How did you create the channel between the manager and the worker? Victor
On 24/07/2013 10:50pm, Victor Stinner wrote:
So would it be possible to implement the pass_fds parameter of subprocess using spawnl() or the undocumented fields?
Not in a non-racy way. spawnv() calls CreateProcess() with bInheritHandles=TRUE, so *all* inheritable handles are inherited by the child. The passing of the array of fds just makes the fds in the child process match the fds in the parent. If you have Visual Studio installed then the relevant code is in .../Microsoft Visual Studio 10.0/VC/crt/src/dospawn.c
And is it possible to close all handles except one (implement "pass_handles")?
I don't know how to do that.
In the experimental branch of multiprocessing, child processes no longer inherit unnecessary handles.
Where is this branch? How did you create the channel between the manager and the worker?
http://hg.python.org/sandbox/sbt/ The parent creates a pipe and starts the child process. The pid of the parent, and the handle for the read end of the pipe are passed on the command line. Then the child "steals" the handle from the parent using OpenProcess() and DuplicateHandle() using the DUPLICATE_CLOSE_SOURCE flag. -- Richard
On Wed, 24 Jul 2013 10:56:05 -0700
Guido van Rossum
But I'm also ready to propose that all this is such a mess that we *should* change the default fd/handle inheritance to False, *across platforms*, and damn the torpedoes -- i.e. accept breaking all existing 3rd party UNIX code for subprocess creation that bypasses the subprocess module, as well as breaking uses of os.spawn*() on both platforms that depend on FD inheritance beyond stdin/stdout/stderr.
So I suppose you mean "change it to False except for stdin/stdout/stderr"?
With the new, sane default, all we need instead of PEP 446 is a way to make an FD inheritable after it's been created, which can be a single os.make_inheritable(fd) call that you must apply to the fileno() of the stream or socket object you want inherited (or directly to a FD you created otherwise, e.g. with os.pipe()). On Windows, this should probably only work with os.spawn*(), since otherwise you need *handle* inheritance, not *FD* inheritance, and that's a non-portable concept anyway.
I'm not sure how *fd* inheritance could work without *handle* inheritance. (since a fd seems to just be a proxy for a handle)
To reduce the need for 3rd party subprocess creation code, we should have better daemon creation code in the stdlib -- I wrote some damn robust code for this purpose in my previous job, but it never saw the light of day.
What do you call "daemon"? An actual Unix-like daemon? Regards Antoine.
On Wed, Jul 24, 2013 at 2:57 PM, Antoine Pitrou
On Wed, 24 Jul 2013 10:56:05 -0700 Guido van Rossum
wrote: But I'm also ready to propose that all this is such a mess that we *should* change the default fd/handle inheritance to False, *across platforms*, and damn the torpedoes -- i.e. accept breaking all existing 3rd party UNIX code for subprocess creation that bypasses the subprocess module, as well as breaking uses of os.spawn*() on both platforms that depend on FD inheritance beyond stdin/stdout/stderr.
So I suppose you mean "change it to False except for stdin/stdout/stderr"?
Hm, that's a tricky detail. I expect that on UNIX those are pre-opened and inherited from the shell, and we should not change their cloexec status; IIRC on Windows inheritance of the stdin/out/err handles is guided by a separate flag to CreateProcess().
With the new, sane default, all we need instead of PEP 446 is a way to make an FD inheritable after it's been created, which can be a single os.make_inheritable(fd) call that you must apply to the fileno() of the stream or socket object you want inherited (or directly to a FD you created otherwise, e.g. with os.pipe()). On Windows, this should probably only work with os.spawn*(), since otherwise you need *handle* inheritance, not *FD* inheritance, and that's a non-portable concept anyway.
I'm not sure how *fd* inheritance could work without *handle* inheritance. (since a fd seems to just be a proxy for a handle)
The MS spawn() implementation takes care of this -- I presume they make the handles inheritable (I think some of the bugs probably refer to this fact) and then use the "internal" interface to pass the mapping from FDs to handles to the subprocess.
To reduce the need for 3rd party subprocess creation code, we should have better daemon creation code in the stdlib -- I wrote some damn robust code for this purpose in my previous job, but it never saw the light of day.
What do you call "daemon"? An actual Unix-like daemon?
Yeah, a background process with parent PID 1 and not associated with any terminal group. -- --Guido van Rossum (python.org/~guido)
On Wed, 24 Jul 2013 15:25:50 -0700
Guido van Rossum
To reduce the need for 3rd party subprocess creation code, we should have better daemon creation code in the stdlib -- I wrote some damn robust code for this purpose in my previous job, but it never saw the light of day.
What do you call "daemon"? An actual Unix-like daemon?
Yeah, a background process with parent PID 1 and not associated with any terminal group.
But is that relevant to the PEP? A daemon only uses fork(), not exec(). Or have I misunderstood your concern? Regards Antoine.
On 25Jul2013 00:35, Antoine Pitrou
On Wed, Jul 24, 2013 at 3:35 PM, Antoine Pitrou
On Wed, 24 Jul 2013 15:25:50 -0700 Guido van Rossum
wrote: To reduce the need for 3rd party subprocess creation code, we should have better daemon creation code in the stdlib -- I wrote some damn robust code for this purpose in my previous job, but it never saw the light of day.
What do you call "daemon"? An actual Unix-like daemon?
Yeah, a background process with parent PID 1 and not associated with any terminal group.
But is that relevant to the PEP? A daemon only uses fork(), not exec(). Or have I misunderstood your concern?
Actually it's common for "daemonization" libraries to run an arbitrary executable as a daemon, and then it would be relevant. E.g. zdaemon (https://pypi.python.org/pypi/zdaemon) does this. (Disclosure: I wrote the first version of zdaemon when I worked for Zope Corp over a decade ago. :-) -- --Guido van Rossum (python.org/~guido)
On 7/24/2013 6:25 PM, Guido van Rossum wrote:
To reduce the need for 3rd party subprocess creation code, we should have better daemon creation code in the stdlib -- I wrote some damn robust code for this purpose in my previous job, but it never saw the light of day.
What do you call "daemon"? An actual Unix-like daemon?
Yeah, a background process with parent PID 1 and not associated with any terminal group.
There's PEP 3143 and https://pypi.python.org/pypi/python-daemon. I've used it often, with great success. -- Eric.
I checked the python-daemon module: it closes all open file descriptors except 0, 1 and 2. It has a files_preserve attribute to keep some FDs open. It redirects stdin, stdout and stderr to /dev/null and keeps these file descriptors open. If python-daemon is used to execute a new program, the files_preserve list can be used to mark these file descriptors as inherited.

The zdaemon.zdrun module closes all open file descriptors except 0, 1 and 2. It also uses dup2() to redirect stdout and stderr to the write end of a pipe.

Victor
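The dup2() redirection zdaemon performs can be sketched like this (a generic illustration redirecting into a temp file rather than zdaemon's pipe, and not zdaemon's actual code):

```python
import os
import sys
import tempfile

sys.stdout.flush()  # make sure nothing buffered lands in the temp file

# Redirect the process-level stdout (fd 1) into a file via dup2(), then
# restore it -- the same mechanism zdaemon uses with a pipe's write end.
saved = os.dup(1)
with tempfile.TemporaryFile() as tmp:
    os.dup2(tmp.fileno(), 1)
    os.write(1, b"redirected\n")
    os.dup2(saved, 1)          # restore the original stdout
    os.close(saved)
    tmp.seek(0)
    captured = tmp.read()
```

Because dup2() replaces the descriptor at the kernel level, the redirection applies to any child process that inherits fd 1, not just to Python-level writes.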
2013/7/25 Eric V. Smith
On 7/24/2013 6:25 PM, Guido van Rossum wrote:
To reduce the need for 3rd party subprocess creation code, we should have better daemon creation code in the stdlib -- I wrote some damn robust code for this purpose in my previous job, but it never saw the light of day.
What do you call "daemon"? An actual Unix-like daemon?
Yeah, a background process with parent PID 1 and not associated with any terminal group.
There's PEP 3143 and https://pypi.python.org/pypi/python-daemon. I've used it often, with great success.
-- Eric. _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com
On 5 August 2013 22:52, Victor Stinner
I checked the python-daemon module: it closes all open file descriptors except 0, 1, 2. It has a files_preserve attribute to keep some FDs open. It redirects stdin, stdout and stderr to /dev/null and keeps these file descriptors open. If python-daemon is used to execute a new program, the files_preserve list can be used to mark these file descriptors as inherited.
The zdaemon.zdrun module closes all open file descriptors except 0, 1, 2. It also uses dup2() to redirect stdout and stderr to the write end of a pipe.
So closed by default, and directing people towards subprocess and python-daemon if they need to keep descriptors open sounds really promising. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Guido van Rossum
To reduce the need for 3rd party subprocess creation code, we should have better daemon creation code in the stdlib -- I wrote some damn robust code for this purpose in my previous job, but it never saw the light of day.
Work continues on the PEP 3143-compatible ‘python-daemon’, porting it to Python 3 and aiming for inclusion in the standard library. Interested parties are invited to join us on the discussion forums URL:http://lists.alioth.debian.org/cgi-bin/mailman/admin/python-daemon-devel. -- \ “Politics is not the art of the possible. It consists in | `\ choosing between the disastrous and the unpalatable.” —John | _o__) Kenneth Galbraith, 1962-03-02 | Ben Finney
Ben Finney
Work continues on the PEP 3143-compatible ‘python-daemon’, porting it to Python 3 and aiming for inclusion in the standard library.
At PyPI URL:http://pypi.python.org/pypi/python-daemon/, and development co-ordinated at Alioth URL:https://alioth.debian.org/projects/python-daemon/.
Interested parties are invited to join us on the discussion forums
The correct link for the ‘python-daemon-devel’ forum is URL:http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/python-daemon-de.... For announcements only, we have URL:http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/python-daemon-an.... -- \ “This sentence contradicts itself — no actually it doesn't.” | `\ —Douglas Hofstadter | _o__) | Ben Finney
On 25 Jul, 2013, at 4:18, Ben Finney
Ben Finney
writes: Work continues on the PEP 3143-compatible ‘python-daemon’, porting it to Python 3 and aiming for inclusion in the standard library.
At first glance the library appears to close all open files, with an option to exclude some specific file descriptors (that is, you need to pass a list of files that shouldn't be closed). That makes it a lot harder to do some initialization before daemonizing. I prefer to perform at least some initialization early in program startup to be able to give sensible error messages. I've had too many initscripts that claimed to have started a daemon successfully, only to have that daemon stop right away because it noticed a problem right after it detached itself. Ronald
At PyPI URL:http://pypi.python.org/pypi/python-daemon/, and development co-ordinated at Alioth URL:https://alioth.debian.org/projects/python-daemon/.
Interested parties are invited to join us on the discussion forums
The correct link for the ‘python-daemon-devel’ forum is URL:http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/python-daemon-de.... For announcements only, we have URL:http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/python-daemon-an....
-- \ “This sentence contradicts itself — no actually it doesn't.” | `\ —Douglas Hofstadter | _o__) | Ben Finney
On Fri, 26 Jul 2013 09:38:10 +0200
Ronald Oussoren
On 25 Jul, 2013, at 4:18, Ben Finney
wrote: Ben Finney
writes: Work continues on the PEP 3143-compatible ‘python-daemon’, porting it to Python 3 and aiming for inclusion in the standard library.
At first glance the library appears to close all open files, with an option to exclude some specific file descriptors (that is, you need to pass a list of files that shouldn't be closed).
Indeed, it's annoying when you want to set up logging before daemonization starts. I had to hack my way through the logging handlers to find the fd I had to keep open.
That makes it a lot harder to do some initialization before daemonizing. I prefer to perform at least some initialization early in program startup to be able to give sensible error messages. I've had too many initscripts that claimed to have started a daemon successfully, only to have that daemon stop right away because it noticed a problem right after it detached itself.
Agreed. Regards Antoine.
Le Thu, 25 Jul 2013 12:08:18 +1000,
Ben Finney
Guido van Rossum
writes: To reduce the need for 3rd party subprocess creation code, we should have better daemon creation code in the stdlib -- I wrote some damn robust code for this purpose in my previous job, but it never saw the light of day.
Work continues on the PEP 3143-compatible ‘python-daemon’, porting it to Python 3 and aiming for inclusion in the standard library.
The PEP hasn't been formally accepted yet, however. Skimming back through the archives, one sticking point was the default value of the "umask" parameter. Setting the umask to 0 if the user didn't ask for something else is a disaster, security-wise.

Another problem I've had when using it is that the `pidfile` attribute combines the notions of pid file and process lock in one unique attribute. This is quite inflexible when you're using something other than Skip Montanaro's "lockfile" library. I'm using a separate lock based on locket.py: https://github.com/mwilliamson/locket.py because it is based on POSIX advisory locks, and therefore doesn't suffer from the "stale pid file" issues you get when a process (or the whole system) crashes.

Therefore I'd be -1 on the PEP until those issues are alleviated. Regards Antoine.
On 25Jul2013 17:26, Antoine Pitrou
Antoine Pitrou
Therefore I'd be -1 on [PEP 3143] until those issues are alleviated.
Cameron Simpson
I have always found the convention that daemons have a umask of 0 to be utterly bogus, because almost all library code relies on the umask to set default security policy for initial file permissions.
Prone to rant on this at length if required...
Thanks folks. We'd love to have this discussion over at the ‘python-daemon-devel’ discussion forum if you want to have it in more detail. -- \ “We are all agreed that your theory is crazy. The question that | `\ divides us is whether it is crazy enough to have a chance of | _o__) being correct.” —Niels Bohr (to Wolfgang Pauli), 1958 | Ben Finney
2013/7/24 Guido van Rossum
But I'm also ready to propose that all this is such a mess that we *should* change the default fd/handle inheritance to False, *across platforms*, and damn the torpedoes -- i.e. accept breaking all existing 3rd party UNIX code for subprocess creation that bypasses the subprocess module, as well as breaking uses of os.spawn*() on both platforms that depend on FD inheritance beyond stdin/stdout/stderr.
With the new, sane default, all we need instead of PEP 446 is a way to make an FD inheritable after it's been created, which can be a single os.make_inheritable(fd) call that you must apply to the fileno() of the stream or socket object you want inherited (or directly to a FD you created otherwise, e.g. with os.pipe()). On Windows, this should probably only work with os.spawn*(), since otherwise you need *handle* inheritance, not *FD* inheritance, and that's a non-portable concept anyway.
After having written 2 PEPs on the topic, I slowly agree that making all file descriptors non-inheritable is the best *compromise*. It solves most, or all, issues.

The main drawback is the additional syscalls: on some platforms, 2 additional syscalls are needed to make a file descriptor non-inheritable for each creation of a file descriptor. According to my benchmark on the implementation of PEP 433, the overhead of making a file descriptor non-inheritable is between 1% and 3% (7.8 µs => 7.9 or 8.0 µs) on Linux 3.6. http://www.python.org/dev/peps/pep-0433/#performances

Having to make a file descriptor inheritable after creating it non-inheritable is also not optimal. Making it first non-inheritable requires 0, 1 or 2 extra syscalls, and making it inheritable again also requires 1 or 2 syscalls. So f=open(...); os.make_inheritable(f.fileno()) can take up to 5 syscalls (1 open + 4 fcntl), whereas it can be done in only 1 syscall (1 open). One of the motivations of PEPs 433 and 446 is to reduce the number of syscalls, even if the use case was to make sockets *non-inheritable*.

If we consider that the most common case is to use non-inheritable file descriptors, having to call os.make_inheritable() may be acceptable. Windows and recent operating systems support creating a file descriptor non-inheritable directly in a single syscall. ioctl() can also be used instead of fcntl() to use 1 syscall instead of 2.
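The syscall sequences being counted here look like this from Python (POSIX-only sketch; /dev/null stands in for any file):

```python
import fcntl
import os

# Single-syscall variant: close-on-exec set atomically at open() time.
fd1 = os.open("/dev/null", os.O_RDONLY | os.O_CLOEXEC)

# Portable variant costing 2 extra syscalls: read-modify-write the flags.
fd2 = os.open("/dev/null", os.O_RDONLY)
flags = fcntl.fcntl(fd2, fcntl.F_GETFD)
fcntl.fcntl(fd2, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)

# Both descriptors now have close-on-exec set.
cloexec1 = fcntl.fcntl(fd1, fcntl.F_GETFD) & fcntl.FD_CLOEXEC
cloexec2 = fcntl.fcntl(fd2, fcntl.F_GETFD) & fcntl.FD_CLOEXEC
os.close(fd1)
os.close(fd2)
```

The O_CLOEXEC path is also race-free with respect to a concurrent fork()+exec() in another thread, which the two-step fcntl() sequence is not.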
We can fix multiprocessing and anything else in the stdlib that this breaks, I presume.
The CGI code relies on inheritance of file descriptors 0, 1 and 2, which are pipes. The file descriptors 0, 1 and 2 are replaced with the pipes using os.dup2(). Victor
Le Fri, 26 Jul 2013 14:08:35 +0200,
Victor Stinner
After having written 2 PEPs on the topic, I slowly agree that making all file descriptors non-inheritable is the best *compromise*. It solves most, or all, issues.
Even stdin/stdout/stderr? I think inheriting them is the sane default.
The main drawback is the additional syscalls: on some platforms, 2 additional syscalls are needed to make a file descriptor non-inheritable for each creation of a file descriptor. According to my benchmark on the implementation of PEP 433, the overhead of making a file descriptor non-inheritable is between 1% and 3% (7.8 µs => 7.9 or 8.0 µs) on Linux 3.6.
1% and 3% of what? You're telling us there's a 0.1µs overhead. It's positively tiny. Regards Antoine.
2013/7/26 Antoine Pitrou
The main drawback is the additional syscalls: on some platforms, 2 additional syscalls are needed to make a file descriptor non-inheritable for each creation of a file descriptor. According to my benchmark on the implementation of PEP 433, the overhead of making a file descriptor non-inheritable is between 1% and 3% (7.8 µs => 7.9 or 8.0 µs) on Linux 3.6.
1% and 3% of what? You're telling us there's a 0.1µs overhead. It's positively tiny.
Copy-paste of the link:

"""
On Linux, setting the close-on-exec flag has a low overhead on performances. Results of bench_cloexec.py on Linux 3.6:

- close-on-exec flag not set: 7.8 µs
- O_CLOEXEC: 1% slower (7.9 µs)
- ioctl(): 3% slower (8.0 µs)
- fcntl(): 3% slower (8.0 µs)
"""

The overhead is between 0.1 and 0.2 µs (100 and 200 ns) according to my micro-benchmark. "python -c pass" takes 19,000 µs (0.019 sec) on my PC. It uses 207 syscalls creating file descriptors (open() and openat()): 67 are successful, 140 fail with ENOENT. The estimated overhead on "python -c pass" is 0.2*67=13.4 µs (0.07%).

Victor
On Fri, 26 Jul 2013 22:17:47 +0200
Victor Stinner
2013/7/26 Antoine Pitrou
The main drawback is the additional syscalls: on some platforms, 2 additional syscalls are needed to make a file descriptor non-inheritable for each creation of a file descriptor. According to my benchmark on the implementation of PEP 433, the overhead of making a file descriptor non-inheritable is between 1% and 3% (7.8 µs => 7.9 or 8.0 µs) on Linux 3.6.
1% and 3% of what? You're telling us there's a 0.1µs overhead. It's positively tiny.
Copy-paste of the link:
""" On Linux, setting the close-on-flag has a low overhead on performances. Results of bench_cloexec.py on Linux 3.6:
- close-on-flag not set: 7.8 us - O_CLOEXEC: 1% slower (7.9 us) - ioctl(): 3% slower (8.0 us) - fcntl(): 3% slower (8.0 us) """
You aren't answering my question: slower than what? Benchmarking is useless if you aren't telling us what exactly you are benchmarking.
The overhead is between 0.1 and 0.2 µs (100 and 200 ns) according to my micro-benchmark.
That's what I figured out (see above). It's tiny. Antoine.
2013/7/26 Antoine Pitrou
On Fri, 26 Jul 2013 22:17:47 +0200
""" On Linux, setting the close-on-flag has a low overhead on performances. Results of bench_cloexec.py on Linux 3.6:
- close-on-flag not set: 7.8 us - O_CLOEXEC: 1% slower (7.9 us) - ioctl(): 3% slower (8.0 us) - fcntl(): 3% slower (8.0 us) """
You aren't answering my question: slower than what?
Ah, you didn't understand the labels. bench_cloexec.py runs a benchmark on os.open(path, os.O_RDONLY, cloexec=False) and os.open(path, os.O_RDONLY, cloexec=True) with different implementations of making the file descriptor non-inheritable.

- close-on-exec flag not set: 7.8 µs => C code: open(path, O_RDONLY)
- O_CLOEXEC: 1% slower (7.9 µs) => C code: open(path, O_RDONLY|O_CLOEXEC); 1% slower than open(path, O_RDONLY)
- ioctl(): 3% slower (8.0 µs) => C code: fd=open(path, O_RDONLY); ioctl(fd, FIOCLEX, 0); 3% slower than open(path, O_RDONLY)
- fcntl(): 3% slower (8.0 µs) => C code: fd=open(path, O_RDONLY); flags = fcntl(fd, F_GETFD); fcntl(fd, F_SETFD, flags | FD_CLOEXEC); 3% slower than open(path, O_RDONLY)

Victor
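A rough way to reproduce this comparison from Python with timeit (POSIX-only, since it uses O_CLOEXEC; absolute numbers are machine-dependent, so none are claimed here):

```python
import os
import timeit

def open_plain():
    # Baseline: plain open() without close-on-exec.
    os.close(os.open("/dev/null", os.O_RDONLY))

def open_cloexec():
    # Variant: close-on-exec set atomically in the same syscall.
    os.close(os.open("/dev/null", os.O_RDONLY | os.O_CLOEXEC))

n = 10000
t_plain = timeit.timeit(open_plain, number=n) / n
t_cloexec = timeit.timeit(open_cloexec, number=n) / n
print("plain:   %.2f us/call" % (t_plain * 1e6))
print("cloexec: %.2f us/call" % (t_cloexec * 1e6))
```

The fcntl()/ioctl() variants from the benchmark could be timed the same way by adding the extra calls inside the helper functions.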
On Sat, 27 Jul 2013 00:18:40 +0200
Victor Stinner
2013/7/26 Antoine Pitrou
: On Fri, 26 Jul 2013 22:17:47 +0200
""" On Linux, setting the close-on-flag has a low overhead on performances. Results of bench_cloexec.py on Linux 3.6:
- close-on-flag not set: 7.8 us - O_CLOEXEC: 1% slower (7.9 us) - ioctl(): 3% slower (8.0 us) - fcntl(): 3% slower (8.0 us) """
You aren't answering my question: slower than what?
Ah, you didn't understand the labels. bench_cloexec.py runs a benchmark on os.open(path, os.O_RDONLY, cloexec=False) and os.open(path, os.O_RDONLY, cloexec=True) with different implementation of making the file descriptor non-inheritable.
close-on-flag not set: 7.8 us => C code: open(path, O_RDONLY)
O_CLOEXEC: 1% slower (7.9 us) => C code: open(path, O_RDONLY|O_CLOEXEC) => 1% slower than open(path, O_RDONLY)
ioctl(): 3% slower (8.0 us) => C code: fd=open(path, O_RDONLY); ioctl(fd, FIOCLEX, 0) => 3% slower than open(path, O_RDONLY)
fcntl(): 3% slower (8.0 us) => C code: fd=open(path, O_RDONLY); flags = fcntl(fd, F_GETFD); fcntl(fd, F_SETFD, flags | FD_CLOEXEC) => 3% slower than open(path, O_RDONLY)
Ok, so I think this is a totally reasonable compromise. People who bother about a 3% slowdown when calling os.open() can optimize the hell out of their code using Cython for all I care :-) Regards Antoine.
On Fri, Jul 26, 2013 at 3:23 PM, Antoine Pitrou wrote:
Ok, so I think this is a totally reasonable compromise.
People who bother about a 3% slowdown when calling os.open() can optimize the hell out of their code using Cython for all I care :-)
+1 ;) and +1 for making the sane default of noinherit / cloexec / whatever-others-call-it by default for all fds/handles ever opened by Python. It stops ignoring the issue (ie: the status quo of matching the default behavior of C as defined in the 1970s)... That is a GOOD thing. :) -gps
On Fri, Jul 26, 2013 at 9:26 PM, Gregory P. Smith wrote:
+1 ;)
and +1 for making the sane default of noinherit / cloexec / whatever-others-call-it by default for all fds/handles ever opened by Python. It stops ignoring the issue (ie: the status quo of matching the default behavior of C as defined in the 1970s)... That is a GOOD thing. :)
Do we even need a new PEP, or should we just do it? Or can we adapt Victor's PEP 446? -- --Guido van Rossum (python.org/~guido)
On 27 July 2013 14:36, Guido van Rossum wrote:
Do we even need a new PEP, or should we just do it? Or can we adapt Victor's PEP 446?
Adapting the PEP sounds good - while I agree with switching to a sane default, I think the daemonisation thread suggests there may need to be a supporting API to help force FDs created by nominated logging handlers to be inherited. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
2013/7/27 Nick Coghlan
Do we even need a new PEP, or should we just do it? Or can we adapt Victor's PEP 446?
Adapting the PEP sounds good - while I agree with switching to a sane default, I think the daemonisation thread suggests there may need to be a supporting API to help force FDs created by nominated logging handlers to be inherited.
Why would a Python logging handler be used in a child process? If the child process is a fresh Python process, it starts with the default logging handlers (no handler). Files opened by the logging module must be closed on exec(). Victor
On Mon, 5 Aug 2013 14:56:06 +0200, Victor Stinner wrote:
2013/7/27 Nick Coghlan:
Do we even need a new PEP, or should we just do it? Or can we adapt Victor's PEP 446?
Adapting the PEP sounds good - while I agree with switching to a sane default, I think the daemonisation thread suggests there may need to be a supporting API to help force FDs created by nominated logging handlers to be inherited.
Why would a Python logging handler be used in a child process? If the child process is a fresh Python process, it starts with the default logging handlers (no handler).
Files opened by the logging module must be closed on exec().
I agree with this. It is only on fork()-without-exec() that the behaviour of python-daemon is actively anti-social. Regards Antoine.
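The distinction Antoine draws matters because FD_CLOEXEC only acts at exec() time: a plain fork() still shares every descriptor with the child. A small sketch on a POSIX system:

```python
import fcntl
import os

r, w = os.pipe()
# Mark the write end close-on-exec
fcntl.fcntl(w, fcntl.F_SETFD,
            fcntl.fcntl(w, fcntl.F_GETFD) | fcntl.FD_CLOEXEC)

pid = os.fork()
if pid == 0:
    # Child after a plain fork(): FD_CLOEXEC has not kicked in yet,
    # the descriptor is inherited and still usable. It would only be
    # closed if the child went on to call one of the exec() functions.
    os.write(w, b"still open after fork")
    os._exit(0)

os.close(w)
os.waitpid(pid, 0)
print(os.read(r, 100))  # b'still open after fork'
os.close(r)
```

So close-on-exec is no protection for fork()-without-exec() daemonisation, which is exactly why python-daemon's behaviour is the contentious case.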
2013/7/27 Guido van Rossum
Do we even need a new PEP, or should we just do it? Or can we adapt Victor's PEP 446?
I can rewrite the PEP 446 to:
- make all file descriptors and handles non-inheritable
- remove the cloexec parameter
- remove everything about non-blocking sockets (O_NONBLOCK); it should be discussed in a new PEP, since it is no longer related to O_CLOEXEC / HANDLE_FLAG_INHERIT
Should I rename os.set_cloexec(fd, cloexec) to os.set_inheritable(fd, inheritable), and os.get_cloexec(fd) to os.get_inheritable(fd)? Or do you prefer a simple os.make_inheritable(fd) with no inheritable parameter? I prefer an explicit parameter, so it is also possible to force a descriptor back to non-inheritable, which also makes sense if the file descriptor was not created by Python. Victor
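For reference, the explicit-parameter API Victor proposes here is what eventually landed in Python 3.4 as os.get_inheritable() / os.set_inheritable(); usage looks like:

```python
import os

r, w = os.pipe()

# Inspect and toggle the flag explicitly; the boolean parameter means a
# descriptor can also be forced back to non-inheritable, even one that
# was not created by Python.
os.set_inheritable(r, True)
assert os.get_inheritable(r)
os.set_inheritable(r, False)
assert not os.get_inheritable(r)

os.close(r)
os.close(w)
```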
On Saturday, July 27, 2013, Victor Stinner wrote:
2013/7/27 Guido van Rossum:
Do we even need a new PEP, or should we just do it? Or can we adapt Victor's PEP 446?
I can rewrite the PEP 446 to:
* make all file descriptors and handles non-inheritable * remove the cloexec parameter * remove everything about non-blocking sockets (O_NONBLOCK); it should be discussed in a new PEP, since it is no longer related to O_CLOEXEC / HANDLE_FLAG_INHERIT
Sounds good.
Should I rename os.set_cloexec(fd, cloexec) to os.set_inheritable(fd, inheritable), and os.get_cloexec(fd) to os.get_inheritable(fd)?
Yes.
Or do you prefer a simple os.make_inheritable(fd) with no inheritable parameter? I prefer an explicit parameter, so it is also possible to force a descriptor back to non-inheritable, which also makes sense if the file descriptor was not created by Python.
Agreed.
Victor
-- --Guido van Rossum (on iPad)
P.S. perhaps more important than a PEP rewrite is a working patch to see how realistic this is. Could you make the alpha 1 release?
-- --Guido van Rossum (on iPad)
2013/7/27 Guido van Rossum
P.S. perhaps more important than a PEP rewrite is a working patch to see how realistic this is. Could you make the alpha 1 release?
I already ran the whole Python test suite with non-inheritable file descriptors when I developed the PEP 433: it just works. So I'm confident :-) I "just" had to fix the cgi module, and some tests. For example, test_socket checks the exact type of sockets, whereas the SOCK_CLOEXEC flag is present in sockobj.type for non-inheritable sockets created with this flag.
I implemented the *new* PEP 446 (not written yet :-)) in a new repository: http://hg.python.org/features/pep-446 I had to invert the value of cloexec (the inheritable value is just the opposite). The implementation works but is not complete:
- The doc should be reviewed
- test_swap_fds() of test_subprocess fails
- The implementation should be tested on Windows, FreeBSD and Solaris
- I have to check whether _Py_try_set_inheritable() can/must be replaced with _Py_set_inheritable()
The implementation can be seen as a patch and reviewed in the following new issue: http://bugs.python.org/issue18571 Victor
On Fri, Jul 26, 2013 at 5:08 AM, Victor Stinner
After having written 2 PEP on the topic, I slowly agree that make all file descriptors non-inheritable is the best *compromise*. It solves most, or all, issues.
Right.
The main drawback is the additional syscalls: on some platforms, 2 additional syscalls are needed to make a file descriptor non-inheritable, for each creation of a file descriptor. According to my benchmark on the implementation of the PEP 433, the overhead of making a file descriptor non-inheritable is between 1% and 3% (7.8 µs => 7.9 or 8.0 µs) on Linux 3.6. http://www.python.org/dev/peps/pep-0433/#performances
Remember that this is going to be Python 3.4 and newer. AFAICT setting O_CLOEXEC on open works on OSX (at least the man page on OSX 10.8 has it), on newer Linuxes, and the equivalent on Windows. So even if it does cost an extra syscall on older systems, those systems will be obsolete before Python 3.4 becomes mainstream there. And it does look like the syscalls are pretty cheap. I'm also not particularly worried about the cost of syscalls for making a socket (non)blocking -- although we should probably avoid the second fcntl() call if the first call shows the flag is already set the way we want it. -- --Guido van Rossum (python.org/~guido)
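Guido's suggestion of skipping the second fcntl() call when the flag is already right could look like this (a sketch; set_nonblocking is a hypothetical helper, not an existing API):

```python
import fcntl
import os

def set_nonblocking(fd, nonblocking=True):
    """Toggle O_NONBLOCK, skipping F_SETFL when the flag is already right."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    if nonblocking:
        new_flags = flags | os.O_NONBLOCK
    else:
        new_flags = flags & ~os.O_NONBLOCK
    if new_flags != flags:  # only pay for the second syscall when needed
        fcntl.fcntl(fd, fcntl.F_SETFL, new_flags)

r, w = os.pipe()
set_nonblocking(r)
assert fcntl.fcntl(r, fcntl.F_GETFL) & os.O_NONBLOCK
set_nonblocking(r, False)
assert not fcntl.fcntl(r, fcntl.F_GETFL) & os.O_NONBLOCK
os.close(r)
os.close(w)
```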
The multiprocessing module is an example of a use case relying on inheritance of handles. It calls CreateProcess() with bInheritHandles=TRUE to share a pipe between the manager (parent) and the worker (child process).
Note: subprocess and multiprocessing have their own functions to set the HANDLE_FLAG_INHERIT flag: they use DuplicateHandle(), whereas SetHandleInformation() could be used (to reuse the existing handle instead of creating a new one).
2013/7/24 Victor Stinner
Python functions open(), os.open() and os.dup() create file descriptors with the HANDLE_FLAG_INHERIT flag set (cloexec=False), whereas os.pipe() creates 2 file descriptors with the HANDLE_FLAG_INHERIT flag unset (cloexec=True, see also issue #4708). (...) If close_fds=False, handles with the HANDLE_FLAG_INHERIT flag set are inherited, but all file descriptors are still closed except 0, 1 and 2.
Leaking handles in child processes is also an issue on Windows. Some random examples:
http://bugs.python.org/issue17634
"Win32: shutil.copy leaks file handles to child processes"
"Win32's native CopyFile API call doesn't leak file handles to child processes."
http://ghc.haskell.org/trac/ghc/ticket/2650
"The case in which I originally ran into this was
System.Directory.copyFile intermittently reporting a "permission
denied" error for a temp file it was using. I think it was trying to
delete it, but failing because a child process somewhere was hanging
on to the Handle."
According to the issue, GHC calls CreateProcess with
bInheritHandles=TRUE (as Python did until Python 3.2).
http://support.microsoft.com/kb/315939
"This behavior can occur if two threads simultaneously create child
processes and redirect the STD handles through pipes. In this
scenario, there is a race condition during the creation of the pipes
and processes, in which it is possible for one child to inherit file
handles intended for the other child."
=> Python looks correct here: it uses the STARTF_USESTDHANDLES flag (in the STARTUPINFO structure)
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6428742
Java is still calling CreateProcess() with bInheritHandles=TRUE which
causes issues like "6347873: (so) Ports opened with
ServerSocketChannel blocks when using Runtime.exec". Interesting
comment: "Submitter has a point. Very risky to fix." :-/
For the record, the default value of the close_fds parameter was also set to True on Windows by the following changeset:
changeset: 66889:59b43dc34158
user: Gregory P. Smith
The PEP 446 allows you to control which handles are inherited by the child process when you use subprocess with close_fds=False. (The subprocess parameter should be called "close_handles" on Windows to avoid confusion.)
Another advantage of the PEP 446 is that most of the time, the HANDLE_FLAG_INHERIT flag can be set automatically at the creation of the file descriptor (creation of the handle). It is a nice enhancement to fight against race conditions with threads ;-) "Most of the time": for example, sockets are inheritable by default, and the WSA_FLAG_NO_HANDLE_INHERIT flag was only added to Windows Vista. Victor
On 23 July 2013 23:45, Victor Stinner
Said differently: the HANDLE_FLAG_INHERIT flag only has an effect on *handles*, as indicated in its name. On Windows, file *descriptors* are never inherited (are always closed) in child processes. I don't think that it is possible to inherit file descriptors on Windows.
That is correct - handles are the OS-level concept, fds are implemented in the CRT. So code that uses raw Windows APIs to create a new process won't have any means to inherit fds.
The fix would be simpler if Python provided the handle of a file object (e.g. via a method) and if open() supported opening a handle, as it does with file descriptors on UNIX.
That would give a similar level of functionality to Unix. Whether it is used sufficiently often to be worth it, is a separate question, of course... Paul
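As a side note, the CRT-level mapping Paul describes is partly exposed today through the msvcrt module: msvcrt.get_osfhandle() and msvcrt.open_osfhandle() convert between fds and handles, though there is no method on file objects themselves. A Windows-only sketch:

```python
import os
import sys

if sys.platform == "win32":
    import msvcrt

    with open(sys.executable, "rb") as f:
        # fd -> underlying OS handle of an existing Python file object
        handle = msvcrt.get_osfhandle(f.fileno())

        # handle -> a new C-runtime fd wrapping the same handle;
        # note both fds now refer to one handle, so only close one of them
        fd = msvcrt.open_osfhandle(handle, os.O_RDONLY)
```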
participants (11):
- Antoine Pitrou
- Ben Finney
- Cameron Simpson
- Eric V. Smith
- Gregory P. Smith
- Guido van Rossum
- Nick Coghlan
- Paul Moore
- Richard Oudkerk
- Ronald Oussoren
- Victor Stinner