PEP 433: Add cloexec argument to functions creating file descriptors
HTML version:
http://www.python.org/dev/peps/pep-0433/
***
PEP: 433
Title: Add cloexec argument to functions creating file descriptors
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner
PEP: 433 Title: Add cloexec argument to functions creating file descriptors Status: Draft
The PEP is still a draft. I'm sending it to python-dev to get a first review. The main question is the choice between the 3 different options: * don't set close-on-exec flag by default * always set close-on-exec flag * add sys.setdefaultcloexec() to leave the choice to the application Victor
On 13/01/2013 00:41, Victor Stinner wrote:
PEP: 433 Title: Add cloexec argument to functions creating file descriptors Status: Draft The PEP is still a draft. I'm sending it to python-dev to get a first review.
The main question is the choice between the 3 different options:
* don't set close-on-exec flag by default * always set close-on-exec flag * add sys.setdefaultcloexec() to leave the choice to the application
Victor Nice clear explanation.
I think io, meaning _io and _pyio really, would be amongst the impacted modules, and should perhaps be in the examples. (I am currently working on the Jython implementation of the _io module.) It seems to me that io.open, and probably all the constructors, such as _io.FileIO, would need the extra information as a mode or a boolean argument like closefd. This may be a factor in your choice above. Other things I noticed were minor, and I infer that they should wait until principles are settled. Jeff Allen
2013/1/13 Jeff Allen
I think io, meaning _io and _pyio really, would be amongst the impacted modules, and should perhaps be in the examples. (I am currently working on the Jython implementation of the _io module.) It seems to me that io.open, and probably all the constructors, such as _io.FileIO, would need the extra information as a mode or a boolean argument like closefd. This may be a factor in your choice above.
open() is listed in the PEP: open is io.open. I plan to add cloexec parameter to open() and FileIO constructor (and so io.open and _pyio.open). Examples: rawfile = io.FileIO("test.txt", "r", cloexec=True) textfile = open("text.txt", "r", cloexec=True) I started an implementation, you can already test FileIO and open(): http://hg.python.org/features/pep-433 Victor
On 17/01/2013 13:02, Victor Stinner wrote:
I think io, meaning _io and _pyio really, would be amongst the impacted modules, and should perhaps be in the examples. (I am currently working on the Jython implementation of the _io module.) It seems to me that io.open, and probably all the constructors, such as _io.FileIO, would need the extra information as a mode or a boolean argument like closefd. This may be a factor in your choice above. open() is listed in the PEP: open is io.open. I plan to add cloexec
2013/1/13 Jeff Allen
: parameter to open() and FileIO constructor (and so io.open and _pyio.open). Examples: rawfile = io.FileIO("test.txt", "r", cloexec=True) textfile = open("text.txt", "r", cloexec=True) Ok it fell under "too obvious to mention". And my ignorance of v3.x, sorry. (We're still working on 2.7 in Jython, where open is from __builtin__.) Thanks for the reply.
Jeff Allen
Hello,
PEP: 433 Title: Add cloexec argument to functions creating file descriptors
I'm not a native English speaker, but it seems to me that the correct wording should be "parameter" (part of the function definition/prototype, whereas "argument" refers to the actual value supplied).
This PEP proposes to add a new optional argument ``cloexec`` on functions creating file descriptors in the Python standard library. If the argument is ``True``, the close-on-exec flag will be set on the new file descriptor.
It would probably be useful to recap briefly what the close-on-exec flag does. Also, ISTM that Windows also supports this flag. If it does, then "cloexec" might not be the best name, because it refers to the execve() Unix system call. Maybe something like "noinherit" would be clearer (although coming from a Unix background "cloexec" is crystal-clear to me :-).
On UNIX, subprocess closes file descriptors greater than 2 by default since Python 3.2 [#subprocess_close]_. All file descriptors created by the parent process are automatically closed.
("in the child process")
``xmlrpc.server.SimpleXMLRPCServer`` sets the close-on-exec flag of the listening socket, the parent class ``socketserver.BaseServer`` does not set this flag.
As has been discussed earlier, the real issue is that the server socket is not closed in the child process. Setting it cloexec would only add an extra security for multi-threaded programs.
Inherited file descriptors issues ---------------------------------
Closing the file descriptor in the parent process does not close the related resource (file, socket, ...) because it is still open in the child process.
You might want to go through the bug tracker to find examples of such issues, and list them: http://bugs.python.org/issue7213 http://bugs.python.org/issue12786 http://bugs.python.org/issue2320 http://bugs.python.org/issue3006 The list goes on. Some of those examples resulted in deadlocks.
The listening socket of TCPServer is not closed on ``exec()``: the child process is able to get connection from new clients; if the parent closes the listening socket and create a new listening socket on the same address, it would get an "address already is used" error.
See above for the real cause.
Not closing file descriptors can lead to resource exhaustion: even if the parent closes all files, creating a new file descriptor may fail with "too many files" because files are still open in the child process.
You might want to detail the course of events (a child if forked before the parent gets a chance to close the file descriptors... EMFILE).
Leaking file descriptors is a major security vulnerability. An untrusted child process can read sensitive data like passwords and take control of the parent process though leaked file descriptors. It is for example a known vulnerability to escape from a chroot.
You might add a link to this: https://www.securecoding.cert.org/confluence/display/seccode/FIO42-C.+Ensure... It can also result in DoS (if the child process highjacks the server socket and accepts connections). Example of vulnerabilities: http://www.openssh.com/txt/portable-keysign-rand-helper.adv http://www.securityfocus.com/archive/1/348368 http://cwe.mitre.org/data/definitions/403.html
The problem is that these flags and functions are not portable: only recent versions of operating systems support them. ``O_CLOEXEC`` and ``SOCK_CLOEXEC`` flags are ignored by old Linux versions and so ``FD_CLOEXEC`` flag must be checked using ``fcntl(fd, F_GETFD)``. If the kernel ignores ``O_CLOEXEC`` or ``SOCK_CLOEXEC`` flag, a call to ``fcntl(fd, F_SETFD, flags)`` is required to set close-on-exec flag.
.. note:: OpenBSD older 5.2 does not close the file descriptor with close-on-exec flag set if ``fork()`` is used before ``exec()``, but it works correctly if ``exec()`` is called without ``fork()``.
That would be *really* surprising, are your sure your test case is correct? Otherwise it could be a compilation issue, because I simply can't believe OpenBSD would ignore the close-on-exec flag.
This PEP only change the close-on-exec flag of file descriptors created by the Python standard library, or by modules using the standard library. Third party modules not using the standard library should be modified to conform to this PEP. The new ``os.set_cloexec()`` function can be used for example.
Impacted functions:
* ``os.forkpty()`` * ``http.server.CGIHTTPRequestHandler.run_cgi()``
I've opened http://bugs.python.org/issue16945 to rewrite this to use subprocess.
Impacted modules:
* ``multiprocessing`` * ``socketserver`` * ``subprocess`` * ``tempfile``
Hum, I thought temporay file are already created with the close-on-exec flag.
* ``xmlrpc.server`` * Maybe: ``signal``, ``threading``
XXX Should ``subprocess.Popen`` set the close-on-exec flag on file XXX XXX descriptors of the constructor the ``pass_fds`` argument? XXX
What? Setting them cloexec would prevent them from being inherited in the child process!
Add a new optional ``cloexec`` argument to:
* Maybe also: ``os.open()``, ``os.openpty()``
Open can be passed O_CLOEXEC directly.
* Many functions of the Python standard library creating file descriptors are cannot be changed by this proposal, because adding a ``cloexec`` optional argument would be surprising and too many functions would need it. For example, ``os.urandom()`` uses a temporary file on UNIX, but it calls a function of Windows API on Windows. Adding a ``cloexec`` argument to ``os.urandom()`` would not make sense. See `Always set close-on-exec flag`_ for an incomplete list of functions creating file descriptors. * Checking if a module creates file descriptors is difficult. For example, ``os.urandom()`` creates a file descriptor on UNIX to read ``/dev/urandom`` (and closes it at exit), whereas it is implemented using a function call on Windows. It is not possible to control close-on-exec flag of the file descriptor used by ``os.urandom()``, because ``os.urandom()`` API does not allow it.
I think that the rule of thumb should be simple: if a library opens a file descriptor which is not exposed to the user, either because it's opened and closed before returning (e.g. os.urandom()) or the file descriptor is kept private (e.g. poll(), it should be set close-on-exec. Because the FD is not handed over to the user, there's no risk of breaking applications: that's what the glibc does (e.g. for getpwnam(), so you're sure you don't leak an open FD to /etc/passwd).
Always set close-on-exec flag on new file descriptors created by Python. This alternative just changes the default value of the new ``cloexec`` argument.
In a perfect world, all FDS should have been cloexec by default. But it's too late now, I don't think we can change the default, it's going to break some applications, and would be a huge deviation from POSIX (however broken this design decision is).
Add a function to set close-on-exec flag by default ---------------------------------------------------
An alternative is to add also a function to change globally the default behaviour. It would be possible to set close-on-exec flag for the whole application including all modules and the Python standard library. This alternative is based on the `Proposal`_ and adds extra changes.
* It is not more possible to know if the close-on-exec flag will be set or not just by reading the source code.
That's really a show stopper: """ s = socket.socket() if os.fork() == 0: # child os.execve(['myprog', 'arg1]) else: # parent [...] """ It would be impossible to now if the socket is inherited, because the behavior of the all program is affected by an - hidden - global variable. That's just too wrong. Also, it has the same drawbacks as global variables: not thread-safe, not library-safe (i.e. if two libraries set it to conflicting values, you don't know which one is picked up).
Close file descriptors after fork ---------------------------------
This PEP does not fix issues with applications using ``fork()`` without ``exec()``. Python needs a generic process to register callbacks which would be called after a fork, see `Add an 'afterfork' module`_. Such registry could be used to close file descriptors just after a ``fork()``.
An atfork() module would indeed be really useful, but I don't think it should be used for closing file descriptors: file descriptors are a scarce resource, so should be closed as soon as possible, explicitly. Also, you could end up actually leaking file descriptors just by keeping a reference to them in the atfork module (that's why it should probably use weakrefs, but that's another debate). The atfork mecanism is useful for patterns where some resource is acquired/open in a library, and needs reiniatialization in a child process (e.g. locks to avoid deadlocks, or random seed), not to encourage lazy programming.
Applications using inherance of file descriptors ================================================
Most developers don't know that file descriptors are inherited by default. Most programs do not rely on inherance of file descriptors. For example, ``subprocess.Popen`` was changed in Python 3.2 to close all file descriptors greater than 2 in the child process by default. No user complained about this behavior change.
"yet" ;-)
Network servers using fork may want to pass the client socket to the child process. For example, on UNIX a CGI server pass the socket client through file descriptors 0 (stdin) and 1 (stdout) using ``dup2()``. This specific case is not impacted by this PEP because the close-on-exec flag is never set on file descriptors smaller than 3.
CGI servers, inetd servers, etc.
Example of programs taking file descriptors from the parent process using a command line option:
* gpg: ``--status-fd <fd>``, ``--logger-fd <fd>``, etc. * openssl: ``-pass fd:<fd>`` * qemu: ``-add-fd <fd>`` * valgrind: ``--log-fd=<fd>``, ``--input-fd=<fd>``, etc. * xterm: ``-S <fd>``
Hum, yes, but since the official way to start new process is through the subprocess module, I expect that all python code wrapping those programs is broken since subprocess "close_fds" parameter has been changed to "True" in 3.2.
On Sun, Jan 13, 2013 at 2:40 AM, Charles-François Natali
Hello,
PEP: 433 Title: Add cloexec argument to functions creating file descriptors
I'm not a native English speaker, but it seems to me that the correct wording should be "parameter" (part of the function definition/prototype, whereas "argument" refers to the actual value supplied).
Yes, this distinction is now reflected in our glossary as of a month or two ago. Let's try to be consistent with that. :) --Chris
On Sun, Jan 13, 2013 at 8:40 PM, Charles-François Natali
Hello,
PEP: 433 Title: Add cloexec argument to functions creating file descriptors
I'm not a native English speaker, but it seems to me that the correct wording should be "parameter" (part of the function definition/prototype, whereas "argument" refers to the actual value supplied).
This is correct (although it's a subtle distinction that even many native English speakers get wrong). For the PEP, I'd actually avoid assuming the solution in the title and instead say something like "Easier suppression of file descriptor inheritance"
This PEP proposes to add a new optional argument ``cloexec`` on functions creating file descriptors in the Python standard library. If the argument is ``True``, the close-on-exec flag will be set on the new file descriptor.
It would probably be useful to recap briefly what the close-on-exec flag does.
Also, ISTM that Windows also supports this flag. If it does, then "cloexec" might not be the best name, because it refers to the execve() Unix system call. Maybe something like "noinherit" would be clearer (although coming from a Unix background "cloexec" is crystal-clear to me :-).
Indeed, this may be an area where following the underlying standards too closely may not be a good idea. In particular, a *descriptive* flag may be better choice than an imperative one. For example, if we make the flag "sensitive", then the programmer is telling us "this file descriptor is sensitive" and then we get to decide what that means in terms of the underlying OS behaviours like "close-on-exec" and "no-inherit" (as well as deciding whether or not file descriptors are considered sensitive by default). It also means we're free to implement a mechanism that tries to close all sensitive file descriptors in _PyOS_AfterFork. Alternatively, you could flip the sense of the flag and use "inherited" - the scope is then clearly limited to indicating whether or not the file descriptor should be inherited by child processes. Either way, if done with sufficient advance warning and clear advice to users, the "It's a systematic fix to adopt secure-by-default behaviour" excuse may actually let it get by, especially if the appropriate stdlib APIs adopt "inherited-by-default" for the FDs where it is needed (such as subprocess pass_fds).
Add a new optional ``cloexec`` argument to:
* Maybe also: ``os.open()``, ``os.openpty()``
Open can be passed O_CLOEXEC directly.
Indeed, people playing at the os module layer are already expected to have to deal with platform specific behaviour. That's probably a reasonable model to continue here - if people want cross-platform handling of sensitive file descriptors, they either have to handle it themselves or step up a layer of abstraction.
Also, it has the same drawbacks as global variables: not thread-safe, not library-safe (i.e. if two libraries set it to conflicting values, you don't know which one is picked up).
I think it makes sense to consider the two issues separately, while making sure the design in this PEP supports both "given inherited FD by default, mark individual ones that shouldn't be inherited" and "given non-inherited FD by default, mark individual ones for inheritance" Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sun, 13 Jan 2013 21:13:42 +1000
Nick Coghlan
Also, ISTM that Windows also supports this flag. If it does, then "cloexec" might not be the best name, because it refers to the execve() Unix system call. Maybe something like "noinherit" would be clearer (although coming from a Unix background "cloexec" is crystal-clear to me :-).
Indeed, this may be an area where following the underlying standards too closely may not be a good idea. In particular, a *descriptive* flag may be better choice than an imperative one.
For example, if we make the flag "sensitive", then the programmer is telling us "this file descriptor is sensitive" and then we get to decide what that means in terms of the underlying OS behaviours like "close-on-exec" and "no-inherit" (as well as deciding whether or not file descriptors are considered sensitive by default).
It also means we're free to implement a mechanism that tries to close all sensitive file descriptors in _PyOS_AfterFork.
Ouch! This actually shows that "noinherit" is a very bad name. The PEP is about closing fds after exec(), *not* after fork(). So "cloexec" is really the right, precise, non-ambiguous name here. Regards Antoine.
On Sun, Jan 13, 2013 at 9:15 PM, Antoine Pitrou
It also means we're free to implement a mechanism that tries to close all sensitive file descriptors in _PyOS_AfterFork.
Ouch! This actually shows that "noinherit" is a very bad name. The PEP is about closing fds after exec(), *not* after fork(). So "cloexec" is really the right, precise, non-ambiguous name here.
No, 'cloexec' is a terrible name, because, aside from the cryptic opacity of it, it's also wrong on Windows, which doesn't have the fork() vs exec() distinction. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sun, 13 Jan 2013 21:33:30 +1000
Nick Coghlan
On Sun, Jan 13, 2013 at 9:15 PM, Antoine Pitrou
wrote: It also means we're free to implement a mechanism that tries to close all sensitive file descriptors in _PyOS_AfterFork.
Ouch! This actually shows that "noinherit" is a very bad name. The PEP is about closing fds after exec(), *not* after fork(). So "cloexec" is really the right, precise, non-ambiguous name here.
No, 'cloexec' is a terrible name, because, aside from the cryptic opacity of it, it's also wrong on Windows, which doesn't have the fork() vs exec() distinction.
Just because Windows doesn't have the fork() vs exec() distinction doesn't mean "cloexec" is a bad description of the flag: it describes exactly what happens (the fd gets closed when executing an external program). As for the opacity, feel free to propose something better ("close_on_spawn", whatever). But I'm definitely and strongly -1 on "noinherit". Regards Antoine.
On Sun, Jan 13, 2013 at 9:43 PM, Antoine Pitrou
As for the opacity, feel free to propose something better ("close_on_spawn", whatever). But I'm definitely and strongly -1 on "noinherit".
That's the main reason I quite like "sensitive" as a term for this, since it decouples the user statement ("this file descriptor provides access to potentially sensitive information") from the steps the interpreter promises to take to protect that information (such as closing it before executing a different program or ensuring it isn't inherited by child processes). We can then define a glossary entry for "sensitive" that explains the consequences of flagging a descriptor as sensitive on the various operating systems (i.e. setting cloexec on POSIX and noinherit on Windows). As the platforms provide additional security mechanisms, we can provide them without needing to change the user facing APIs. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sun, 13 Jan 2013 22:44:06 +1000
Nick Coghlan
On Sun, Jan 13, 2013 at 9:43 PM, Antoine Pitrou
wrote: As for the opacity, feel free to propose something better ("close_on_spawn", whatever). But I'm definitely and strongly -1 on "noinherit".
That's the main reason I quite like "sensitive" as a term for this, since it decouples the user statement ("this file descriptor provides access to potentially sensitive information") from the steps the interpreter promises to take to protect that information (such as closing it before executing a different program or ensuring it isn't inherited by child processes).
This assumes that some file descriptors are not "sensitive", which sounds a bit weird to me (since a fd will by definition give access to a system resource). What should happen is that *no* file descriptors are inherited on exec(), except for those few ones which are necessary for proper operation of the exec()ed process. (it's not even just a security issue: letting a bound socket open and therefore being unable to re-use the same port is a bug even when security is not a concern) Regards Antoine.
On Sun, Jan 13, 2013 at 11:22 PM, Antoine Pitrou
On Sun, 13 Jan 2013 22:44:06 +1000 Nick Coghlan
wrote: On Sun, Jan 13, 2013 at 9:43 PM, Antoine Pitrou
wrote: As for the opacity, feel free to propose something better ("close_on_spawn", whatever). But I'm definitely and strongly -1 on "noinherit".
That's the main reason I quite like "sensitive" as a term for this, since it decouples the user statement ("this file descriptor provides access to potentially sensitive information") from the steps the interpreter promises to take to protect that information (such as closing it before executing a different program or ensuring it isn't inherited by child processes).
This assumes that some file descriptors are not "sensitive", which sounds a bit weird to me (since a fd will by definition give access to a system resource). What should happen is that *no* file descriptors are inherited on exec(), except for those few ones which are necessary for proper operation of the exec()ed process.
I agree it makes it obvious what the right default behaviour should be: flag every FD as sensitive by default, and pass an argument to say "sensitive=False" when you want to disable Python's automatic protections. However, there's a pragmatic question of whether we're willing to eat the backwards incompatibility of such a change in a feature release, or if it's something that has to wait until Python 4.0, or else be selected via a global flag (in line with the hash randomisation change).
(it's not even just a security issue: letting a bound socket open and therefore being unable to re-use the same port is a bug even when security is not a concern)
Agreed, but it's the security implications that let us even contemplate the backwards compatibility break. We either let inexperienced users continue to write insecure software by default, or we close the loophole and tell experienced users "hey, to upgrade to Python 3.4, you will need to address this change in behaviour". The nice thing is that with enough advance warning, they should be able to update their code to forcibly clear the flag in a way that works even on earlier Python versions. A more conservative approach, based on the steps taken in introducing hash randomisation, would be to expose the setting as an environment variable in 3.4, and then switch the default behaviour in 3.5. So, 3.3: FDs non-sensitive by default, use O_CLOEXEC, noinherit, etc directly for sensitive ones 3.4: FDs non-sensitive by default, set "sensitive=True" to enable selectively, or set PYTHONSENSITIVEFDS to enable globally and "sensitive=False" to disable selectively 3.5: FDs sensitive by default, set "sensitive=False" to enable selectively, no PYTHONSENSITIVEFDS setting (To accelerate the adoption schedule, replace 3.5 with 3.4, and 3.4 with 3.3.1, but require that people set or clear the platform specific flags to work around the default behaviour) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sun, 13 Jan 2013 23:49:32 +1000
Nick Coghlan
(it's not even just a security issue: letting a bound socket open and therefore being unable to re-use the same port is a bug even when security is not a concern)
Agreed, but it's the security implications that let us even contemplate the backwards compatibility break. We either let inexperienced users continue to write insecure software by default, or we close the loophole and tell experienced users "hey, to upgrade to Python 3.4, you will need to address this change in behaviour".
The nice thing is that with enough advance warning, they should be able to update their code to forcibly clear the flag in a way that works even on earlier Python versions.
A more conservative approach, based on the steps taken in introducing hash randomisation, would be to expose the setting as an environment variable in 3.4, and then switch the default behaviour in 3.5.
The "more conservative approach" sounds good to me :-) Regards Antoine.
Nick Coghlan wrote:
Agreed, but it's the security implications that let us even contemplate the backwards compatibility break.
I don't think that's a sufficient criterion for choosing a name. The backwards compatibility issue is a transitional thing, but we'll be stuck with the name forever. IMO the name should simply describe the mechanism, and be neutral as to what the reason might be for using the mechanism. -- Greg
On Mon, Jan 14, 2013 at 8:06 AM, Greg Ewing
Nick Coghlan wrote:
Agreed, but it's the security implications that let us even contemplate the backwards compatibility break.
I don't think that's a sufficient criterion for choosing a name. The backwards compatibility issue is a transitional thing, but we'll be stuck with the name forever.
IMO the name should simply describe the mechanism, and be neutral as to what the reason might be for using the mechanism.
The problem is the mechanism is *not the same* on Windows and POSIX - the Windows mechanism (noinherit) means the file won't be accessible in child processes, the POSIX mechanism (cloexec) means it won't survive exec(). This difference really shows up with multiprocessing, as that uses a bare fork() without exec() on POSIX systems, meaning even flagged descriptors will still be inherited by the child processes. A flag like "sensitive" or "protected" or "private" tells the interpreter "keep this to yourself, as best you can", leaving Python implementors with a certain amount of freedom in the way they translate that into platform appropriate settings (e.g. does "cloexec" or "noinherit" even make sense in Java?) I agree with Glenn's point that "sensitive" in particular is a bad choice, but a more value neutral term like "protect" may still work. We can then explain what it means to protect a file descriptor on the various platforms, and potentially even change the meaning over time as the platforms provide additional capabilities. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Nick Coghlan wrote:
The problem is the mechanism is *not the same* on Windows and POSIX - the Windows mechanism (noinherit) means the file won't be accessible in child processes, the POSIX mechanism (cloexec) means it won't survive exec().
But since there is no such thing as a fork without exec on Windows, the term "cloexec" is still correct. The fd is closed during the exec part of the fork+exec implied by spawning a child process.
This difference really shows up with multiprocessing, as that uses a bare fork() without exec() on POSIX systems, meaning even flagged descriptors will still be inherited by the child processes.
There are a lot of other ways that this difference will be visible as well, so the difference in cloexec behaviour just needs to be documented in the multiprocessing module along with the rest.
I agree with Glenn's point that "sensitive" in particular is a bad choice, but a more value neutral term like "protect" may still work.
I think "protect" is *far* too vague. We don't want something so neutral that it gives no clue at all about its meaning. -- Greg
On 01/14/2013 01:44 PM, Greg Ewing wrote:
I think "protect" is *far* too vague. We don't want something so neutral that it gives no clue at all about its meaning.
My shoot-from-the-hip name suggestion: "sterile". (Does not produce offspring.) When a process and a file descriptor love each other *very much*... //arry/
On 1/13/2013 5:49 AM, Nick Coghlan wrote:
I agree it makes it obvious what the right default behaviour should be: flag every FD as sensitive by default, and pass an argument to say "sensitive=False" when you want to disable Python's automatic protections.
"sensitive" is a bad name... because the data passed via an open descriptor that _should_ be passed may well be sensitive, so saying sensitive=False is false... it's just necessary to leave the descriptor open to make things work...
On Sun, Jan 13, 2013 at 8:40 PM, Charles-François Natali
XXX Should ``subprocess.Popen`` set the close-on-exec flag on file XXX XXX descriptors of the constructor the ``pass_fds`` argument? XXX
What? Setting them cloexec would prevent them from being inherited in the child process!
Perhaps Victor meant "clear" rather than "set" in this comment? Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Hi,
Thanks for your feedback, I already updated the PEP for most of your remarks.
2013/1/13 Charles-François Natali
Also, ISTM that Windows also supports this flag.
Yes it does, see Appendix: Operating system support > Windows.
.. note:: OpenBSD older 5.2 does not close the file descriptor with close-on-exec flag set if ``fork()`` is used before ``exec()``, but it works correctly if ``exec()`` is called without ``fork()``.
That would be *really* surprising, are your sure your test case is correct? Otherwise it could be a compilation issue, because I simply can't believe OpenBSD would ignore the close-on-exec flag.
I had this issue with a short Python script. I will try to reproduce it with a C program to double check.
Impacted modules:
* ``multiprocessing`` * ``socketserver`` * ``subprocess`` * ``tempfile``
Hum, I thought temporay file are already created with the close-on-exec flag.
"Impacted" is maybe not the best word. I mean that these modules should be modified or that their behaviour may change. It's more a TODO list to ensure that these modules are consistent with PEP.
XXX Should ``subprocess.Popen`` set the close-on-exec flag on file XXX XXX descriptors of the constructor the ``pass_fds`` argument? XXX
What? Setting them cloexec would prevent them from being inherited in the child process!
Oops, it's just the opposite: pass_fds should (must?) *clear* the flag :-) (I'm not sure of what should be done here.)
Add a new optional ``cloexec`` argument to:
* Maybe also: ``os.open()``, ``os.openpty()``
Open can be passed O_CLOEXEC directly.
See Portability section of Rational: O_CLOEXEC is not portable, even under Linux! It's tricky to use these atomic flags and to implement a fallback. It looks like most developers agree that we should expose an helper or something like that to access the "cloexec" feature for os.open(), os.pipe(), etc. I propose to add it directly in the os module because I don't want to add a new function just for that. (In the kernel/libc) OS functions don't support keyword arguments, so I consider cloexec=True explicit enough. The parameter may be a keyword-only argument, I chose a classic keyword parameter because it's simpler to use them in C. If you don't want to add extra syscalls to os functions using an optional keyword parameter, what do you propose to expose cloexec features to os.open(), os.pipe(), etc.? A new functions? A new module?
* Many functions of the Python standard library creating file descriptors are cannot be changed by this proposal, because adding a ``cloexec`` optional argument would be surprising and too many functions would need it. For example, ``os.urandom()`` uses a temporary file on UNIX, but it calls a function of Windows API on Windows. Adding a ``cloexec`` argument to ``os.urandom()`` would not make sense. See `Always set close-on-exec flag`_ for an incomplete list of functions creating file descriptors. * Checking if a module creates file descriptors is difficult. For example, ``os.urandom()`` creates a file descriptor on UNIX to read ``/dev/urandom`` (and closes it at exit), whereas it is implemented using a function call on Windows. It is not possible to control close-on-exec flag of the file descriptor used by ``os.urandom()``, because ``os.urandom()`` API does not allow it.
I think that the rule of thumb should be simple: if a library opens a file descriptor which is not exposed to the user, either because it's opened and closed before returning (e.g. os.urandom()) or the file descriptor is kept private (e.g. poll(), it should be set close-on-exec.
Ok, it looks fair. I agree *but* only if the file descriptor is closed when the function does exit. select.epoll() doesn't return directly a file descriptor, but an object having a fileno() method. A server may rely on the inherance of this file descriptor, we cannot set close-on-exec flag in this case (if the default is False). It becomes unclear if a function returns an opaque object which contains a file descriptor, but the file descriptor is not accessible. I don't know if Python contains such function.
Always set close-on-exec flag on new file descriptors created by Python. This alternative just changes the default value of the new ``cloexec`` argument.
In a perfect world, all FDS should have been cloexec by default. But it's too late now, I don't think we can change the default, it's going to break some applications, and would be a huge deviation from POSIX (however broken this design decision is).
I disagree, according to all emails exchanged: only a very few applications rely on file descriptor inherance. Most of them are already using subprocess with pass_fds, and it should be easy to fix the last applications. I'm not sure that it's so different than the change on subprocess (close_fds=True by default on UNIX since Python 3.2). Do you think that it would break more applications? I disagree but I'm waiting other votes :-)
Add a function to set close-on-exec flag by default ---------------------------------------------------
... * It is not more possible to know if the close-on-exec flag will be set or not just by reading the source code.
That's really a show stopper:
""" s = socket.socket() if os.fork() == 0: # child os.execve(['myprog', 'arg1]) else: # parent [...] """
It would be impossible to now if the socket is inherited, because the behavior of the all program is affected by an - hidden - global variable. That's just too wrong.
For most file descriptors, the application doesn't care on inherance. For the few file descriptors that must be inherited, or not inherited, you can use the explicit flag: s = socket.socket(cloexec=True) The idea of a global option is to not slow down programs not using exec() nor fork() at all. It would also allow users to not have to "fix" their applications if their applications are not "cloexec compliant". By the way, I don't know how this parameter will be specific in the code if the code needs to be compatible with Python < 3.4. With a check on the Python version?
Also, it has the same drawbacks as global variables: not thread-safe, not library-safe (i.e. if two libraries set it to conflicting values, you don't know which one is picked up).
What do you mean by "thread-safe"? "library-safe": only applications "should" use sys.setdefaultcloexec(). If a library uses this function, I guess that it would *enable* close-on-exec. If two libraries enable it, it works. If they disagree... something is wrong :-) But I agree that it's not the best solution. Setting cloexec to True (or False) by default is simpler ;-)
An atfork() module would indeed be really useful, but I don't think it should be used for closing file descriptors: file descriptors are a scarce resource, ...
I agree but Antoine proposed something like that, and I would like to list all proposed alternatives. If I don't, someone will ask why the "atfork" solution was not taken :-) I will add your argument to explain why this alternative was not chosen. Victor
On Mon, Jan 14, 2013 at 9:23 PM, Victor Stinner
I'm not sure that it's so different than the change on subprocess (close_fds=True by default on UNIX since Python 3.2). Do you think that it would break more applications?
Don't ignore the possible explanation that the intersection of the sets "users with applications affected by that change" and "users migrating to Python 3.x" may currently be close to zero.
I disagree but I'm waiting other votes :-)
I'm a fan of the conservative approach, with an environment variable and command line option to close FDs by default in 3.4 (similar to PYTHONHASHSEED and -R in the pre-3.3 security releases), and the cloexec/noinherit behaviour becoming the default (with no way to turn it off globally) in 3.5. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
2013/1/14 Nick Coghlan
I'm a fan of the conservative approach, with an environment variable and command line option to close FDs by default in 3.4 (similar to PYTHONHASHSEED and -R in the pre-3.3 security releases), and the cloexec/noinherit behaviour becoming the default (with no way to turn it off globally) in 3.5.
Do you mean "environment variable and command line option" *instead
of* a new sys.setdefaultcloexec() function?
An environment variable and a command line option have an advantage
over a function: libraries cannot modify the value at runtime (so 2
libraries cannot set different values :-)).
2013/1/14 Nick Coghlan
Turning off a security feature implicitly isn't a good idea. If someone passes such a descriptor, their child application will fail noisily - it's then up to the developer to decide if they passed the wrong file descriptor, or simply need to ensure the one they passed remains open in the child process.
For my subprocess/pass_fds comment: I wrote it initially while the PEP was proposing to setting close-on-exec flag by default. I will move this comment to the "Set close-on-exec flag by default". Victor
On Mon, Jan 14, 2013 at 9:23 PM, Victor Stinner
XXX Should ``subprocess.Popen`` set the close-on-exec flag on file XXX XXX descriptors of the constructor the ``pass_fds`` argument? XXX
What? Setting them cloexec would prevent them from being inherited in the child process!
Oops, it's just the opposite: pass_fds should (must?) *clear* the flag :-) (I'm not sure of what should be done here.)
Turning off a security feature implicitly isn't a good idea. If someone passes such a descriptor, their child application will fail noisily - it's then up to the developer to decide if they passed the wrong file descriptor, or simply need to ensure the one they passed remains open in the child process. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
2013/1/13 Charles-François Natali
.. note:: OpenBSD older 5.2 does not close the file descriptor with close-on-exec flag set if ``fork()`` is used before ``exec()``, but it works correctly if ``exec()`` is called without ``fork()``.
That would be *really* surprising, are your sure your test case is correct? Otherwise it could be a compilation issue, because I simply can't believe OpenBSD would ignore the close-on-exec flag.
I didn't write a C program yet, but you can test the folllowing Python script. On OpenBSD 4.9 it writes "OS BUG !!!". -- USE_FORK = True import fcntl, os, sys fd = os.open("/etc/passwd", os.O_RDONLY) flags = fcntl.fcntl(fd, fcntl.F_GETFD) flags |= fcntl.FD_CLOEXEC fcntl.fcntl(fd, fcntl.F_SETFD, flags) code = """ import os, sys fd = int(sys.argv[1]) try: os.fstat(fd) except OSError: print("fd %s closed by exec (FD_CLOEXEC works)" % fd) else: print("fd %s not closed by exec: FD_CLOEXEC doesn't work, OS BUG!!!" % fd) """ args = [sys.executable, '-c', code, str(fd)] if USE_FORK: pid = os.fork() if pid: os.waitpid(pid, 0) sys.exit(0) os.execv(args[0], args) -- It works with USE_FORKS = False. Victor
Alternatives ============
Always set close-on-exec flag -----------------------------
Oops, this is completly wrong. This section is basically exactly the same than the PEP proposal, except that the default value of cloexec is set to True. The correct title is: "Set the close-on-exec flag by default" (I just fixed it). Victort
participants (9)
-
Antoine Pitrou
-
Charles-François Natali
-
Chris Jerdonek
-
Glenn Linderman
-
Greg Ewing
-
Jeff Allen
-
Larry Hastings
-
Nick Coghlan
-
Victor Stinner