[Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors

Victor Stinner victor.stinner at gmail.com
Mon Jan 14 12:23:13 CET 2013


Hi,

Thanks for your feedback. I have already updated the PEP to address
most of your remarks.

2013/1/13 Charles-François Natali <cf.natali at gmail.com>:
> Also, ISTM that Windows also supports this flag.

Yes it does, see Appendix: Operating system support > Windows.

>> .. note::
>>    OpenBSD older than 5.2 does not close the file descriptor with
>>    close-on-exec flag set if ``fork()`` is used before ``exec()``, but
>>    it works correctly if ``exec()`` is called without ``fork()``.
>
> That would be *really* surprising, are you sure your test case is correct?
> Otherwise it could be a compilation issue, because I simply can't
> believe OpenBSD would ignore the close-on-exec flag.

I had this issue with a short Python script. I will try to reproduce
it with a C program to double check.

>> Impacted modules:
>>
>>  * ``multiprocessing``
>>  * ``socketserver``
>>  * ``subprocess``
>>  * ``tempfile``
>
> Hum, I thought temporary files are already created with the close-on-exec flag.

"Impacted" is maybe not the best word. I mean that these modules
should be modified or that their behaviour may change. It's more of a
TODO list to ensure that these modules are consistent with the PEP.

>> XXX Should ``subprocess.Popen`` set the close-on-exec flag on file XXX
>> XXX descriptors of the constructor's ``pass_fds`` argument?        XXX
>
> What?
> Setting them cloexec would prevent them from being inherited in the
> child process!

Oops, it's just the opposite: pass_fds should (must?) *clear* the flag
:-) (I'm not sure what should be done here.)
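To illustrate what "clearing the flag" means, here is a minimal sketch,
assuming a UNIX platform where the fcntl module is available. It is
not what subprocess does internally; clear_cloexec() and has_cloexec()
are hypothetical helper names used only for this example.

```python
import fcntl
import os


def has_cloexec(fd):
    """Return True if the close-on-exec flag is set on fd."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


def clear_cloexec(fd):
    """Clear the close-on-exec flag so that fd survives exec()."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags & ~fcntl.FD_CLOEXEC)


r, w = os.pipe()
# Set the flag first, as a cloexec-aware caller might have done.
fcntl.fcntl(r, fcntl.F_SETFD,
            fcntl.fcntl(r, fcntl.F_GETFD) | fcntl.FD_CLOEXEC)
was_set = has_cloexec(r)
clear_cloexec(r)
is_cleared = not has_cloexec(r)
os.close(r)
os.close(w)
```

A descriptor listed in pass_fds would need this treatment before
exec() in the child, otherwise it would be closed like any other
close-on-exec descriptor.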

>> Add a new optional ``cloexec`` argument to:
>>
>>  * Maybe also: ``os.open()``, ``os.openpty()``
>
> Open can be passed O_CLOEXEC directly.

See the Portability section of the Rationale: O_CLOEXEC is not
portable, even under Linux! It's tricky to use these atomic flags and
to implement a fallback.
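A rough sketch of such a fallback, assuming a UNIX platform with the
fcntl module. open_cloexec() is a hypothetical helper, not an existing
or proposed function. It tries the atomic flag first, then checks
whether the flag was really applied and sets it by hand if not.

```python
import fcntl
import os


def open_cloexec(path, flags, mode=0o666):
    """Open path and make sure the close-on-exec flag is set."""
    # O_CLOEXEC may be missing from the os module on some platforms.
    o_cloexec = getattr(os, "O_CLOEXEC", 0)
    fd = os.open(path, flags | o_cloexec, mode)
    try:
        fd_flags = fcntl.fcntl(fd, fcntl.F_GETFD)
        if not fd_flags & fcntl.FD_CLOEXEC:
            # Fallback: O_CLOEXEC was unavailable or had no effect.
            # This is not atomic: another thread may fork() and exec()
            # between os.open() and this call.
            fcntl.fcntl(fd, fcntl.F_SETFD, fd_flags | fcntl.FD_CLOEXEC)
    except Exception:
        os.close(fd)
        raise
    return fd


fd = open_cloexec(os.devnull, os.O_RDONLY)
flag_is_set = bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)
os.close(fd)
```

The extra fcntl() call is the kind of additional syscall discussed in
this thread; a real implementation would probably cache whether
O_CLOEXEC works instead of checking every time.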

It looks like most developers agree that we should expose a helper or
something like that to access the "cloexec" feature for os.open(),
os.pipe(), etc.

I propose to add the parameter directly to the os module functions
because I don't want to add new functions just for that. The
underlying OS functions (kernel/libc) don't support keyword arguments,
so I consider cloexec=True explicit enough. The parameter could be a
keyword-only argument; I chose a classic keyword parameter because it's
simpler to handle in C.

If you don't want to add extra syscalls to os functions using an
optional keyword parameter, what do you propose to expose cloexec
features to os.open(), os.pipe(), etc.? New functions? A new module?

>>  * Many functions of the Python standard library creating file
>>    descriptors cannot be changed by this proposal, because adding
>>    a ``cloexec`` optional argument would be surprising and too many
>>    functions would need it. For example, ``os.urandom()`` uses a
>>    temporary file on UNIX, but it calls a function of Windows API on
>>    Windows. Adding a ``cloexec`` argument to ``os.urandom()`` would
>>    not make sense. See `Always set close-on-exec flag`_ for an
>>    incomplete list of functions creating file descriptors.
>>  * Checking if a module creates file descriptors is difficult. For
>>    example, ``os.urandom()`` creates a file descriptor on UNIX to read
>>    ``/dev/urandom`` (and closes it at exit), whereas it is implemented
>>    using a function call on Windows. It is not possible to control
>>    close-on-exec flag of the file descriptor used by ``os.urandom()``,
>>    because ``os.urandom()`` API does not allow it.
>
> I think that the rule of thumb should be simple:
> if a library opens a file descriptor which is not exposed to the user,
> either because it's opened and closed before returning (e.g.
> os.urandom()) or the file descriptor is kept private (e.g. poll()), it
> should be set close-on-exec.

Ok, it looks fair. I agree *but* only if the file descriptor is closed
when the function exits.

select.epoll() doesn't directly return a file descriptor, but an
object having a fileno() method. A server may rely on the inheritance
of this file descriptor, so we cannot set the close-on-exec flag in
this case (if the default is False).

It becomes unclear if a function returns an opaque object which
contains a file descriptor, but the file descriptor is not accessible.
I don't know if Python contains such a function.

>> Always set close-on-exec flag on new file descriptors created by
>> Python. This alternative just changes the default value of the new
>> ``cloexec`` argument.
>
> In a perfect world, all FDs should have been cloexec by default.
> But it's too late now, I don't think we can change the default, it's
> going to break some applications, and would be a huge deviation from
> POSIX (however broken this design decision is).

I disagree. According to all the emails exchanged, only a very few
applications rely on file descriptor inheritance. Most of them are
already using subprocess with pass_fds, and it should be easy to fix
the remaining applications.
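For reference, a minimal sketch of the explicit pattern, assuming a
UNIX platform and Python 3.2 or newer. The child script and variable
names are made up for the example. The write end of a pipe is listed in
pass_fds, so the child keeps it under the same number even though
close_fds is True by default.

```python
import os
import subprocess
import sys

r, w = os.pipe()

# The child writes to the descriptor number given on its command line.
child_code = "import os, sys; os.write(int(sys.argv[1]), b'hello')"

# pass_fds lists the descriptors that must stay open in the child.
proc = subprocess.Popen([sys.executable, "-c", child_code, str(w)],
                        pass_fds=(w,))
exit_code = proc.wait()

os.close(w)
data = os.read(r, 100)
os.close(r)
```

An application written this way does not depend on the default value
of the close-on-exec flag.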

I'm not sure that it's so different from the change made to subprocess
(close_fds=True by default on UNIX since Python 3.2). Do you think
that it would break more applications?

I disagree, but I'm waiting for other votes :-)

>> Add a function to set close-on-exec flag by default
>> ---------------------------------------------------
>>
>> ...
>>  * It is no longer possible to know if the close-on-exec flag will be
>>    set or not just by reading the source code.
>
> That's really a show stopper:
>
> """
> s = socket.socket()
> if os.fork() == 0:
>     # child
>     os.execv('myprog', ['myprog', 'arg1'])
> else:
>     # parent
>     [...]
> """
>
> It would be impossible to know if the socket is inherited, because the
> behavior of the whole program is affected by a (hidden) global
> variable. That's just too wrong.

For most file descriptors, the application doesn't care about inheritance.
For the few file descriptors that must be inherited, or not inherited,
you can use the explicit flag:

s = socket.socket(cloexec=True)

The idea of a global option is to not slow down programs not using
exec() nor fork() at all. It would also allow users to not have to
"fix" their applications if their applications are not "cloexec
compliant".

By the way, I don't know how this parameter will be specified in the
code if the code needs to be compatible with Python < 3.4. With a
check on the Python version?
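One possible shape for such compatibility code, as a sketch only and
assuming a UNIX platform with the fcntl module. It uses feature
detection rather than a version check. The cloexec keyword is the
parameter proposed by the PEP (hypothetical, it may never exist under
that name), and socket_cloexec() is a made-up helper name.

```python
import fcntl
import socket


def socket_cloexec(*args):
    """Create a socket with the close-on-exec flag set."""
    try:
        # Hypothetical: the cloexec keyword is the PEP 433 proposal.
        return socket.socket(*args, cloexec=True)
    except TypeError:
        # The keyword is not supported by this Python version:
        # create the socket normally and set the flag by hand.
        sock = socket.socket(*args)
        fd = sock.fileno()
        flags = fcntl.fcntl(fd, fcntl.F_GETFD)
        fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
        return sock


s = socket_cloexec()
flag_is_set = bool(fcntl.fcntl(s.fileno(), fcntl.F_GETFD)
                   & fcntl.FD_CLOEXEC)
s.close()
```

The fallback branch is not atomic, so it has the same race with
fork() and exec() in other threads as any manual fcntl() call.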

> Also, it has the same drawbacks as global variables: not thread-safe,
> not library-safe (i.e. if two libraries set it to conflicting values,
> you don't know which one is picked up).

What do you mean by "thread-safe"?

"library-safe": only applications "should" use
sys.setdefaultcloexec(). If a library uses this function, I guess that
it would *enable* close-on-exec. If two libraries enable it, it works.
If they disagree... something is wrong :-)

But I agree that it's not the best solution. Setting cloexec to True
(or False) by default is simpler ;-)

> An atfork() module would indeed be really useful, but I don't think it
> should be used for closing file descriptors: file descriptors are a
> scarce resource, ...

I agree but Antoine proposed something like that, and I would like to
list all proposed alternatives. If I don't, someone will ask why the
"atfork" solution was not taken :-)

I will add your argument to explain why this alternative was not chosen.

Victor

