Proposal for a new function "open_noinherit" to avoid problems with subprocesses and security risks
I'd like to propose a new function "open_noinherit" or maybe even a new mode flag "n" for the builtin "open" (see the footnote about the names). The new function should work exactly like the builtin "open", with one difference: the open file is not inherited by any child processes (whereas files opened with "open" are inherited). The new function can be implemented (basically) using os.O_NOINHERIT on MS Windows and fcntl / FD_CLOEXEC on POSIX. I will post a working Python implementation next week.

There are five reasons for the proposal:

1) The builtin "open" causes unexpected problems in conjunction with subprocesses, in particular in multi-threaded programs. It can cause file permission errors in the subprocess or in the current process. On Microsoft Windows, some of the possible file permission errors are not documented by Microsoft (thus very few programs written for Windows will react properly).

2) Inheriting open file handles by subprocesses is a security risk.

3) For the developer, finding "cause and effect" is *very* hard, in particular in multi-threaded programs, when the errors occur only in race conditions.

4) The problems arise in some of the standard library modules as well, e.g. shutil.copyfile.

5) Very few developers are aware of the possible problems.

As a workaround, one can replace open with os.fdopen(os.open(..., ... | os.O_NOINHERIT), ...) on Windows, but that's really ugly, hard to read, may raise a different exception than open (IOError instead of OSError), and needs careful work to take platform-specific code into account.

Here is a single-threaded example to demonstrate the effect:

    import os
    import subprocess

    outf = open("blah.tmp", "wt")
    subprocess.Popen("notepad.exe")  # or whatever program you like, but
                                     # it must be a program that does not
                                     # exit immediately!
    # Now the subprocess has inherited the open file handle.
    # We can still write:
    outf.write("Hello world!\n")
    outf.close()
    # But we cannot rename the file (at least on Windows):
    os.rename("blah.tmp", "blah.txt")
    # This fails with OSError: [Errno 13] Permission denied.
    # Similar problems occur with other file operations on
    # non-Windows platforms.

OK, in this little program one can easily see what is going wrong. But what if the subprocess exits very quickly? Then perhaps you see the OSError, perhaps not - depending on the process scheduler of your operating system.

In a commercial multi-threaded daemon application, the error only occurred under heavy load and was hard to reproduce - and it was even harder to find the cause. That's because cause and effect were in two different threads in two completely different parts of the program:

- Thread A opens a file and starts to write data.
- Thread B starts a subprocess (which inherits the file handle from thread A!).
- Thread A continues writing to the file and closes it.
- And now it's a race condition:
  a) Thread A wants to rename the file.
  b) The subprocess exits.
  If a) happens first: error. If b) happens first: no error.

To make things more complicated, even two subprocesses can disturb each other.

The new function should ideally be implemented in C, because there the GIL would prevent a thread switch between os.open and the fcntl.F_SETFD call. Note that the problem described here arises not only for files, but for sockets as well. See bug 1222790: SimpleXMLRPCServer does not set FD_CLOEXEC.

Once there is an easy-to-use, platform-independent, documented builtin "open_noinherit" (or a new mode flag for "open"), the standard library should be reviewed. For each occurrence of "open" or "file", it should be considered whether it is necessary to inherit the file to subprocesses. If not, it should be replaced with open_noinherit. One example is shutil.copyfile, where open_noinherit should be used instead of open.
The socket module is another candidate, I think - but I'm not sure about that.

A nice effect of using "open_noinherit" is that - in many cases - one no longer needs to specify close_fds=True when calling subprocess.Popen. [Note that close_fds is *terribly* slow if MAX_OPEN_FILES is "big", e.g. 800; see bug 1663329.]

Footnote: While writing this mail, at least 3 times I typed "nonherit" instead of "noinherit". So maybe someone can propose a better name? Or a new mode flag character could be "p" (like "private" or "protected").

Henning
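For reference, the pure-Python prototype described above might look roughly like this. This is only a sketch under the proposal's assumptions: the mode parsing is simplified, the bufsize argument is passed through naively, and on POSIX there is a small window between os.open and the fcntl call during which a thread switch could occur - the very reason a C implementation is preferable.

```python
import os
import sys


def open_noinherit(path, mode="r", bufsize=-1):
    # Sketch of the proposed function: open a file whose handle is
    # NOT inherited by child processes.  Only plain "r"/"w"/"a"
    # modes are handled here; a real patch would parse the full
    # mode string the way fopen does.
    if mode.startswith("w"):
        flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    elif mode.startswith("a"):
        flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND
    else:
        flags = os.O_RDONLY
    if sys.platform == "win32":
        # On Windows the flag must be given at open time.
        fd = os.open(path, flags | os.O_NOINHERIT, 0o666)
    else:
        import fcntl
        fd = os.open(path, flags, 0o666)
        # Race window here: another thread could spawn a subprocess
        # between os.open and the F_SETFD call.
        fcntl.fcntl(fd, fcntl.F_SETFD, fcntl.FD_CLOEXEC)
    return os.fdopen(fd, mode, bufsize)
```
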
Henning von Bargen schrieb:
I'd like to propose a new function "open_noinherit" or maybe even a new mode flag "n" for the builtin "open" (see footnote for the names).
Do you have a patch implementing that feature? I believe it's unimplementable in Python 2.x: open() is mapped to fopen(), which does not support O_NOINHERIT. If you don't want the subprocess to inherit handles, why don't you just specify close_fds=True when creating the subprocess? Regards, Martin
""" OT: Argh - my email address is visible in the posting - I am doomed! """ ----- Original Message -----
Martin v. Löwis wrote:
Do you have a patch implementing that feature? I believe it's unimplementable in Python 2.x: open() is mapped to fopen(), which does not support O_NOINHERIT.
Yes, I have a patch implemented in pure Python. The code is on my workplace PC (I am writing from home now; that's why I said I'll post the code later). The patch uses os.fdopen(os.open(..., ...), ...). It translates IOError into OSError so that it raises the same class of exception as open(). Unfortunately, the patch ignores the bufsize argument, so it is only a prototype at this time.

I know that open() is mapped to fopen() and fopen does not support O_NOINHERIT. Thus a correct patch has to be implemented at the C level. It should use open and fdopen instead of fopen - just like the Python prototype. AFAIK in the C stdlib implementation, fopen is implemented based on open anyway.

BTW, to find out what happens, I had to look at the source distribution for the first time after 3 years of using Python.
If you don't want the subprocess to inherit handles, why don't you just specify close_fds=True when creating the subprocess?
The subprocess module is a great piece of code, but it has its weaknesses. "close_fds" is one of them. subprocess.py fails on MS Windows if I specify close_fds. And it *cannot* be fixed for MS Windows in the subprocess module. This is due to the different way MS Windows handles handles :-) in child process creation:

On POSIX, you can just work through the range of file numbers and close the ones you don't want/need in the subprocess. This is how close_fds works internally: it closes the fds from 3 up to MAX_FD-1, so that only stdin, stdout and stderr are inherited.

On MS Windows, AFAIK (correct me if I am wrong), you can only choose either to inherit handles or not *as a whole* - including stdin, stdout and stderr - when calling CreateProcess. Each handle has a security attribute that specifies whether the handle should be inherited or not - but this has to be specified when creating the handle (in the Windows CreateFile API internally). Thus, on MS Windows, you can either choose to inherit all files opened with "open" plus [stdin, stdout, stderr], or to not inherit any files (meaning even stdin, stdout and stderr will not be inherited).

In a platform-independent program, close_fds is therefore not an option.

Henning
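The POSIX side of close_fds described above can be sketched as follows. Illustrative only: the fd range is capped here so the sketch stays fast, but the real implementation walks the entire range up to SC_OPEN_MAX, which is exactly why close_fds gets so slow when that limit is large (see bug 1663329).

```python
import os


def close_fds_in_child():
    # What close_fds effectively does in the forked child before
    # exec(): close every descriptor above stderr (fd 2), so that
    # only stdin, stdout and stderr survive into the new program.
    try:
        max_fd = os.sysconf("SC_OPEN_MAX")
    except (AttributeError, ValueError, OSError):
        max_fd = 256  # conservative fallback
    for fd in range(3, min(max_fd, 4096)):  # capped for this sketch
        try:
            os.close(fd)
        except OSError:
            pass  # fd was not open
```
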
Yes, I have a patch implemented in pure Python.
I got the code on my workplace PC (now I am writing from home, that's why I said I'll post the code later).
The patch uses os.fdopen(os.open(..., ...), ...). It translates IOError into OSError so that it raises the same class of exception as open().
Hmm. I don't think I could accept such an implementation (whether in Python or in C). That's very hackish.
AFAIK in the C stdlib implementation, fopen is implemented based on open anyway.
Sure - and in turn, open is implemented on CreateFile. However, I don't think I would like to see an fopen implementation in Python. Python 3 will drop stdio entirely; for 2.x, I'd be cautious to change things because that may break other things in an unexpected manner.
On MS Windows, AFAIK (correct me if I am wrong), you can only choose either to inherit handles or not *as a whole* - including stdin, stdout and stderr -, when calling CreateProcess.
I'm not sure. In general, that seems to be true. However, according to the ReactOS sources at http://www.reactos.org/generated/doxygen/dd/dda/dll_2win32_2kernel32_2proces... Windows will duplicate stdin,stdout,stderr from the parent process even if bInheritHandles is false, provided that no handles are specified in the startupinfo, and provided that the program to be started is a console (CUI) program.
Each handle has a security attribute that specifies whether the handle should be inherited or not - but this has to be specified when creating the handle (in the Windows CreateFile API internally).
Not necessarily. You can turn on the flag later, through SetHandleInformation.
Thus, on MS Windows, you can either choose to inherit all files opened with "open" + [stdin, stdout, stderr], or to not inherit any files (meaning even stdin, stdout and stderr will not be inherited).
In a platform-independent program, close_fds is therefore not an option.
... assuming you care about whether stdin, stdout and stderr are inherited by GUI programs. If the child process makes no use of stdin/stdout, you can safely set close_fds to true. Regards, Martin
"Martin v. Löwis" wrote:
Yes, I have a patch implemented in pure Python.
I got the code on my workplace PC (now I am writing from home, that's why I said I'll post the code later).
The patch uses os.fdopen(os.open(..., ...), ...). It translates IOError into OSError so that it raises the same class of exception as open().
Hmm. I don't think I could accept such an implementation (whether in Python or in C). That's very hackish.
Well, if this is your opinion... Take a look at the fopen implementation in stdio's fopen.c:

    /* Found via Google Code Search in the directory
       src/libc/ansi/stdio/fopen.c of
       http://clio.rice.edu/djgpp/win2k/djsr204_alpha.zip */
    FILE *
    fopen(const char *file, const char *mode)
    {
      FILE *f;
      int fd, rw, oflags = 0;
      ...
      fd = open(file, oflags, 0666);
      if (fd < 0)
        return NULL;
      ...
      f->_cnt = 0;
      f->_file = fd;
      f->_bufsiz = 0;
      ...
      return f;
    }

As you can see, at the C level "fopen" is basically "open" with a little code around it to parse flags etc. It's the same kind of hackish code. And (apart from the ignored bufsize argument) the prototype is working fine. I have to admit, though, that I am only using it on regular files.

Anyway, I don't want to argue about the implementation of a patch. The fact is that until now the Python programmer does not have an easy, platform-independent option to open files non-inheritable. As you mentioned yourself, the only way to work around it in a platform-independent manner IS VERY HACKISH. So, shouldn't this hackishness better be hidden in the library instead of leaving it as an exercise to the common programmer?

The kind of errors I mentioned ("permission denied" errors that seem to occur without an obvious reason) have cost me at least two weeks of debugging the hard way (with Process Explorer etc.) and caused my manager to lose his trust in Python altogether... I think it is well worth the effort to keep this trouble away from Python programmers if possible.

And throughout the standard library modules, "open" is used, causing these problems as soon as subprocesses come into play. Apart from shutil.copyfile, other examples of using open that can cause trouble are in socket.py (tell me any good reason why socket handles should be inherited to child processes) and even in logging.py. For example, I used RotatingFileHandler for logging my daemon program's activity. Sometimes the logging itself caused errors, when a still-running child process had inherited the log file handle and log rotation occurred.
AFAIK in the C stdlib implementation, fopen is implemented based on open anyway.
Sure - and in turn, open is implemented on CreateFile. However, I don't think I would like to see an fopen implementation in Python. Python 3 will drop stdio entirely; for 2.x, I'd be cautious to change things because that may break other things in an unexpected manner.
Yeah, if you think it should not be included in 2.x, then the handle inheritance problem should at least be considered in the PEPs [(3116, "New I/O"), (337, "Logging Usage in the Standard Modules")]
On MS Windows, AFAIK (correct me if I am wrong), you can only choose either to inherit handles or not *as a whole* - including stdin, stdout and stderr -, when calling CreateProcess.
I'm not sure. In general, that seems to be true. However, according to the ReactOS sources at
http://www.reactos.org/generated/doxygen/dd/dda/dll_2win32_2kernel32_2proces...
Windows will duplicate stdin,stdout,stderr from the parent process even if bInheritHandles is false, provided that no handles are specified in the startupinfo, and provided that the program to be started is a console (CUI) program.
Each handle has a security attribute that specifies whether the handle should be inherited or not - but this has to be specified when creating the handle (in the Windows CreateFile API internally).
Not necessarily. You can turn on the flag later, through SetHandleInformation.
So do you think that a working "close_fds" could be implemented for Windows as well?

Explicitly turning off the inheritance flag for all child handles except stdin, stdout and stderr in subprocess / popen (the equivalent of what close_fds does for POSIX) - that's what I call hackish. And I doubt that it is possible at all, for two reasons:
- you have to KNOW all the handles;
- due to the different process creation on Windows (there's no fork), you would have to set the inheritance flags afterwards - and all of this is not thread-safe.
Thus, on MS Windows, you can either choose to inherit all files opened with "open" + [stdin, stdout, stderr], or to not inherit any files (meaning even stdin, stdout and stderr will not be inherited).
In a platform-independent program, close_fds is therefore not an option.
... assuming you care about whether stdin, stdout and stderr are inherited by GUI programs. If the child process makes no use of stdin/stdout, you can safely set close_fds to true.
Hmm... In bug 1663329 ("subprocess/popen close_fds perform poor if SC_OPEN_MAX is hi"), you suggested:

"""
- you should set the FD_CLOEXEC flag on all file descriptors you don't want to be inherited, using fcntl(fd, F_SETFD, 1)
"""

Apart from the fact that this is not possible on MS Windows, it won't solve the problem! (Because then I couldn't use all those standard modules that use open *without* FD_CLOEXEC.)

The fact is that the combination of multi-threading, subprocess creation and the standard modules simply *does not work* flawlessly and produces errors that are hard to understand. And probably most programmers are not even aware of the problem. That's the main reason why I posted here. And, in my experience, programs tend to get more complex, and in the future I expect to see more multi-threaded Python programs. So the problem will not vanish - we will see it more often than we like...

Regards, Henning
The kind of errors I mentioned ("permission denied" errors that
seem to occur without an obvious reason) have cost me at least two weeks of debugging the hard way (with ProcessExplorer etc) and caused my manager to loose his trust in Python at all... I think it is well worth the effort to keep this trouble away from the Python programmers if possible.
And throughout the standard library modules, "open" is used, causing these problems as soon as sub-processes come into play.
Apart from shutil.copyfile, other examples of using open that can cause trouble are in socket.py (tell me any good reason why socket handles should be inherited to child processes) and even in logging.py.
For example, I used RotatingFileHandler for logging my daemon program activity. Sometimes, the logging itself caused errors, when a still-running child process had inherited the log file handle and log rotation occured.
I just wanted to express to the group at large that these experiences aren't just Henning's; we spent a *tremendous* amount of time and effort debugging serious problems that arose from file handles getting shared with subprocesses where it wasn't really expected. Specifically, the RotatingFileHandler example above. It blatantly just breaks when subprocesses are used, and it's an extremely obtuse process to discover why.

It was very costly to the company because it came up at a bad time and was *so* obtuse an error. At first it looked like some sort of thread-safety problem, so a lot of prying went into that before we got stumped... after all, we *knew* no other process touched that file, and the logging module (and RotatingFileHandler) claimed and looked thread-safe, so... how could it be having a Permission Denied error when it very clearly is closing the file before rotating it? Eventually the culprit was found, but it was very painful. A couple of similar issues have arisen since, and they're only slightly easier to debug once you are expecting them.

But the fact that the simple and obvious features provided in the stdlib break as a result of you launching a subprocess at some point sorta sucks :) So, yeah. Anything even remotely or vaguely approaching Henning's patch would be really, really appreciated.

--SH
Stephen, thank you for saying it out loud on python-dev. And you know better English words like "tremendous" and "obtuse" (whatever that means :-) that express what a PITA this really is.

When I said it took me two weeks, that's actually not the truth. It was even more. The first problem was with RotatingFileHandler, and just like you, I first thought it was a threading problem. Thus I wrote my own version of RotatingFileHandler, which builds a new log file name with a timestamp instead of renaming the old log files.

Actually, the point when I found out that subprocesses were indeed causing the problems was when I had the program running on about 50 computers (for different clients): for some clients the program would run very well, while for other clients there were often errors. Suddenly it came to my mind that the clients with the errors were those who used a subprocess for sending e-mail via MAPI, whereas the clients who didn't experience problems were those who used smtplib for sending e-mail (no subprocesses).

And then it took me a few days to write my replacement open function and to replace each occurrence of "open" with the replacement function. And then, another few days later, a client told me that the errors *still* occurred (although in rare cases). At first I built a lot of tracing and debugging into the MAPI subprocess "sendmail.exe". Finally I found out that it was actually shutil.copyfile that caused the error. Of course, I hadn't searched for "open" in the whole bunch of standard modules...

Henning
On Sat, Jun 23, 2007 at 08:39:38AM -0700, Stephen Hansen wrote:
I just wanted to express to the group at large that these experiences aren't just Henning's; we spent a *tremendous* amount of time and effort debugging serious problems that arose from file handles getting shared to subprocesses where it wasn't really expected.
I've also encountered this when writing programs that are SCGI servers that do a fork. SCGI is like FastCGI; the HTTP server passes requests to a local server using a custom protocol. If the fork doesn't close the SCGI server port, then Apache does nothing until the forked subprocess exits, because the subprocess is keeping the request socket open and alive.

One fix is to always use subprocess.Popen and specify close_fds=True, which wasn't difficult for me, but I can imagine that an easy way to set close-on-exec would be simpler in other cases.

--amk
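That fix can be demonstrated directly: with close_fds=True the child simply does not see the parent's descriptors. A small sketch (the child here is another Python interpreter rather than an SCGI worker, and parking the descriptor at 50 is just to give it a predictable number):

```python
import os
import subprocess
import sys

f = open(os.devnull)      # stands in for the SCGI server socket
os.dup2(f.fileno(), 50)   # park it at a high, predictable fd number

# The child tries to stat descriptor 50.  With close_fds=True that
# descriptor is closed in the child, so os.fstat raises OSError and
# the child interpreter exits with a non-zero status.
child = subprocess.Popen(
    [sys.executable, "-c", "import os; os.fstat(50)"],
    close_fds=True)
rc = child.wait()  # non-zero: the fd was not inherited
```
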
One fix is to always use subprocess.Popen and specify close_fds=True, which wasn't difficult for me, but I can imagine that an easy way to set close-on-exec would be simpler in other cases.
I think the complaint is not so much about simplicity, but correctness. close_fds also closes stdin/stdout/stderr, which might be undesirable and differs from POSIX. In any case, providing a uniform set-close-on-exec looks fine to me, provided it is implementable on all interesting platforms. I'm -0 on adding "n" to open, and -1 on adding it if it means reimplementing fopen. Regards, Martin
Hi, I think the complaint is not so much about simplicity, but correctness.
close_fds also closes stdin/stdout/stderr, which might be undesirable and differs from POSIX.
According to the docs, stdin/stdout and stderr are not closed ( http://docs.python.org/lib/node529.html) Matthieu
I think the complaint is not so much about simplicity, but correctness. close_fds also closes stdin/stdout/stderr, which might be undesirable and differs from POSIX.
According to the docs, stdin/stdout and stderr are not closed ( http://docs.python.org/lib/node529.html)
I don't get your point: The docs say explicitly "Unix only". Regards, Martin
As you can see, at the C level, basically "fopen" is "open" with a little code around it to parse flags etc. It's the same kind of hackish code.
"little code" is quite an understatement. In Microsoft's C library (which we would have to emulate), the argument parsing of fopen is 120 lines of code. In addition, that code changes across compiler versions (where VS 2005 adds additional error checking).
Anyway, I don't want to argue about the implementation of a patch. The fact is that until now the Python programmer does not have an easy, platform-independent option to open files non-inheritable. As you mentioned yourself, the only way to work around it in a platform-independent manner IS VERY HACKISH. So, shouldn't this hackishness better be hidden in the library instead of leaving it as an exercise to the common programmer?
Putting it into the library is fine. However, we need to find an implementation strategy that meets the user's needs, and is still maintainable. Python 3 will offer a clean solution, deviating entirely from stdio. For 2.x, we need to find a better solution than the one you proposed.
I think it is well worth the effort to keep this trouble away from the Python programmers if possible.
I don't argue about efforts - I argue about your proposed solution.
Apart from shutil.copyfile, other examples of using open that can cause trouble are in socket.py (tell me any good reason why socket handles should be inherited to child processes) and even in logging.py.
On Unix, it is *very* common to inherit socket handles to child processes. The parent process opens the socket, and the child processes perform accept(3). This allows many processes to serve requests on the same port. In Python, SocketServer.Forking*Server rely on this precise capability.
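The pattern described here - the parent opens and listens, the forked children block in accept() on the inherited socket - can be sketched with a small hypothetical helper (start_prefork_server is an illustrative name, not a stdlib API):

```python
import os
import socket


def start_prefork_server(nworkers=2):
    # Parent creates and listens on the socket; each forked child
    # inherits it and blocks in accept() on the *same* listening
    # socket -- the capability SocketServer.Forking*Server relies on.
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", 0))   # ephemeral port
    srv.listen(5)
    for _ in range(nworkers):
        if os.fork() == 0:       # child: serve one connection, then exit
            conn, _ = srv.accept()
            conn.sendall(b"hello from pid %d\n" % os.getpid())
            conn.close()
            os._exit(0)
    return srv                   # parent keeps the listening socket
```

The OS hands each incoming connection to whichever blocked child it picks, which is how the load gets split across worker processes.
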
Sure - and in turn, open is implemented on CreateFile. However, I don't think I would like to see an fopen implementation in Python. Python 3 will drop stdio entirely; for 2.x, I'd be cautious to change things because that may break other things in an unexpected manner.
Yeah, if you think it should not be included in 2.x, then the handle inheritance problem should at least be considered in the PEPs [(3116, "New I/O"), (337, "Logging Usage in the Standard Modules")]
I didn't say that a solution shouldn't be included in 2.x. I said *your* solution shouldn't be. In 3.x, your solution won't apply, since Python won't be using stdio (so fdopen becomes irrelevant).
Each handle has a security attribute that specifies whether the handle should be inherited or not - but this has to be specified when creating the handle (in the Windows CreateFile API internally).
Not necessarily. You can turn on the flag later, through SetHandleInformation.
So do you think that a working "close_fds" could be implemented for Windows as well?
No. close_fds should have the semantics of only closing the handles for that subprocess. SetHandleInformation applies to the parent process, and *all* subprocesses. So this is different from close_fds.
Explicitly turning off the inheritance flag for all child handles except stdin, stdout and stderr in subprocess / popen (the equivalent to what close_fds does for Posix) - that's what I call hackish.
I didn't propose that, and it wouldn't be the equivalent. In POSIX, the closing occurs in the child process. This is not possible on Windows, as there is no fork().
And I doubt that it is possible at all, for two reasons: - you have to KNOW all the handles. - due to the different process creation in Windows (there's no fork), you had to set the inheritance flags afterwards - all this is not thread-safe.
All true, and I did not suggest to integrate SetHandleInformation into subprocess. I *ONLY* claimed that you can change the flag after the file was opened. With that API, it would be possible to provide cross-platform access to the close-on-exec flag. Applications interested in setting it could then set it right after opening the file.
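The cross-platform helper suggested here could look something like the sketch below. set_noinherit is a hypothetical name; the Windows branch calls SetHandleInformation via ctypes as described, and the POSIX branch uses fcntl/FD_CLOEXEC.

```python
import sys


def set_noinherit(fd):
    # Clear the inheritance flag on an already-open descriptor, so
    # that child processes created afterwards do not receive it.
    if sys.platform == "win32":
        import ctypes
        import msvcrt
        HANDLE_FLAG_INHERIT = 0x0001
        handle = msvcrt.get_osfhandle(fd)
        ctypes.windll.kernel32.SetHandleInformation(
            handle, HANDLE_FLAG_INHERIT, 0)
    else:
        import fcntl
        flags = fcntl.fcntl(fd, fcntl.F_GETFD)
        fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
```

An application would call it right after opening the file, e.g. set_noinherit(outf.fileno()).
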
Apart from the fact that this is not possible on MS Windows, it won't solve the problem! (Because then I couldn't use all those standard modules that use open *without* FD_CLOEXEC).
The fact is that the combination ("multi-threading", "subprocess creation", "standard modules") simply *does not work* flawlessly and produces errors that are hard to understand. And probably most progammers are not even aware of the problem. That's the main reason why I posted here.
I don't see how your proposed change solves that. If there was an "n" flag, then the modules in the standard library that open files still won't use it. Regards, Martin
""" My very personal opinion: After a sleepness night, it seems to me that this is not a Python problem (or any other programming language at all). It looks more like an OS design problem (on MS Windows as well as on Linux etc). In an ideal world, when a program asks the OS to start a child process, it should have to explicitly list the handles that should be inherited. """
Martin v. Löwis wrote:
As you can see, at the C level, basically "fopen" is "open" with a little code around it to parse flags etc. It's the same kind of hackish code.
"little code" is quite an understatement. In Microsoft's C library (which we would have to emulate), the argument parsing of fopen is 120 lines of code. In addition, that code changes across compiler versions (where VS 2005 adds additional error checking).
Hmm. Wow!
Anyway, I don't want to argue about the implementation of a patch. The fact is that until now the Python programmer does not have an easy, platform-independent option to open files non-inheritable. As you mentioned yourself, the only way to work around it in a platform-independent manner IS VERY HACKISH. So, shouldn't this hackishness better be hidden in the library instead of leaving it as an exercise to the common programmer?
Putting it into the library is fine. However, we need to find an implementation strategy that meets the user's needs, and is still maintainable.
Python 3 will offer a clean solution, deviating entirely from stdio.
Let me point out that stdio is not the problem. The problem is handle inheritance. So I don't see how this is going to be solved in Python 3 just by not using stdio. Inheritance has to be taken into account regardless of how it is implemented at the C level. And opening a file non-inheritable should be possible in an easy, platform-independent way for the average Python programmer.
For 2.x, we need to find a better solution than the one you proposed.
Stephen, perhaps you can describe the workaround you used? Maybe it is better than mine. Or anyone else?
I think it is well worth the effort to keep this trouble away from the Python programmers if possible.
I don't argue about efforts - I argue about your proposed solution.
Apart from shutil.copyfile, other examples of using open that can cause trouble are in socket.py (tell me any good reason why socket handles should be inherited to child processes) and even in logging.py.
On Unix, it is *very* common to inherit socket handles to child processes. The parent process opens the socket, and the child processes perform accept(3). This allows many processes to serve requests on the same port. In Python, SocketServer.Forking*Server rely on this precise capability.
Ahh, I see. Maybe this is why my HTTP Server sometimes seems to not react when a subprocess is running... If more than one process has a handle for the same socket, how does the OS know which process should react?
Sure - and in turn, open is implemented on CreateFile. However, I don't think I would like to see an fopen implementation in Python. Python 3 will drop stdio entirely; for 2.x, I'd be cautious to change things because that may break other things in an unexpected manner.
Yeah, if you think it should not be included in 2.x, then the handle inheritance problem should at least be considered in the PEPs [(3116, "New I/O"), (337, "Logging Usage in the Standard Modules")]
I didn't say that a solution shouldn't be included in 2.x. I said *your* solution shouldn't be. In 3.x, your solution won't apply, since Python won't be using stdio (so fdopen becomes irrelevant).
See above - please take it into account for Python 3 then.
Each handle has a security attribute that specifies whether the handle should be inherited or not - but this has to be specified when creating the handle (in the Windows CreateFile API internally).
Not necessarily. You can turn on the flag later, through SetHandleInformation.
So do you think that a working "close_fds" could be implemented for Windows as well?
No. close_fds should have the semantics of only closing the handles for that subprocess. SetHandleInformation applies to the parent process, and *all* subprocesses. So this is different from close_fds.
Yes - that's why I doubt that could work. And according to http://support.microsoft.com/kb/190351/en-us in order to capture stdout and stderr of the child process, one has to specify bInheritHandles=TRUE in CreateProcess, with the net effect that you can only choose whether ALL handles (if not explicitly specified otherwise during handle creation) should be inherited, or none of them.
Explicitly turning off the inheritance flag for all child handles except stdin, stdout and stderr in subprocess / popen (the equivalent to what close_fds does for Posix) - that's what I call hackish.
I didn't propose that, and it wouldn't be the equivalent. In POSIX, the closing occurs in the child process. This is not possible on Windows, as there is no fork().
OK - I agree it is not possible. But "avoiding handle inheritance" is what one wants to achieve when specifying close_fds, I think.
And I doubt that it is possible at all, for two reasons: - you have to KNOW all the handles. - due to the different process creation in Windows (there's no fork), you had to set the inheritance flags afterwards - all this is not thread-safe.
All true, and I did not suggest to integrate SetHandleInformation into subprocess. I *ONLY* claimed that you can change the flag after the file was opened.
With that API, it would be possible to provide cross-platform access to the close-on-exec flag. Applications interested in setting it could then set it right after opening the file.
YES - that's exactly why I proposed an open_noinherit function. It is a simple solution for a common problem - such a function, documented in the library, would tell developers that they have to be aware of the problem and that a solution exists (though the implementation is more or less hackish due to platform-specific code).
Apart from the fact that this is not possible on MS Windows, it won't solve the problem! (Because then I couldn't use all those standard modules that use open *without* FD_CLOEXEC).
The fact is that the combination ("multi-threading", "subprocess creation", "standard modules") simply *does not work* flawlessly and produces errors that are hard to understand. And probably most progammers are not even aware of the problem. That's the main reason why I posted here.
I don't see how your proposed change solves that. If there was an "n" flag, then the modules in the standard library that open files still won't use it.
That's why I said the standard library should be reviewed for unintentional handle inheritance through the use of open. Note this is a security risk as well, see http://msdn.microsoft.com/msdnmag/issues/0300/security/ Regards, Henning
Putting it into the library is fine. However, we need to find an implementation strategy that meets the user's needs, and is still maintainable.
Python 3 will offer a clean solution, deviating entirely from stdio.
Let me point out that stdio is not the problem. The problem is handle inheritance. So I don't see how this is going to be solved in Python 3 just by not using stdio.
In Python 3, it would be possible to implement the "n" flag for open(), as we call CreateFile directly.
And opening a file as non-inheritable should be possible in an easy, platform-independent way for the average Python programmer.
I don't see why it is a requirement to *open* the file in non-inheritable mode. Why is it not sufficient to *modify* an open file to have its handle non-inheritable in an easy and platform-independent way?
Maybe this is why my HTTP Server sometimes seems to not react when a subprocess is running... If more than one process has a handle for the same socket, how does the OS know which process should react?
The processes which don't perform accept(), recv(), or select() operations are not considered by the operating system. So if only one process does recv() (say), then this process will read the data. If multiple processes perform accept() (which is a common case), the system selects a process at random. This is desirable, as the system will then automatically split the load across processes, and the listen backlog cannot pile up: if multiple connection requests arrive at the same time, one process will do accept, and then start to process the connection. Then the second process will take the second request, and so on. If multiple processes perform recv(), the system will again choose randomly. This is mostly undesirable, and should be avoided.
With that API, it would be possible to provide cross-platform access to the close-on-exec flag. Applications interested in setting it could then set it right after opening the file.
YES - that's exactly why I proposed an open_noinherit function.
I think I missed that proposal. What would that function do? If you propose it to be similar to the open() function, I'd be skeptical. It's not possible to implement that in thread-safe way if you use SetHandleInformation/ioctl. Regards, Martin
On Jun 24, 2007, at 2:19 PM, Martin v. Löwis wrote:
I don't see why it is a requirement to *open* the file in non-inheritable mode. Why is it not sufficient to *modify* an open file to have its handle non-inheritable in an easy and platform-independent way?
Threads. Consider that you may fork a process on one thread right between the calls to open() and fcntl(F_SETFD, FD_CLOEXEC) on another thread. The only way to be safe is to open the file non-inheritable to start with. Now, it is currently impossible under Linux to open a file descriptor non-inheritable, but they're considering adding that feature (I don't have the thread-refs on me, but it's actually from the last month). The issue is that there's a *bunch* of syscalls that open FDs: this feature would need to be added to all of them, not only "open". It's possible that it makes sense for Python to provide "as good as possible" an implementation. At the least, putting the fcntl call in the same C function as open would fix programs that don't open files / spawn processes outside of the GIL protection. But, like the kernel, this feature then ought to be provided for all APIs that create file descriptors.
With that API, it would be possible to provide cross-platform access to the close-on-exec flag. Applications interested in setting it could then set it right after opening the file.
YES - that's exactly why I proposed an open_noinherit function.
I think I missed that proposal. What would that function do?
If you propose it to be similar to the open() function, I'd be skeptical. It's not possible to implement that in thread-safe way if you use SetHandleInformation/ioctl.
Now I'm confused: are you talking about the same thread-safety situation as I described above? If so, why did you ask why it's not sufficient to modify a handle to be non-inheritable? James
I don't see why it is a requirement to *open* the file in non-inheritable mode. Why is it not sufficient to *modify* an open file to have its handle non-inheritable in an easy and platform-independent way?
Threads. Consider that you may fork a process on one thread right between the calls to open() and fcntl(F_SETFD, FD_CLOEXEC) on another thread. The only way to be safe is to open the file non-inheritable to start with.
No, that's not the only safe way. The application can synchronize the threads, and prevent starting subprocesses during critical regions. Just define a subprocess_lock, and make all of your threads follow the protocol of locking that lock when either opening a new file, or creating a subprocess.
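[Editorial sketch] The locking protocol described here might look roughly like this on POSIX; the names subprocess_lock, open_file_safely, and spawn_safely are hypothetical, and every thread in the application would have to go through these wrappers for the protocol to hold.

```python
import fcntl
import subprocess
import sys
import threading

# One process-wide lock shared by all threads that either open files
# or create subprocesses.
subprocess_lock = threading.Lock()

def open_file_safely(path, mode="r"):
    # Hold the lock across open() + FD_CLOEXEC, so no other thread can
    # spawn a subprocess while the descriptor is still inheritable.
    with subprocess_lock:
        f = open(path, mode)
        flags = fcntl.fcntl(f.fileno(), fcntl.F_GETFD)
        fcntl.fcntl(f.fileno(), fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
        return f

def spawn_safely(args):
    # Subprocess creation takes the same lock, so it can never run
    # inside another thread's open-then-set-flag window.
    with subprocess_lock:
        return subprocess.Popen(args)
```

Note the cost: subprocess creation is serialized against every file open, which is exactly the objection raised below about long-blocking calls such as accept().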
Now, it is currently impossible under linux to open a file descriptor noninheritable, but they're considering adding that feature (I don't have the thread-refs on me, but it's actually from the last month). The issue is that there's a *bunch* of syscalls that open FDs: this feature would need to be added to all of them, not only "open".
Right. That is what makes it difficult inherently on the API level.
It's possible that it makes sense for python to provide "as good as possible" an implementation. At the least, putting the fcntl call in the same C function as open would fix programs that don't open files/spawn processes outside of the GIL protection.
No, that would not work. Python releases the GIL when opening a file (and needs to do so because that may be a long-running operation).
Now I'm confused: are you talking about the same thread-safety situation as I described above?
Yes.
If so, why did you ask why it's not sufficient to modify a handle to be non-inheritable?
Because I wanted to hear what the reasons are to consider that insufficient. I would have expected "ease of use" and such things (perhaps Henning will still bring up other reasons). If thread-safety is a primary concern, then that flag should *not* be added to open(), since it cannot be implemented in a thread-safe manner in a generic way - only the application can perform the proper synchronization. As discussed, there are other handles subject to inheritance, too, and the application would have to use the modify-handle function, anyway, which means it needs to make it thread-safe through explicit locking. Regards, Martin
On Sun, Jun 24, 2007 at 10:48:30PM +0200, "Martin v. Löwis" wrote:
I don't see why it is a requirement to *open* the file in non-inheritable mode. Why is it not sufficient to *modify* an open file to have its handle non-inheritable in an easy and platform-independent way?
Threads. Consider that you may fork a process on one thread right between the calls to open() and fcntl(F_SETFD, FD_CLOEXEC) on another thread. The only way to be safe is to open the file non-inheritable to start with.
No, that's not the only safe way. The application can synchronize the threads, and prevent starting subprocesses during critical regions. Just define a subprocess_lock, and make all of your threads follow the protocol of locking that lock when either opening a new file, or creating a subprocess.
The problem here is that sitting in accept() becomes a critical section. While a thread is sitting in that call, no other thread could start a subprocess. A multithreaded server which uses a 1-thread-per-request model wouldn't be possible, at least not in a reasonable amount of comprehensible code.
Now, it is currently impossible under linux to open a file descriptor noninheritable, but they're considering adding that feature (I don't have the thread-refs on me, but it's actually from the last month). The issue is that there's a *bunch* of syscalls that open FDs: this feature would need to be added to all of them, not only "open".
Right. That is what makes it difficult inherently on the API level.
LWN has had good coverage of the discussion: http://lwn.net/Articles/237722/ Ross
participants (7)
- "Martin v. Löwis"
- A.M. Kuchling
- Henning von Bargen
- James Y Knight
- Matthieu Brucher
- Ross Cohen
- Stephen Hansen