Avoiding file descriptor leakage in subprocess.Popen()
In a long-lived process at work, we started leaking file descriptors. The problem was that the subprocess.Popen._execute_child() method creates two file descriptors through a call to os.pipe(), and after some work it closes them. But an os.read() on one descriptor was interrupted (EINTR), so an exception was raised and the descriptors were never closed... leakage!

This problem is easy to fix (I have a patch that fixes it, and all tests pass ok, see http://bugs.python.org/issue6274). So, why this mail? I just think that the fix is ugly... it's not elegant. It has the following structure:

    errpipe_read, errpipe_write = os.pipe()
    try:
        try:
            .....
            .....
            .....
        finally:
            os.close(errpipe_write)
        .....
        .....
        .....
    finally:
        os.close(errpipe_read)

I just don't like a huge try/finally... but as FDs are just ints, do you think there is a better way to handle it?

Thank you!!

Regards,

--
.    Facundo
Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/
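For reference, the structure above can be filled in as a small runnable sketch (the writes and reads stand in for the real work subprocess does); each close runs even if an exception such as EINTR interrupts the body:

```python
import os

def run_with_pipe():
    # Minimal sketch of the nested try/finally pattern: each os.close()
    # is guaranteed to run even if an exception is raised in its block.
    errpipe_read, errpipe_write = os.pipe()
    try:
        try:
            # stand-in for the real work done with the write end
            os.write(errpipe_write, b"ok")
        finally:
            os.close(errpipe_write)
        # stand-in for reading the child's error report
        data = os.read(errpipe_read, 1024)
    finally:
        os.close(errpipe_read)
    return data
```

The nesting is what guarantees the cleanup ordering: the inner finally closes the write end, the outer one closes the read end.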
Facundo Batista wrote:
I just don't like a huge try/finally... but as FDs are just ints, do you think there is a better way to handle it?
How about a nice 'n shiny context wrapper for the pipe:

    import os

    class Pipe(object):
        def __enter__(self):
            self.read, self.write = os.pipe()
            return self.read, self.write

        def __exit__(self, *args):
            try:
                os.close(self.read)
            finally:
                # make sure that write is closed even if
                # self.read can't be closed
                os.close(self.write)

    with Pipe() as (read, write):
        print read, write

Christian

PS and nit pick: File descriptors are opaque resource handles which just happen to be ints. They should be treated as magic cookies.
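In modern Python the same idea can also be expressed with contextlib.contextmanager instead of a class; a sketch under that assumption, using the same close-both-ends-even-on-error logic:

```python
import os
from contextlib import contextmanager

@contextmanager
def pipe():
    # Yield both ends of a pipe; close them on exit even if the
    # body raises, and close the write end even if closing the
    # read end fails.
    read_fd, write_fd = os.pipe()
    try:
        yield read_fd, write_fd
    finally:
        try:
            os.close(read_fd)
        finally:
            os.close(write_fd)

with pipe() as (r, w):
    os.write(w, b"hi")
    data = os.read(r, 2)
```

The generator form keeps the allocation and the cleanup visually adjacent, which is often easier to audit than a class with separate __enter__/__exit__ methods.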
On Fri, Jun 12, 2009 at 9:06 PM, Christian Heimes<lists@cheimes.de> wrote:
How about a nice 'n shiny context wrapper for the pipe:
I'll do this! Thank you for the suggestion!

BTW, as this is a good way of avoiding the FD leakage, should this context wrapper for pipe() be in the stdlib?

Regards,

--
.    Facundo
Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/
Facundo Batista <facundobatista@gmail.com> wrote:
    errpipe_read, errpipe_write = os.pipe()
    try:
        try:
            .....
            .....
            .....
        finally:
            os.close(errpipe_write)
        .....
        .....
        .....
    finally:
        os.close(errpipe_read)
I just don't like a huge try/finally... but as FDs are just ints, do you think there is a better way to handle it?
I use a convenience function like this, so that GC takes care of the FDs:

    def make_pipe():
        read_fd, write_fd = os.pipe()
        return os.fdopen(read_fd, "r"), os.fdopen(write_fd, "w")

Mark
On 14Jun2009 16:42, Mark Seaborn <mrs@mythic-beasts.com> wrote:
| I use a convenience function like this, so that GC takes care of the FDs:
|
| def make_pipe():
|     read_fd, write_fd = os.pipe()
|     return os.fdopen(read_fd, "r"), os.fdopen(write_fd, "w")

Not guaranteed to be timely. The try/finally at least closes things as control passes out of the relevant scope. I don't think all Pythons do immediate ref-counted GC.

But it's very neat!
--
Cameron Simpson <cs@zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/

Trust the computer industry to shorten Year 2000 to Y2K. It was this thinking that caused the problem in the first place. - Mark Ovens <marko@uk.radan.com>
On Tue, 16 Jun 2009 01:20:53 pm Cameron Simpson wrote:
I don't think all pythons do immediate ref-counted GC.
Jython and IronPython don't. I don't know about PyPy, CLPython, Unladen Swallow, etc. -- Steven D'Aprano
Steven D'Aprano wrote:
On Tue, 16 Jun 2009 01:20:53 pm Cameron Simpson wrote:
I don't think all pythons do immediate ref-counted GC.
Jython and IronPython don't. I don't know about PyPy, CLPython, Unladen Swallow, etc.
PyPy doesn't, Unladen Swallow won't. Michael -- http://www.ironpythoninaction.com/
On Tue, Jun 16, 2009 at 1:21 PM, Michael Foord<fuzzyman@voidspace.org.uk> wrote:
Steven D'Aprano wrote:
On Tue, 16 Jun 2009 01:20:53 pm Cameron Simpson wrote:
I don't think all pythons do immediate ref-counted GC.
Jython and IronPython don't. I don't know about PyPy, CLPython, Unladen Swallow, etc.
PyPy doesn't, Unladen Swallow won't.
Most Lisp implementations, and thus CLPython, also use a tracing GC.

- Willem
Cameron Simpson <cs@zip.com.au> wrote:
On 14Jun2009 16:42, Mark Seaborn <mrs@mythic-beasts.com> wrote:
| I use a convenience function like this, so that GC takes care of the FDs:
|
| def make_pipe():
|     read_fd, write_fd = os.pipe()
|     return os.fdopen(read_fd, "r"), os.fdopen(write_fd, "w")

Not guaranteed to be timely. The try/finally at least closes things as control passes out of the relevant scope. I don't think all Pythons do immediate ref-counted GC.
Yep. I don't mind closing FDs explicitly when it's easy to do so in a try...finally, but it's not always simple.

There are two different problems with non-prompt closing of FDs:

* Whether an FD has been closed is sometimes externally observable. e.g. pipes and sockets notify the other end of the connection; open file and directory FDs prevent filesystems from being unmounted.

* FDs use up space in the process's FD table.

The second problem could be dealt with by running the GC when we get EMFILE, or before any calls that allocate FDs when the FD table is almost full, just as the GC runs when we "run out" of memory.

I wonder if this proposal could help: 'GC & The "Expensive Object" Problem'
http://www.eros-os.org/pipermail/e-lang/1999-May/002590.html

Mark
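The "run the GC on EMFILE" idea can be sketched as a small retry wrapper. This is a hypothetical helper (pipe_with_gc_fallback is not a stdlib function), showing only the control flow:

```python
import errno
import gc
import os

def pipe_with_gc_fallback():
    # Try to allocate a pipe; if the process FD table is full (EMFILE),
    # run a GC pass to reclaim unreferenced file objects whose FDs are
    # still open, then retry once.
    try:
        return os.pipe()
    except OSError as exc:
        if exc.errno != errno.EMFILE:
            raise
        gc.collect()
        return os.pipe()
```

This only helps with the FD-table-exhaustion problem, not with the external observability of a still-open FD, which is why the thread treats the two problems separately.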
On Sat, Jun 13, 2009 at 9:40 AM, Facundo Batista<facundobatista@gmail.com> wrote:
How about a nice 'n shiny context wrapper for the pipe:
I'll do this!
Thank you for the suggestion!
Boo, I can't take this approach, nor the previous one. The reason is very specific to subprocess.py... after creating the FDs it fork()s, and which descriptors must be closed, and when, differs between the parent and child processes. If I take Christian's approach, the test just hangs.

One drawback of the "with" usage is that it closes both FDs at the same time... and in this case that may be a problem, as they're used and closed by different processes at different times... don't know.

I also tried the approach of wrapping the FDs with a file object... this *almost* works... but in the case of a problem in the child process, when a traceback should be written through the pipe to the parent, nothing happens (it seems that the GC just closes the object and the info is not transferred).

So, I'll stick to the code I submitted to the bug, because even if it's ugly, it works.

--
.    Facundo
Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/
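The fork() constraint described above can be made concrete with a minimal Unix-only sketch (not subprocess's actual code): after fork(), the parent and child each close the end they don't use, at different times, so no single with-block can own both FDs:

```python
import os

def fork_and_report():
    # Sketch of the subprocess error-pipe pattern: child writes an
    # error report, parent reads it. Each process closes a *different*
    # end of the pipe, which is why one context manager closing both
    # ends at once doesn't fit here.
    errpipe_read, errpipe_write = os.pipe()
    pid = os.fork()
    if pid == 0:
        # child: uses only the write end
        os.close(errpipe_read)
        os.write(errpipe_write, b"child error info")
        os.close(errpipe_write)
        os._exit(0)
    # parent: close the write end first, so os.read() sees EOF
    # once the child has written and exited
    os.close(errpipe_write)
    try:
        data = os.read(errpipe_read, 1024)
    finally:
        os.close(errpipe_read)
    os.waitpid(pid, 0)
    return data
```

If the parent kept its copy of the write end open, the read could block forever after a silent child exit, which matches the hang described above.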
participants (7)
- Cameron Simpson
- Christian Heimes
- Facundo Batista
- Mark Seaborn
- Michael Foord
- Steven D'Aprano
- Willem Broekema