[ python-Bugs-1663329 ] subprocess/popen close_fds perform poor if SC_OPEN_MAX is hi

SourceForge.net noreply at sourceforge.net
Thu Feb 22 21:16:25 CET 2007


Bugs item #1663329, was opened at 2007-02-19 11:17
Message generated for change (Comment added) made by hvbargen
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1663329&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Performance
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: H. von Bargen (hvbargen)
Assigned to: Nobody/Anonymous (nobody)
Summary: subprocess/popen close_fds perform poor if SC_OPEN_MAX is hi

Initial Comment:
If the value of sysconf("SC_OPEN_MAX") is high
and you try to start a subprocess with subprocess.py or os.popen2 with close_fds=True, then starting the other process is very slow.
This boils down to the following code in subprocess.py:
        def _close_fds(self, but):
            for i in xrange(3, MAXFD):
                if i == but:
                    continue
                try:
                    os.close(i)
                except:
                    pass

and to the similar code in popen2.py:
    def _run_child(self, cmd):
        if isinstance(cmd, basestring):
            cmd = ['/bin/sh', '-c', cmd]
        for i in xrange(3, MAXFD):
            try:
                os.close(i)
            except OSError:
                pass

There has been an optimization already (range has been replaced by xrange to reduce memory impact), but I think the problem is that for high values of MAXFD, usually a high percentage of the os.close statements will fail, raising an exception (which is an "expensive" operation).
It has been suggested already to add a C implementation called "rclose" or "close_range" that tries to close all FDs in a given range (min, max) without the overhead of Python exception handling.

I'd like to emphasize that this is not a theoretical problem but a real-world one:
We have a Python application in a production environment on Sun Solaris. Some other software running on the same server needed a high value of 260000 for SC_OPEN_MAX (set with ulimit -n XXX or in some file under /etc; I don't know which one).
Suddenly, calling any other process with subprocess.Popen(..., close_fds=True) took 14 seconds (!) instead of some microseconds.
This caused a huge performance degradation, since the subprocess itself needs only a few seconds.
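
For anyone who wants to check whether they are affected, here is a minimal
sketch that reports the limit in question. The helper name fd_limits is made
up for illustration, and the fallback of 256 is an assumption for platforms
where sysconf cannot report the value. On most Unix systems SC_OPEN_MAX
follows the soft RLIMIT_NOFILE limit set with ulimit -n, but that is not
guaranteed everywhere.

```python
import os
import resource


def fd_limits():
    """Return (SC_OPEN_MAX, soft RLIMIT_NOFILE, hard RLIMIT_NOFILE)."""
    try:
        open_max = os.sysconf("SC_OPEN_MAX")
    except (AttributeError, ValueError, OSError):
        open_max = 256  # assumed fallback if sysconf cannot report the limit
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return open_max, soft, hard


if __name__ == "__main__":
    print("SC_OPEN_MAX=%d soft=%d hard=%d" % fd_limits())
```

A value in the hundreds of thousands means that the close loop quoted above
makes that many os.close calls for every child process.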

See also:
Patches item #1607087 "popen() slow on AIX due to large FOPEN_MAX value".
This contains a fix, but only for AIX - and I think the patch does not support the "but" argument used in subprocess.py.
The correct solution should be coded in C, and should
do the same as the _close_fds routine in subprocess.py.
It could be optimized to make use of (operating-system-specific) system calls to close all handles from (but+1) to MAXFD with "closefrom" or "fcntl" as proposed in the patch.
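
As a rough sketch of what such a routine could look like from the Python
side: os.closerange(fd_low, fd_high), available in Python 2.6 and later,
closes a half-open range of descriptors in C and ignores errors, so no Python
exception is raised for descriptors that are not open. The helper name
close_fds_except and the fallback of 256 are assumptions made for this
example; this is not the code in subprocess.py.

```python
import os


def close_fds_except(but, low=3, high=None):
    """Close every descriptor in [low, high) except `but`.

    Hypothetical helper for illustration; not part of subprocess.py.
    """
    if high is None:
        try:
            high = os.sysconf("SC_OPEN_MAX")
        except (AttributeError, ValueError, OSError):
            high = 256  # assumed fallback when the limit cannot be queried
    if low <= but < high:
        # Two C-level sweeps around the descriptor that must stay open.
        os.closerange(low, but)
        os.closerange(but + 1, high)
    else:
        os.closerange(low, high)
```

This is meant to be called in the child between fork and exec. It removes the
per-descriptor exception overhead, but whether it also reduces the number of
close system calls depends on the platform and the Python version.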


----------------------------------------------------------------------

>Comment By: H. von Bargen (hvbargen)
Date: 2007-02-22 21:16

Message:
Logged In: YES 
user_id=1008979
Originator: YES

Of course I am already closing any files as soon as possible.

I know that I could use FD_CLOEXEC. But this would require that I do it
explicitly for each descriptor that I use in my program. This would be
tedious work and would require platform-specific coding all around the program.
And the whole bunch of Python library functions (e.g. the logging module)
do not use FD_CLOEXEC either.
Right now, more or less the only platform-specific code in the program is
where I call subprocesses, and I would like to keep it that way.
The same is true for the socket module. All sockets are by default
inherited by child processes.
So the only way to prevent unwanted handles from being inherited by child
processes is in fact to specify close_fds=True in subprocess.py.
If you think that a performance patch similar to patch #1607087 makes
no sense, then the close_fds argument should either be marked as deprecated
or at least the documentation should mention that the implementation is
slow for large values of SC_OPEN_MAX.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2007-02-21 19:18

Message:
Logged In: YES 
user_id=21627
Originator: NO

I understand you don't want the subprocess to inherit "incorrect" file
descriptors. However, there are other ways to prevent that from happening:
- you should close file descriptors as soon as you are done with the
files
- you should set the FD_CLOEXEC flag on all file descriptors you don't
want to be inherited, using fcntl(fd, F_SETFD, 1)

I understand that there are cases where neither of these strategies is
practical, but if you follow them, the performance will be much better, as
the closing of unused file descriptors is done in the exec(2) implementation
of the operating system.
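
In Python, the second suggestion could look roughly like the following
sketch. The helper name set_cloexec is made up for illustration. It reads the
current descriptor flags first and ORs in FD_CLOEXEC rather than writing a
literal 1.

```python
import fcntl
import os


def set_cloexec(fd):
    """Mark fd close-on-exec so that exec(2) closes it in the child."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
```

It would have to be called once for every descriptor the program opens, for
example set_cloexec(logfile.fileno()) right after opening a log file.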


----------------------------------------------------------------------

Comment By: H. von Bargen (hvbargen)
Date: 2007-02-21 16:42

Message:
Logged In: YES 
user_id=1008979
Originator: YES

No, I have to use close_fds=True, because I don't want the
subprocess to inherit each and every file descriptor.
This is for two reasons:
i) Security - why should the subprocess be able to work with all the parent
process's files?
ii) Sometimes, for whatever reason, the subprocess (Oracle Reports in this
case) seems to hang. And because it inherited all of the parent's log file
handles, the parent cannot close and remove its log files correctly. This
is the reason why I stumbled upon close_fds at all. BTW on MS Windows, a
similar (but not equivalent) solution was to create the log files as
non-inheritable.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2007-02-21 00:45

Message:
Logged In: YES 
user_id=21627
Originator: NO

Wouldn't it be simpler for you to just not pass close_fds=True to popen?

----------------------------------------------------------------------


