Tuning a select() loop for os.popen3()

Fri Dec 30 14:27:16 EST 2005

Hi all... 

I've written a class to provide an interface to popen; I've included
the actual select() loop below.  I'm finding that "sometimes" popen'd
processes take "a really long time" to complete and "other times" I
get incomplete stdout.

E.g:

  - on boxA ffmpeg returns in ~25s; on boxB (comparable hardware,
  identical OS) ~5m.

  - ``ls'' on a directory with 15 nodes returns full stdout; ``ls -R''
  on that same directory (with ~32K nodes beneath) stops after
  4097KB of output.

The code in question is running on Linux 2.6.x; no cross-platform
portability desired.  popen'd commands will never be interactive; I
just wanna read stdout/stderr and perhaps feed a one-shot string via
stdin.

Here's the relevant code (stripped of comments and various OO
setup/output stuff):

# # ## ### ##### ######## ############# #####################
# cut here

  def run(self):
    import os, select, syslog
    (_stdin, _stdout, _stderr) = os.popen3(self.command)

    stdoutChunks = []; stderrChunks = []
    readList = [_stdout, _stderr]
    if self.stdinString != "": writeList = [_stdin]
    else: writeList = []
    readStderr = False; readStdout = False

    i = 0
    while True:
      i += 1
      (r, w, x) = select.select(readList, writeList, [], 1)
      read = ""

      if self.stdinString != "":
        if w:
          bytesWritten = os.write(_stdin.fileno(), self.stdinString)
          writeList.remove(_stdin)
          _stdin.close()
          continue

      if r:
        if _stderr in r:
          readStderr = True
          read = os.read(_stderr.fileno(), 16384)
          if read: stderrChunks.append(read)
          else:    readList.remove(_stderr)
          continue

        elif _stdout in r:
          readStdout = True
          read = os.read(_stdout.fileno(), 16384)
          if read:
            stdoutChunks.append(read)
            syslog.syslog("Command instance read %d from stdout" % len(read))
          else:    readList.remove(_stdout)
          continue

      else:
        if \
               (readStderr and self.dieOnStderr) \
               or \
               readStdout:
          syslog.syslog("Command instance finished")
          break
    return

# cut here
# # ## ### ##### ######## ############# #####################
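
For comparison, here is a minimal sketch of a loop that treats end-of-file
on both pipes, rather than a select() timeout, as the only "finished"
signal.  It is not the class above: it assumes Python 3 and swaps
os.popen3() for subprocess.Popen, and the names run_command, stdin_bytes
and chunk_size are made up for illustration.

```python
import os
import select
import subprocess


def run_command(command, stdin_bytes=b"", chunk_size=16384):
    """Run `command` through the shell; return (stdout, stderr, returncode).

    Hypothetical helper: subprocess.Popen stands in for os.popen3().
    """
    proc = subprocess.Popen(
        command, shell=True,
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
    )
    in_fd = proc.stdin.fileno()
    out_fd = proc.stdout.fileno()
    err_fd = proc.stderr.fileno()

    chunks = {out_fd: [], err_fd: []}
    read_fds = [out_fd, err_fd]
    pending = memoryview(stdin_bytes)
    write_fds = [in_fd] if len(pending) else []
    if not write_fds:
        proc.stdin.close()          # nothing to send: give the child EOF now

    # Keep going until both output pipes hit EOF and stdin is fully sent.
    while read_fds or write_fds:
        # No timeout: block until at least one descriptor is ready.
        r, w, _ = select.select(read_fds, write_fds, [])

        for fd in r:
            data = os.read(fd, chunk_size)
            if data:
                chunks[fd].append(data)
            else:
                read_fds.remove(fd)  # empty read means EOF on this pipe

        if w:
            try:
                # A writable pipe accepts at least PIPE_BUF bytes
                # without blocking, so never offer more than that.
                n = os.write(in_fd, pending[:select.PIPE_BUF])
            except BrokenPipeError:
                n = len(pending)     # child closed its stdin; stop writing
            pending = pending[n:]
            if not len(pending):
                write_fds.remove(in_fd)
                proc.stdin.close()

    proc.stdout.close()
    proc.stderr.close()
    proc.wait()
    return b"".join(chunks[out_fd]), b"".join(chunks[err_fd]), proc.returncode
```

With this shape a slow child only makes the call take longer; it cannot
cut the output short, since the loop never exits on a quiet interval.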

Tweaking (a) the os.read() buffer size and (b) the select() timeout
and testing with ``ls -R'' on a directory with ~ 32K nodes beneath, I
find the following trends:

1.  With a very small os.read() buffer, I get full stdout, but running
time is rather long.  Running time increases as select() timeout
increases.

2.  With a very large os.read() buffer, I get incomplete stdout (but
running time is *very* fast).  As select() timeout increases, I get
better and better results - with a select() timeout of 0.2 I seem to
get reliably full stdout.
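
A possible reading of these two trends: the size passed to os.read() is
only an upper bound.  On a pipe, the call returns whatever is available at
that moment, so a large buffer drains the pipe in one go and the next
select() is more likely to time out while the child is still producing.
The sketch below (Python 3, with an in-process pipe standing in for the
child's stdout) shows the short read and the empty read that marks EOF.

```python
import os

# An in-process pipe stands in for the child's stdout.
r_fd, w_fd = os.pipe()

os.write(w_fd, b"x" * 10)

# Ask for up to 16384 bytes; only the 10 available bytes come back.
data = os.read(r_fd, 16384)
print(len(data))  # prints 10

# Once the writer closes and the pipe is drained, read() returns b"".
os.close(w_fd)
eof = os.read(r_fd, 16384)
print(repr(eof))  # prints b''

os.close(r_fd)
```

So a select() timeout with nothing readable only says "no data right
now"; the empty read is the thing that says "no more data, ever".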

The values used in the code I've pasted above - large buffer, large
select() timeout - seem to perform "well enough"; none of the
previously described problems manifest.  However, ``ls -lR /'' (way
more than 32K nodes) "sometimes" gives incomplete stdout.

My first question, then, is paranoid: I've run all these benchmarks
because the application using this code saw a HUGE performance hit
when we started using popen'd commands which generated "lots of"
output.

Is there anything wrong with the logic in my code?!

Will I see severe performance degradation (or worse, incomplete
stdout/stderr) as system variables change (e.g. system load increases,
popen'd program changes, popen'd program increases workload, etc.)?

Next question - how do I tune the select() timeout and the os.read()
buffer correctly?  Is it *really* per- command, per- system, per-
phase-of-moon voodoo?  Is there a Recommended Setup for such a
select() loop?
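
One option that may sidestep the tuning question entirely, assuming the
subprocess module (Python 2.4 and later) is acceptable in place of
os.popen3(): Popen.communicate() feeds a one-shot string to stdin and
reads stdout and stderr until end-of-file, with no timeout or buffer size
to pick.  The sketch is written for Python 3, and run_with_communicate is
a made-up name.

```python
import subprocess


def run_with_communicate(command, stdin_bytes=None):
    """Hypothetical helper: run `command` via the shell, collect all output."""
    proc = subprocess.Popen(
        command, shell=True,
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
    )
    # communicate() writes stdin_bytes, closes stdin, reads both output
    # pipes to EOF, and waits for the child to exit.
    out, err = proc.communicate(stdin_bytes)
    return out, err, proc.returncode


out, err, status = run_with_communicate("cat", b"one-shot string\n")
```

The trade-off is that all output is held in memory until the child
exits, which matches what the chunk lists in the loop above do anyway.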

Thanks in advance, for insight as well as for tolerating my
long-windedness...

-- 
Christopher DeMarco <cmd at alephant.net>
Alephant Systems (http://alephant.net)
PGP public key at http://pgp.alephant.net
+1-412-708-9660