subprocess woes

kj no.email at please.post
Tue Sep 15 19:14:13 EDT 2009


Upon re-reading my post I realize that I left out some important
details.


In <h8oppp$qo2$1 at reader1.panix.com> kj <no.email at please.post> writes:

>I'm trying to write a function, sort_data, that takes as argument
>the path to a file, and sorts it in place, leaving the last "sentinel"
>line in its original position (i.e. at the end).

I neglected to mention that the files I intend to use this with
are huge (on the order of 1 GB); this is why I want to farm the
work out to GNU sort.

>Here's what I
>have (omitting most error-checking code):

I should have included the following in the quoted code:

import os
import shutil
import tempfile
from subprocess import Popen, PIPE, CalledProcessError

>def sort_data(path, sentinel='.\n'):
>    tmp_fd, tmp = tempfile.mkstemp()
>    out = os.fdopen(tmp_fd, 'wb')
>    cmd = ['/usr/local/bin/sort', '-t', '\t', '-k1,1', '-k2,2']
>    p = Popen(cmd, stdin=PIPE, stdout=out)
>    in_ = open(path, 'r')
>    # copy every line up to (but not including) the sentinel
>    for line in in_:
>        if line == sentinel:
>            break
>        p.stdin.write(line)
>    in_.close()
>    p.stdin.close()
>    retcode = p.wait()
>    if retcode != 0:
>        raise CalledProcessError(retcode, cmd)
>    out.write(sentinel)
>    out.close()
>    shutil.move(tmp, path)


>This works OK, except that it does not catch the stderr from the
>called sort process.  The problem is how to do this.  I want to
>avoid having to create a new file just to capture this stderr
>output.  I would like instead to capture it to an in-memory buffer.
>Therefore I tried using a StringIO object as the stderr parameter
>to Popen, but this resulted in the error "StringIO instance has no
>attribute 'fileno'".

>How can I capture stderr in the scenario depicted above?

>TIA!

>kynn
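
A possible approach, offered as a sketch under assumptions rather than
a definitive answer: Popen needs a real OS-level file descriptor for
each stream, which is why a StringIO object is rejected.  Passing
stderr=PIPE gives the child such a descriptor while still letting the
parent collect the output in memory.  Because stdin is also a pipe
here, the sketch drains stderr in a background thread so the child
cannot stall on a full stderr pipe while the parent is still writing
to stdin.  The helper name run_capturing_stderr and its parameters are
made up for illustration, and the code assumes the lines are bytes (as
they are for binary pipes in current Python).

```python
import subprocess
import threading


def run_capturing_stderr(cmd, lines, out):
    """Feed `lines` (bytes) to cmd's stdin, send its stdout to the open
    file `out`, and return (retcode, captured_stderr_bytes).

    Hypothetical helper: the name and signature are illustrative only.
    """
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=out,
                         stderr=subprocess.PIPE)
    chunks = []

    # Read stderr concurrently; read() returns once the child closes
    # its end of the pipe (normally when it exits).
    reader = threading.Thread(
        target=lambda: chunks.append(p.stderr.read()))
    reader.start()

    try:
        for line in lines:
            p.stdin.write(line)
    finally:
        # Closing stdin signals end-of-input to the child.
        p.stdin.close()

    reader.join()
    p.stderr.close()
    retcode = p.wait()
    return retcode, b''.join(chunks)
```

In sort_data, the lines before the sentinel would be passed as `lines`
and the temporary file as `out`; on a nonzero return code the captured
stderr text could then be included in the error that is raised.  If
the child exits early, the write to stdin would fail with a
broken-pipe error, which this sketch does not handle.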


