popen2.Popen3 and slightly large output

Donn Cave donn at drizzle.com
Fri Aug 16 06:52:23 CEST 2002

Quoth mikew at wakerly.com (mike wakerly):
[re full buffer]
| I can't see how. I've tried the obvious route -- setting the
| popen2.Popen3 bufsize to zero showed no difference. (The default of -1
| looks to be the same). I should say that of course the behaviour still
| suggests to me that _some_ buffer for _something_ is not being
| flushed, but I think it is buried much lower in the implementation --
| where I lack the expertise.

I see Jeff Epler has weighed in with a good explanation, hope
you caught that.

| I should have rephrased that to read that no inspection was
| clue-yielding <wink>. I did indeed look at popen2 and friends a few
| times. I got so far as recognizing the input and output as file
| descriptors of pipes, but what part of this sort of connection would
| cause this? If anything, I would expect _smaller_ amounts of output to
| cause this non-flushing behaviour. I'd gathered that the underlying os
| pipe is block-buffered on my os (Linux). What I do not know is how to
| flush this, if this is indeed the buffer in question (and why my
| combination of more lines causes this, while a single character output
| will not..)

By now I'm sorry to say I've entirely forgotten the details of your
problem, but in general the popen2 family has an inherent weakness
with large outputs.  Any pipe can fill up, but when you have two or
more pipes, it's hard to write a program that's robust in the face
of large outputs.  Suppose you want to read stdout and stderr
separately.  If one of them is flooded with a big wad of output,
you have to read some of it before your child process can continue,
and if you're waiting on output from the other pipe - deadlock.
Suppose you want to read stdout and write to stdin - more of the
same kind of problems.  So you have to use select, as Jeff Epler
suggested, and maybe your own buffering.
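
A minimal sketch of that select approach, for reading stdout and stderr
separately without deadlock. It assumes a Unix system (select on pipes)
and uses subprocess.Popen as a stand-in, since the popen2 module was
removed in Python 3. The names drain_both and NOISY_CHILD are made up
for illustration.

```python
import os
import select
import subprocess
import sys

# Hypothetical demo child: floods stderr first, then writes a little to stdout.
# Reading stdout to EOF before touching stderr would deadlock on this child.
NOISY_CHILD = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('x' * 200000); sys.stdout.write('done')",
]


def drain_both(argv):
    """Run argv and collect stdout and stderr, reading whichever is ready."""
    child = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out_fd = child.stdout.fileno()
    err_fd = child.stderr.fileno()
    chunks = {out_fd: [], err_fd: []}
    open_fds = [out_fd, err_fd]

    while open_fds:
        # Block until at least one pipe has data (or has reached EOF).
        readable, _, _ = select.select(open_fds, [], [])
        for fd in readable:
            # os.read bypasses the file object's buffer, so nothing is
            # hidden from the next select() call.
            data = os.read(fd, 4096)
            if data:
                chunks[fd].append(data)
            else:
                open_fds.remove(fd)  # empty read means EOF on this pipe

    child.stdout.close()
    child.stderr.close()
    status = child.wait()
    return b"".join(chunks[out_fd]), b"".join(chunks[err_fd]), status
```

Calling drain_both(NOISY_CHILD) should return both streams in full,
because neither pipe is left unread long enough to fill up and block
the child.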

The other vague general point that's worth mentioning is that each
process - your child process whatever it is, and your Python program -
has two kinds of buffer going on here.  There's the pipe device in
"system" space, and your C library stdio buffers for each file object
in "process" space.  One significant difference is that select() can
see the pipe buffer, but it doesn't see process space buffers on either
end.  If you issue a readline(), the C library may pull all the data
out of the pipe buffer at once; you get one line back, and the rest
waits in process space for the next readline().  Which will never come,
because select() is telling you there's no data left - in the pipe
buffer.  When it starts getting hairy, I throw away the file objects
and just use the file descriptors with posix.read and posix.write
(os.read and os.write), to avoid this kind of trouble.
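
One way to do your own buffering on a bare file descriptor is sketched
below, assuming a Unix system. All the leftover data lives in a
variable the program controls, so select() and the reader never
disagree about what is pending. The name iter_lines is made up for
illustration.

```python
import os
import select


def iter_lines(fd):
    """Yield lines from a raw file descriptor, keeping our own buffer."""
    pending = b""  # bytes read from the pipe but not yet returned as a line
    while True:
        # select() only knows about the kernel pipe buffer; that is fine
        # here because no stdio buffer sits between us and the pipe.
        select.select([fd], [], [])
        data = os.read(fd, 4096)
        if not data:
            break  # EOF: the writer closed its end
        pending += data
        # Everything before the last newline is complete; keep the tail.
        *lines, pending = pending.split(b"\n")
        for line in lines:
            yield line + b"\n"
    if pending:
        yield pending  # final partial line with no trailing newline
```

With a file object you would get the descriptor from its fileno()
method and then stop calling read() or readline() on the object itself,
since mixing the two brings the hidden buffer back.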

	Donn Cave, donn at drizzle.com
