[Python-bugs-list] [ python-Bugs-405831 ] popen3 read fails in blocks

Mon, 12 Mar 2001 06:29:58 -0800

Bugs #405831, was updated on 2001-03-04 06:17
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=405831&group_id=5470

Category: Python Library
Group: Not a Bug
Status: Closed
Priority: 1
Submitted By: Luke Kenneth Casson Leighton
Assigned to: Guido van Rossum
Summary: popen3 read fails in blocks

Initial Comment:
bash$ python2
from popen2 import popen3
from select import select
r,w,e = popen3("bash")
w.write("ls\n")
w.flush()
print select([r],[],[],0)
[r],[],[]
t = r.read(50)
print t
"file1
file2
"
print select([r],[],[],0)
[],[],[]

r,w,e = popen3("bash")
w.write("ls")
w.close()
t = r.read(50)
print t
"file1
file2
[50 bytes of file listings]
"
t = r.read(50)
print t
"file3
file4
[50 more bytes of file listings]
"

print select([r],[],[],0)
[r],[],[]

what is going on????

when you do a w.flush(), the read file descriptor
can get the first 50 bytes, and then select
will *always* return [],[],[] - EVEN if there's
really more data pending to be read.

it is as if there is double-buffering being
done in os.popen3() which select() is failing
to access.

this is a standard-compiled version of python 2.0
on linux mandrake 7.1, if it makes any difference.

luke

----------------------------------------------------------------------

Comment By: Guido van Rossum
Date: 2001-03-12 06:29

Message:
Logged In: YES 
user_id=6380

It's all in the docs.  open() has an optional extra parameter
giving the buffer size; setting it to 0 makes it unbuffered,
and that's what you want.

Now, I don't know why popen3 doesn't make its file object
unbuffered, and that would be a good thing to research in
the source code and then submit a new bug report for.
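
The effect of a buffer size of 0 can be sketched in isolation. The following is a minimal illustration, not code from this report: it assumes Python 3 on a POSIX system (the thread itself concerns Python 2.0), uses a plain os.pipe() in place of popen3, and the function name is hypothetical. With an unbuffered file object, a partial read leaves the remaining bytes in the kernel pipe, so select() still reports the descriptor as readable.

```python
import os
import select


def unbuffered_partial_read(payload=b"x" * 100, first_read=10):
    """Read part of a pipe through an unbuffered file object, then ask select()."""
    rfd, wfd = os.pipe()
    # buffering=0 (binary mode only) yields a raw file object with no read-ahead.
    with os.fdopen(rfd, "rb", 0) as reader:
        try:
            os.write(wfd, payload)           # all bytes now sit in the kernel pipe
            first = reader.read(first_read)  # takes only first_read bytes from the pipe
            ready, _, _ = select.select([reader], [], [], 0)
        finally:
            os.close(wfd)
    # bool(ready) is True when select() still sees pending data.
    return first, bool(ready)


first, still_readable = unbuffered_partial_read()
```

Under these assumptions the unread bytes never leave the kernel, so select() and read() agree about whether data is pending.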

----------------------------------------------------------------------

Comment By: Luke Kenneth Casson Leighton
Date: 2001-03-12 03:11

Message:
Logged In: YES 
user_id=80200

okay.  just performed the following test:

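from select import select
import sys
#f = open("foo.txt")
f = sys.stdin
while 1:
    # select returns three lists; the first is empty when nothing is ready
    (ready,w,e) = select([f],[],[],0)
    if ready:
        z = ready[0].read(50)
        if not z:
            break   # an empty read means end of file
        print z,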

which if you run as python test.py < anyfilename.txt

works absolutely fine!

so, what is it about the file object returned from os.popen3
that makes it different from sys.stdin, that makes doing
read / select impossible on the popen3 file object but
os.read / select on the _file descriptor_ of the popen3 file
object okay?

sorry for being persistent about this :)

----------------------------------------------------------------------

Comment By: Luke Kenneth Casson Leighton
Date: 2001-03-12 02:58

Message:
Logged In: YES 
user_id=80200

hiya guido,

okay.  i understand.  thought about this: to make it "work",
you would have to either switch off the read-ahead buffering
in cases where a number of bytes to be read is specified.

or you would have to emulate select. which would cause other
problems because you can always use the file object's
fileno() method to obtain the file descriptor, directly.

... which... makes... it... uh... difficult.

proposal: can there be a function added to file object,
which can be called to disable the buffering?

... but wait, surely i'm not the only person to have been
caught out by this?

surely there are other programmers who perform read / select
loops on file objects, and this cannot just be limited to
the file objects returned by popen3?

----------------------------------------------------------------------

Comment By: Guido van Rossum
Date: 2001-03-10 11:28

Message:
Logged In: YES 
user_id=6380

Not a bug -- you just can't do what you want there.

The data is being buffered in the file object (which is a
wrapper around a stdio file), and this makes it invisible
for select.
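
The buffering described here can be reproduced without popen3. This is a minimal sketch under stated assumptions: Python 3 on a POSIX system rather than the Python 2.0 of this report, a plain os.pipe() standing in for the child process, and a hypothetical function name. The file object in Python 3 wraps an io buffer rather than a stdio file, but the symptom is the same: the first small read drains the pipe into a user-space buffer that select() cannot see.

```python
import os
import select


def buffered_partial_read(payload=b"x" * 100, first_read=10):
    """Read part of a pipe through a buffered file object, then ask select()."""
    rfd, wfd = os.pipe()
    # Default buffering: the file object keeps a read-ahead buffer in user space.
    with os.fdopen(rfd, "rb") as reader:
        try:
            os.write(wfd, payload)    # all bytes now sit in the kernel pipe
            reader.read(first_read)   # the buffer pulls every pending byte out of the pipe
            # select() only inspects the kernel descriptor, which is now empty.
            ready, _, _ = select.select([reader], [], [], 0)
            # The rest is served from the buffer, so this does not block.
            leftover = reader.read(len(payload) - first_read)
        finally:
            os.close(wfd)
    return bool(ready), leftover


select_saw_data, leftover = buffered_partial_read()
```

In this sketch select() reports nothing readable even though the remaining bytes are still available through the file object, which matches the behaviour in the original report.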

----------------------------------------------------------------------

Comment By: Luke Kenneth Casson Leighton
Date: 2001-03-09 03:34

Message:
Logged In: YES 
user_id=80200

the following test DOES work as expected, which is good news
because i can use this in my [current] project.  the
work-around is to bypass r.read() and use
os.read(r.fileno(), 50), like so:

import os
from select import select

w,r,e = os.popen3("bash")
w.write("ls -al\n")
w.flush()

while select([r],[],[],1) == ([r],[],[]):
    print os.read(r.fileno(), 50),

this would indicate that there is double-buffering or
similar in the r.read() function.  at a guess, it is
something to do with the way that read has to work in
allowing r.read() as well as r.read(50).  so, select doesn't
work because you've already _read_ all the outstanding data,
and stored it in the r object!

which means that really, the r object must support the
select() method correctly, which at the moment it does not.

with this slightly confused reasoning, without having delved
into the code, i'm sure you can work this out.  i have
enough to go on, now :)

luke
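
For readers on a current interpreter, the same workaround (select() on the descriptor plus os.read() instead of the file object's read()) might look roughly like the sketch below. It assumes Python 3 on a POSIX system and swaps in subprocess.Popen, since popen2 and os.popen3 no longer exist there. The function name, the child command, the chunk size and the timeout are all illustrative choices, not part of the original report.

```python
import os
import select
import subprocess
import sys


def read_child_output(argv, chunk=50, timeout=5.0):
    """Collect a child's stdout with select() and os.read() on the raw descriptor."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, bufsize=0)
    fd = proc.stdout.fileno()
    chunks = []
    try:
        while True:
            ready, _, _ = select.select([fd], [], [], timeout)
            if not ready:
                break                  # nothing arrived within the timeout
            data = os.read(fd, chunk)  # goes straight to the kernel, no file-object buffer
            if not data:
                break                  # empty read: the child closed its end of the pipe
            chunks.append(data)
    finally:
        proc.stdout.close()
        proc.wait()
    return b"".join(chunks)


# Hypothetical child command, used only for illustration.
output = read_child_output([sys.executable, "-c", "print('file' * 40)"])
```

Because every byte is taken with os.read(), nothing is ever held in a user-space buffer, so select() stays accurate for the whole loop.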

----------------------------------------------------------------------
