readline() blocks after select() says there's data??

Donn Cave donn at drizzle.com
Sat Mar 16 00:45:57 EST 2002


Quoth wealthychef at mac.com (wealthychef):
...
| Here's what I mean.  If I do a select.select() and it says, "there's
| data," then I do a readline() on the file object, but select() was
| talking about 2 lines of data, I will lose a line of data, because my
| next select() will say "nothing left."  Is that right?  Because the
| call to readline() somehow scarfs up all the data available and saves
| it for my next call to readline()?  If that's so, then readline()
| sucks, because I should have a way to query it to avoid blocking. 
| Obviously select() and readline() just don't play ball together very
| well...

That's right.  I hesitate to say that you would "lose" input lines
that way, but I suppose something you don't know you have is as good
as lost.  Indeed, there should be a way to query the C stdio buffer.
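
The effect is easy to reproduce. What follows is only a sketch, assuming
a UNIX system and a current Python 3, with os.pipe() standing in for the
Popen3 pipe; the function name demo_hidden_line is made up for
illustration. Two lines are written to the pipe, one readline() is done
on a buffered file object, and then select() is asked again.

```python
import os
import select


def demo_hidden_line():
    """Show a line sitting in the file object's buffer, unseen by select()."""
    r, w = os.pipe()
    os.write(w, b"one\ntwo\n")      # two lines arrive at once

    f = os.fdopen(r, "rb")          # buffered file object over the pipe
    first = f.readline()            # one system read pulls in both lines

    # Poll with a zero timeout: the pipe itself is now empty.
    readable, _, _ = select.select([f], [], [], 0)

    second = f.readline()           # served from the buffer, does not block

    os.close(w)
    f.close()
    return first, bool(readable), second


if __name__ == "__main__":
    print(demo_hidden_line())
```

The second line is not gone, but select() has no way of knowing it is
there, because it only looks at the descriptor and not at the buffer.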

|> The point with file descriptor is to use system I/O functions on the
|> device, and avoid buffered C I/O.  Basically because select is a
|> system I/O function.  If there were a C analogue to select, then you
|> could use it with C buffered file objects, but there is no such thing.
|> 
|> When select tells you "this thing is ready to read", it means the
|> device is ready for a system level read(), as in os.read(fd, bufsize).
|> So do that, and you'll get what select was telling you about.  It's
|> really simple.  The rules are the same, if you get an empty string
|> it's at "end of file" (the pipe closed.)
|
| I just want to be sure I understand this.  You're saying that
| os.read() somehow gets what select.select() was talking about, but that
| f.fromchild.readline() will read less than that?

It might return less than that, yes.  Don't say "somehow" - the
reasons for this are explained above with crystalline clarity.
The system functions like select() and os.read() are as simple as
they could be.  The C stdio functions that Python's file object uses
complicate things by adding a buffer so they can do things like
readline() with reasonable efficiency.
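
If you still want lines, you can do the buffering yourself on top of
select() and os.read(). Here is a minimal sketch of that, assuming UNIX
and Python 3 bytes; iter_lines and its bufsize parameter are names made
up for this example, not anything from the library.

```python
import os
import select


def iter_lines(fd, bufsize=4096):
    """Yield lines from a file descriptor using only select() and os.read()."""
    pending = b""
    while True:
        select.select([fd], [], [])     # block until the descriptor is ready
        chunk = os.read(fd, bufsize)    # returns what is there, up to bufsize
        if not chunk:                   # empty result means end of file
            if pending:
                yield pending           # last line had no trailing newline
            return
        pending += chunk
        # Everything before the final newline is complete lines;
        # the remainder stays in our own buffer for the next round.
        *lines, pending = pending.split(b"\n")
        for line in lines:
            yield line + b"\n"


if __name__ == "__main__":
    r, w = os.pipe()
    os.write(w, b"one\ntwo\nthr")
    os.close(w)
    print(list(iter_lines(r)))
    os.close(r)
```

Since the leftover bytes are kept in a variable you own, there is never
any doubt about whether unread data is waiting.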

|> I have lost track of what kind of devices we're actually talking about -
|> I'm seeing the word "socket", but then what looks like popen2.Popen3.
|> Note that in Python, sockets are normally socket objects, with recv()
|> methods etc., but these are unbuffered and recv() is like os.read().
|> Pipes (as created by Popen3) are either integer file descriptors or
|> file objects, there isn't any special system level pipe object.
|> As long as you're on UNIX, there's a certain purity of abstraction here
|> that you can exploit if you want it just to simplify matters - you can
|> get the file descriptor with s.fileno(), and you can use os.read() with
|> that - or you can use sock.recv() if you prefer.
|
| The objects from my original post are popen2.Popen3 objects, as you
| noticed.  The Python lib manual says that os.read() should not be used
| with file objects, even though I can get a file descriptor from them
| with fileno().  Will it work?

Sure.  Look at Popen3 (it's implemented in Python.)  It's working
with file descriptors to start with, then creates file objects from
them to return to the caller.  You're just getting back the file
descriptors.  Just don't alternate between reading through the file
object and doing system I/O on its file descriptor.  Whatever the file
object has already pulled into its buffer is invisible to select() and
os.read().
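
As a rough sketch of that pattern: popen2 is gone from current Python,
so this assumes subprocess.Popen as a stand-in for Popen3, and UNIX for
select() on a pipe. The function name read_child_output is hypothetical.
The file object is used only to get the descriptor and is never read
from.

```python
import os
import select
import subprocess
import sys


def read_child_output(argv):
    """Collect a child's stdout through its file descriptor only."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    fd = proc.stdout.fileno()       # take the descriptor, ignore the file object

    chunks = []
    while True:
        select.select([fd], [], [])  # wait until the pipe is readable
        chunk = os.read(fd, 4096)
        if not chunk:                # empty result: the child closed the pipe
            break
        chunks.append(chunk)

    proc.stdout.close()
    proc.wait()
    return b"".join(chunks)


if __name__ == "__main__":
    print(read_child_output([sys.executable, "-c", "print('hello')"]))
```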

I believe you don't have to worry about the file object going out of
scope and getting closed while you're ignoring it.  That would close
your file descriptor too, but it won't happen because the popen2 module
holds a reference to the Popen3 instance.  If you ever have that problem,
you can dup() a new file descriptor that will survive the file object.
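
For what it's worth, the dup() trick looks something like this. It is
only a sketch, assuming UNIX and Python 3, with os.pipe() standing in
for the child's pipe and a made-up function name.

```python
import os


def dup_survives_close():
    """Show that a dup()ed descriptor outlives the file object it came from."""
    r, w = os.pipe()
    f = os.fdopen(r, "rb")

    fd = os.dup(f.fileno())     # second descriptor for the same pipe
    f.close()                   # closes the original descriptor only

    os.write(w, b"still here\n")
    data = os.read(fd, 1024)    # the duplicate is still open and readable

    os.close(fd)
    os.close(w)
    return data


if __name__ == "__main__":
    print(dup_survives_close())
```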

	Donn Cave, donn at u.washington.edu


