[Tutor] help on StringIO

Tue May 27 00:33:41 CEST 2008

> On Mon, May 26, 2008 at 12:53 PM, inhahe <inhahe at gmail.com> wrote:
>> I guess that makes sense, but it seems a lot more useful to have a
>> function that only reads what's been written since the last time
>> you read.

The file model in Python (and most programming languages)
is based on the early tape drive model. You have a magnetic
head that can read or write at the current point on the tape.
You have to rewind the tape if you want to play back what
you just recorded. If you do that and then read it, you will be
ready to record the next section. Yes you do need to store
the starting point each time so that you can seek back to
it (rewind), but it all works very logically if you bear in mind
the original tape model.
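
As a rough illustration of the tape model, here is a minimal sketch
using io.StringIO from Python 3 (an assumption; the 2008 thread would
have used the older StringIO module). The helper name read_new and the
variable names are made up for this example. It stores the starting
point, rewinds to it, plays back, and leaves the head at the end ready
for the next write:

```python
import io

def read_new(buf, last_pos):
    """Return (text written since last_pos, new position to remember)."""
    buf.seek(last_pos)        # rewind to where we stopped reading last time
    data = buf.read()         # play back; the head ends up at the end again
    return data, buf.tell()   # remember the new starting point

tape = io.StringIO()
mark = 0                      # the stored starting point

tape.write("first ")
chunk1, mark = read_new(tape, mark)

tape.write("second")          # head is at the end, so this appends
chunk2, mark = read_new(tape, mark)
```

Each call returns only what was written since the previous call, which
is roughly the behaviour asked for, built on top of the ordinary
seek/read/tell operations.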

>> That way reading from a file that's growing would be
>> transparent--the same as reading it as if it were already full size.

But you would need two record heads and an infinitely
long tape player (or triple loop buffer mechanism but
that's getting technical! :-). The model may be outdated
- that's why we use databases - but that is how the
model works.

>> It seems with this read function if you were copying a file, because
>> the file might be growing you'd have to do this:

Copying a file is no problem because you are reading from
one file and writing to another. It would only be a problem
if you tried to read back what you had just copied between
each write. (Sometimes we do that - it's often called validated
write - and it slows down the copy process precisely
because of all the to-ing and fro-ing involved.)
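
To make the difference concrete, here is a hedged sketch of a chunked
copy with an optional read-back after each write. The function name
copy_stream and its parameters are invented for illustration, and
io.BytesIO objects stand in for real files (a real destination file
would need to be opened in a read/write mode such as "w+b" for the
read-back to work):

```python
import io

def copy_stream(src, dst, chunk_size=4096, validate=False):
    """Copy src to dst in chunks; optionally re-read each chunk after writing."""
    total = 0
    while True:
        chunk = src.read(chunk_size)      # read head advances through src
        if not chunk:
            break                         # end of the source reached
        start = dst.tell()                # remember where this chunk begins
        dst.write(chunk)
        if validate:
            dst.seek(start)               # rewind over what we just wrote
            if dst.read(len(chunk)) != chunk:
                raise OSError("read-back mismatch")
            # after the read the head is back at the end, ready to write
        total += len(chunk)
    return total

source = io.BytesIO(b"x" * 10000)
target = io.BytesIO()
copied = copy_stream(source, target, validate=True)
```

With validate=False there is no seeking at all on the destination; the
extra seek and read per chunk is the to-ing and fro-ing that makes the
validated version slower.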

>> otherwise your copy process would quit as soon as any new data is
>> written, because the file position would jump to the end of the file.

When you are writing sequentially, the file position is always at
the end of the file. You add data in sequence, that's why they are
called sequential files.

>> What i'm wondering about specifically is WSGI.  it's supposed to
>> return a file-like object, wsgi.input, for reading the http request.
>> and if you read beyond what's currently received, it's supposed to
>> block until more is received.

Which is what you'd want. It reads whatever is on the tape but
if it reaches the end it stops and waits for more data to arrive.
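
The blocking behaviour can be sketched without any WSGI machinery. The
example below is only an assumption-laden stand-in: it uses an
operating system pipe (os.pipe) in place of wsgi.input, and a helper
thread named delayed_writer (a made-up name) in place of the network
delivering more of the request:

```python
import os
import threading
import time

def delayed_writer(fd, payload, delay):
    """Pretend to be the network: deliver payload after a short delay."""
    time.sleep(delay)
    os.write(fd, payload)
    os.close(fd)

read_fd, write_fd = os.pipe()
writer = threading.Thread(target=delayed_writer,
                          args=(write_fd, b"hello", 0.2))

start = time.monotonic()
writer.start()
data = os.read(read_fd, 1024)   # nothing there yet, so this waits
elapsed = time.monotonic() - start

writer.join()
os.close(read_fd)
```

The read does not return an empty result when it finds nothing; it
waits until the writer has put something at the end of the stream.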

>> and even then i suppose the WSGI server had better not write anything
>> to the stream before it's all read up otherwise it'll overwrite to
>> some position in the middle of the stream

I know little or nothing about WSGI specifics but streams in general
will always write to the end of the buffer. The fact that the buffer
behaves like a file-like object implies that you can read input as
if it were a file but the arriving data will continuously be dumped
at the end. I don't think that should cause a problem. It's a
difference between a data stream and a normal file: you can't
rewind a stream. It looks like WSGI has two modes operating
at once: a data stream model for receiving the data and a
file-like model for reading that data out.
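
A small sketch of the stream versus file distinction, again using a
pipe as an assumed stand-in for a network stream (this says nothing
about how any particular WSGI server implements wsgi.input). Both
objects offer read(), but only the in-memory file can be rewound:

```python
import io
import os

read_fd, write_fd = os.pipe()
stream = os.fdopen(read_fd, "rb")      # file-like wrapper around a stream
tape = io.BytesIO(b"some bytes")       # an ordinary in-memory file

stream_can_rewind = stream.seekable()  # a pipe has no position to seek to
tape_can_rewind = tape.seekable()

os.write(write_fd, b"abc")             # arriving data goes on the end
os.close(write_fd)                     # writer finished: reader will see EOF
first = stream.read()                  # read it out with the file-like API
stream.close()
```

Code that only calls read() works with either object, which is the
point of the file-like interface; code that relies on seek() only
works with the real file.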

>> where's the file class that works in a way that makes sense?)

Files work like a tape recorder. Once you understand the underlying
sequential model they do make sense. They just might not be the
way you would want them to be! (NB There are other file models
in computing, such as ISAM, which work differently, but sequential
is by far the dominant model and the one used in Python.)

HTH,

-- 
Alan Gauld
Author of the Learn to Program web site
http://www.freenetpages.co.uk/hp/alan.gauld