[Python-3000] Reversing through text files with the new IO library

Mark Russell mark.russell at zen.co.uk
Mon Mar 12 21:10:17 CET 2007


On 12 Mar 2007, at 17:56, Guido van Rossum wrote:
> Thanks! This is a very interesting idea, I'd like to keep this  
> around somehow.

Thanks for the positive feedback - much appreciated.

> I also see that you noticed a problem with text I/O in the current
> design; there's no easy way to implement readline() efficiently. I
> want readline() to be as efficient as possible -- "for line in <file>"
> should *scream*, like it does in 2.x.

Yes, I suspect that BufferedReader needs some kind of readuntil()  
method, so that (at least for sane encodings like utf-8) each line is  
read via a single readuntil() followed by a decode() call for the  
entire line.
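
A quick check of why the encoding matters here, as a sketch rather than anything from io.py: in UTF-8 the byte 0x0A only ever means a newline, whereas in UTF-16 it can appear as half of an unrelated character, so a byte-level search for line endings is only safe in the former. The character below is just one convenient example.

```python
ch = "\u010a"  # LATIN CAPITAL LETTER C WITH DOT ABOVE, code point 0x010A

utf8 = ch.encode("utf-8")        # two bytes, both >= 0x80
utf16 = ch.encode("utf-16-le")   # low byte of the code unit is 0x0A

# A byte-level scan for b"\n" is unambiguous in UTF-8 ...
print(b"\n" in utf8)    # False
# ... but would report a bogus line ending in UTF-16.
print(b"\n" in utf16)   # True
```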

Maybe something like this (although the only way to be sure is to  
experiment):

     line, endindex = buffer.readuntil(line_endings)

     Read until we see one of the byte strings in line_endings, which is
     a sequence of one or more byte strings.  If there are multiple line
     endings with a common prefix, use the longest.  Return the line
     complete with the ending, with endindex being the index within line
     of the line ending (or None if EOF was encountered).
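
To make the proposed semantics concrete, here is a minimal pure-Python sketch. It is not code from io.py: the UntilReader class, its chunk_size parameter and its internal buffering are all assumptions made for illustration, and it assumes line_endings is a non-empty sequence of non-empty byte strings. It picks the earliest line ending and, when several start at the same offset, the longest one.

```python
import io


class UntilReader:
    """Hypothetical wrapper that adds readuntil() to a binary stream."""

    def __init__(self, raw, chunk_size=8192):
        self._raw = raw
        self._chunk_size = chunk_size
        self._pending = bytearray()   # read from raw, not yet handed out
        self._eof = False

    def readuntil(self, line_endings):
        # Longest first, so a tie at the same offset goes to the longest.
        endings = sorted(line_endings, key=len, reverse=True)
        maxlen = len(endings[0])
        while True:
            best = None   # (offset, ending) of the earliest match so far
            for ending in endings:
                i = self._pending.find(ending)
                if i != -1 and (best is None or i < best[0]):
                    best = (i, ending)
            # A match is final only once maxlen bytes from its offset are
            # visible (or we hit EOF); otherwise a longer ending such as
            # b"\r\n" might still be completed by the next chunk.
            if best is not None and (
                self._eof or best[0] + maxlen <= len(self._pending)
            ):
                start, ending = best
                end = start + len(ending)
                line = bytes(self._pending[:end])
                del self._pending[:end]
                return line, start
            if self._eof:
                # No ending before EOF: return whatever is left.
                line = bytes(self._pending)
                self._pending.clear()
                return line, None
            data = self._raw.read(self._chunk_size)
            if data:
                self._pending += data
            else:
                self._eof = True


# Tiny chunk size so that b"\r\n" straddles a chunk boundary.
reader = UntilReader(io.BytesIO(b"one\r\ntwo\rthree\nfour"), chunk_size=4)
endings = [b"\r", b"\n", b"\r\n"]
results = [reader.readuntil(endings) for _ in range(5)]
for line, endindex in results:
    print(line, endindex)
```

With this shape, a text-level readline() for UTF-8 would amount to one readuntil() call followed by line.decode("utf-8"). The sketch rescans the pending buffer from the start on every pass, which a real implementation would want to avoid.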

Is anyone working on io.py btw?  If not I'd be willing to put some  
time into it.  I guess the todo list is something like this:

     - Finish off the Python prototypes in io.py (using and maybe
       tweaking the API spec)

     - Get unit tests working with __builtin__.open = io.open

     - Profile and optimize (e.g. by selective conversion to C)

Mark


