[Python-3000] Reversing through text files with the new IO library
Mark Russell
mark.russell at zen.co.uk
Mon Mar 12 21:10:17 CET 2007
On 12 Mar 2007, at 17:56, Guido van Rossum wrote:
> Thanks! This is a very interesting idea, I'd like to keep this
> around somehow.
Thanks for the positive feedback - much appreciated.
> I also see that you noticed a problem with text I/O in the current
> design; there's no easy way to implement readline() efficiently. I
> want readline() to be as efficient as possible -- "for line in <file>"
> should *scream*, like it does in 2.x.
Yes, I suspect that BufferedReader needs some kind of readuntil()
method, so that (at least for sane encodings like utf-8) each line is
read via a single readuntil() followed by a decode() call for the
entire line.
Maybe something like this (although the only way to be sure is to
experiment):
    line, endindex = buffer.readuntil(line_endings)

        Read until we see one of the byte strings in line_endings,
        which is a sequence of one or more byte strings. If there are
        multiple line endings with a common prefix, use the longest.
        Return the line complete with the ending, with endindex being
        the index within line of the line ending (or None if EOF was
        encountered).
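
To make the proposed semantics concrete, here is a rough pure-Python sketch. It is not the io.py implementation: the class SketchBufferedReader, its chunk_size parameter, the ENDINGS tuple and the readline() helper are all hypothetical names made up for illustration, and the raw stream is assumed to be blocking, so that an empty read means EOF.

```python
import io

# Hypothetical default set of line endings, used only for this sketch.
ENDINGS = (b"\r\n", b"\r", b"\n")


class SketchBufferedReader:
    """Illustrative stand-in for BufferedReader; only readuntil() is sketched."""

    def __init__(self, raw, chunk_size=8192):
        self._raw = raw
        self._chunk_size = chunk_size
        self._pending = b""  # bytes read from raw but not yet returned
        self._eof = False

    def _fill(self):
        # Assumes a blocking raw stream: an empty read means EOF.
        chunk = self._raw.read(self._chunk_size)
        if chunk:
            self._pending += chunk
        else:
            self._eof = True

    def readuntil(self, line_endings):
        """Return (line, endindex) as described in the proposal above."""
        longest = max(len(e) for e in line_endings)
        while True:
            # Positions at which each ending first occurs (-1 if absent).
            # Rescanning from 0 each time is simple but not optimal.
            hits = [i for i in (self._pending.find(e) for e in line_endings)
                    if i >= 0]
            if hits:
                start = min(hits)
                # Only decide once there is enough lookahead (or EOF) to
                # tell b"\r" apart from b"\r\n" at this position.
                if self._eof or len(self._pending) - start >= longest:
                    ending = max((e for e in line_endings
                                  if self._pending.startswith(e, start)),
                                 key=len)
                    stop = start + len(ending)
                    line = self._pending[:stop]
                    self._pending = self._pending[stop:]
                    return line, start
            elif self._eof:
                # No ending before EOF: return whatever is left.
                line, self._pending = self._pending, b""
                return line, None
            self._fill()


def readline(reader, encoding="utf-8", line_endings=ENDINGS):
    # One readuntil() followed by one decode() for the entire line.
    line, _ = reader.readuntil(line_endings)
    return line.decode(encoding)


# A tiny chunk size forces b"\r\n" to straddle two raw reads.
reader = SketchBufferedReader(io.BytesIO(b"one\r\ntwo\rthree\nlast"),
                              chunk_size=4)
print(reader.readuntil(ENDINGS))  # (b'one\r\n', 3)
print(reader.readuntil(ENDINGS))  # (b'two\r', 3)
print(readline(reader))           # three (followed by a newline)
print(reader.readuntil(ENDINGS))  # (b'last', None)
```

Waiting for len(longest ending) bytes of lookahead before committing is the simplest way to honour the "use the longest" rule; a real implementation would probably only wait when the match is a proper prefix of a longer ending, and would do the scan in C.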
Is anyone working on io.py btw? If not I'd be willing to put some
time into it. I guess the todo list is something like this:
- Finish off the Python prototypes in io.py (using and maybe tweaking
  the API spec)
- Get unit tests working with __builtin__.open = io.open
- Profile and optimize (e.g. by selective conversion to C)
Mark