
Andy Robinson wrote:
--- Skip Montanaro <skip@mojam.com> wrote:
    fast/memory-intensive/clear
    slow/memory-conserving/not-as-clear
    fast/memory-conserving/fairly-muddy
Any particular reason that the readline method can't return an iterator that supports __getitem__ and buffers input? (Again, remember this is for py2k, so the potential breakage such a change might cause is a consideration, but not a showstopper.)
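To make the idea concrete, here is a minimal sketch of such an object. It is not an existing API: the class name BufferedLines and its sizehint parameter are made up for illustration. It relies only on the fact that a for loop will call __getitem__ with 0, 1, 2, ... until IndexError is raised, and on readlines() accepting a size hint.

```python
import io


class BufferedLines:
    """Hypothetical wrapper: sequential line access through __getitem__."""

    def __init__(self, fileobj, sizehint=8192):
        self._file = fileobj
        self._sizehint = sizehint
        self._chunk = []   # lines currently held in memory
        self._start = 0    # index of the first line in self._chunk

    def __getitem__(self, index):
        # Only forward, sequential access is supported in this sketch.
        if index < self._start:
            raise ValueError("line %d was already discarded" % index)
        while index >= self._start + len(self._chunk):
            # Drop the old chunk and read roughly sizehint characters more.
            self._start += len(self._chunk)
            self._chunk = self._file.readlines(self._sizehint)
            if not self._chunk:
                raise IndexError(index)   # end of file terminates a for loop
        return self._chunk[index - self._start]


# A tiny sizehint forces several refills even on a three-line file.
lines = list(BufferedLines(io.StringIO("a\nb\nc\n"), sizehint=2))
```

Only one chunk of lines is in memory at a time, so this aims at the fast/memory-conserving corner while keeping the loop itself as clear as "for line in f.readlines()".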
Why not generalize fileinput to do buffering instead?
More generally, Java has the notion of 'stackable streams' - e.g. construct a 'BufferedFile' around a 'File', maybe construct a 'Line-oriented file' around that etc. Each one takes a file-like object as an argument to the constructor. Things you might want to do:
- buffering
- international encoding conversions
- line delimiters other than CR/LF/CRLF
- read/write Python objects (i.e. use pickle/marshal)
- easy interfaces to parsers
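A rough Python sketch of the stacking idiom, covering two items from the list (encoding conversion and a custom delimiter). The class names DecodingReader and RecordReader are hypothetical, invented for this example; each one takes a file-like object in its constructor and exposes a file-like method itself.

```python
import codecs
import io


class DecodingReader:
    """Hypothetical layer: turns a byte stream into a text stream."""

    def __init__(self, stream, encoding):
        self._stream = stream
        self._decoder = codecs.getincrementaldecoder(encoding)()

    def read(self, size=-1):
        while True:
            data = self._stream.read(size)
            # final=True at end of input rejects a truncated byte sequence.
            text = self._decoder.decode(data, final=not data)
            if text or not data:
                return text


class RecordReader:
    """Hypothetical layer: splits a text stream on an arbitrary delimiter."""

    def __init__(self, stream, delimiter):
        self._stream = stream
        self._delimiter = delimiter
        self._pending = ""   # text read but not yet returned

    def readline(self):
        while self._delimiter not in self._pending:
            chunk = self._stream.read(4)   # deliberately small reads
            if not chunk:
                rest, self._pending = self._pending, ""
                return rest                # last record, or "" at end of input
            self._pending += chunk
        record, self._pending = self._pending.split(self._delimiter, 1)
        return record + self._delimiter


# Stack the layers: bytes -> decoded text -> ';'-delimited records.
raw = io.BytesIO("caf\u00e9;th\u00e9;fin".encode("utf-8"))
reader = RecordReader(DecodingReader(raw, "utf-8"), ";")
records = []
while True:
    record = reader.readline()
    if not record:
        break
    records.append(record)
```

Neither layer knows anything about the other beyond the read() method, which is what lets them be combined in any order that makes sense.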
If all goes well we'll have something like this in Python 1.6, at least for the encoding/decoding part of file reading and writing. You basically take a file object and then wrap some StreamCodecs around it to get the functionality you need. Very simple and very intuitive.
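For illustration, this is roughly what the wrapping looks like with the codecs module as it exists in current Python, using codecs.getwriter() and codecs.getreader(). Whether this matches the interface planned at the time of this message is an assumption; the in-memory io.BytesIO object just stands in for a real file.

```python
import codecs
import io

# A byte-oriented "file"; a real file opened in binary mode works the same way.
raw = io.BytesIO()

# Wrap a stream writer around it: text goes in, UTF-8 bytes come out.
writer = codecs.getwriter("utf-8")(raw)
writer.write("na\u00efve\n")
encoded = raw.getvalue()

# Rewind and wrap a stream reader around the same object to decode again.
raw.seek(0)
reader = codecs.getreader("utf-8")(raw)
text = reader.read()
```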
This took me a couple of hours to get used to (and at the time I thought 'Yuk!' when I first saw four nested constructors), but it gives you very precise control and a lot of versatility when handling files. It's an idiom Python does not use much, but maybe it should.
I'd argue that maybe some enhancements to fileinput.py - adding some streams to provide building blocks for these operations - would get us the power you want and a lot more versatility besides.