[Python-3000] On PEP 3116: new I/O base classes
Bill Janssen
janssen at parc.com
Thu Jun 21 02:33:45 CEST 2007
Daniel Stutzbach wrote:
> On 6/20/07, Bill Janssen <janssen at parc.com> wrote:
> > > Ah, not everyone dealing with text is dealing with line-delimited
> > > text, you know...
> >
> > It's really the only difference between text and non-text.
>
> Text is a sequence of characters. Non-text is a sequence of bytes.
> Characters may be multi-byte. It is no longer an ASCII world.
Yes, of course, Daniel, but I was speaking of the contents of files,
and files are inherently sequences of bytes. If we are talking about
some layer which interprets the contents of a file, just saying "give
me N characters" isn't enough. We need to say, "N characters assuming
a text encoding of M, with a normalization policy of Q, and a newline
policy of R". If we don't, we can't just "read" N characters safely.
So I think it's broken to put this in the TextIOBase class; instead,
there should be some wrapper class that does buffering and can be
configured as to (M, Q, R).
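
To illustrate the point, here is a minimal sketch of what "read N characters" looks like once the policy is spelled out. It assumes the present-day Python `io` and `unicodedata` modules, which postdate this message; the helper name `read_chars` and its parameters are hypothetical, not part of PEP 3116. Note that normalization is applied after the read here, which is itself one of several possible policies.

```python
import io
import unicodedata

def read_chars(raw, n, encoding, normalization=None, newline=None):
    """Read up to n characters from raw bytes under an explicit policy.

    encoding      -- the text encoding (M), e.g. "utf-8"
    normalization -- the Unicode normalization form (Q), e.g. "NFC", or None
    newline       -- the newline policy (R), as accepted by io.TextIOWrapper
    """
    # Wrap the byte stream in a buffering text layer configured with M and R.
    text = io.TextIOWrapper(io.BytesIO(raw), encoding=encoding, newline=newline)
    chunk = text.read(n)
    # Apply Q after decoding; this can change the number of characters.
    if normalization is not None:
        chunk = unicodedata.normalize(normalization, chunk)
    return chunk

# The same six bytes: 'e' + combining acute accent, CRLF, 'x' (as UTF-8).
raw = "e\u0301\r\nx".encode("utf-8")

for policy in [
    dict(n=2, encoding="utf-8"),
    dict(n=2, encoding="utf-8", normalization="NFC"),
    dict(n=3, encoding="utf-8", newline=None),   # CRLF translated to "\n"
    dict(n=3, encoding="utf-8", newline=""),     # line endings left untouched
    dict(n=2, encoding="latin-1"),
]:
    print(policy, "->", ascii(read_chars(raw, **policy)))
```

The same bytes and the same N give different answers under each policy, which is why the (M, Q, R) configuration has to live somewhere explicit.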
Bill