[Python-Dev] bytes type discussion

Greg Ewing greg.ewing at canterbury.ac.nz
Wed Feb 15 04:34:23 CET 2006

Thomas Wouters wrote:
> The encoding of network streams or files may be
> entirely unknown beforehand, and depend on the content: a content-encoding,
> a <META EQUIV> HTML tag. Will bytes-strings get string methods for easy
> searching of content descriptors?

Seems to me this is a case where you want to be able
to change encodings in the middle of reading the stream.
You start off reading the data as ASCII, and once you've
figured out the encoding, you switch to that and carry
on reading.
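
A rough sketch of that switch-over, under some loud assumptions: the
function name read_switching_encoding and the charset regex are made up
for illustration, the declaration is assumed to sit on a line that is
itself plain ASCII, and the real encoding is assumed to be
ASCII-compatible (so splitting on newline bytes is safe; this would not
work for UTF-16).

```python
import io
import re

# Hypothetical pattern for a charset declaration such as
# 'charset=utf-8' inside a META tag or header line.
CHARSET_RE = re.compile(r"charset=[\"']?([\w-]+)", re.IGNORECASE)

def read_switching_encoding(byte_stream, initial="ascii"):
    """Decode line by line, switching codec once a charset is declared."""
    encoding = initial
    lines = []
    for raw_line in byte_stream:          # yields bytes, one line at a time
        line = raw_line.decode(encoding)  # decode with the current codec
        lines.append(line)
        if encoding == initial:
            match = CHARSET_RE.search(line)
            if match:
                # Every later line is decoded with the declared codec.
                encoding = match.group(1)
    return encoding, "".join(lines)
```

Anything with an unknown codec name or a non-ASCII byte before the
declaration would raise here; a real reader would need a policy for that.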

Are there any plans to make it possible to change the
encoding of a text file object on the fly like this?

If that would be awkward, maybe file objects themselves
shouldn't be where the decoding occurs, but decoders
should be separate objects that wrap byte streams.
Under that model,

   opentext(filename, encoding)

would be a factory function that did something like

   codecs.streamdecoder(encoding, openbinary(filename))
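
For comparison, something close to this can be approximated with the
existing codecs module. This is only a sketch: opentext is the
hypothetical name from above, and codecs.getreader() plus a plain binary
open() stand in for the proposed streamdecoder and openbinary.

```python
import codecs

def opentext(filename, encoding):
    """Return a text reader that decodes a binary file on the fly."""
    binary_file = open(filename, "rb")  # the raw byte stream
    # codecs.getreader() returns a StreamReader class for the encoding;
    # instantiating it wraps the byte stream in a decoder.
    return codecs.getreader(encoding)(binary_file)
```

The returned reader can be used in a with statement, which closes the
underlying binary file on exit.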

Having codecs be stream filters might be a good idea
anyway, since then you could use them to wrap anything
that can be treated as a stream of bytes (sockets,
some custom object in your program, etc.). You could
also create pipelines of encoders and decoders.
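
As an illustration of such a pipeline (a sketch only; decode_pipeline is
an invented name, and gzip is just one example of a byte-level filter),
an in-memory byte stream can be passed through a decompressor and then a
text decoder, each stage wrapping the one before it:

```python
import codecs
import gzip
import io

def decode_pipeline(byte_stream, encoding="utf-8"):
    """Chain two filters: gzip decompression, then text decoding."""
    # Stage 1: a bytes-to-bytes filter wrapping the raw stream.
    decompressed = gzip.GzipFile(fileobj=byte_stream, mode="rb")
    # Stage 2: a bytes-to-text decoder wrapping stage 1.
    return codecs.getreader(encoding)(decompressed)

# Any object with a read() method returning bytes would do as the source;
# io.BytesIO stands in for a socket or custom object here.
raw = io.BytesIO(gzip.compress("na\u00efve caf\u00e9\n".encode("utf-8")))
text = decode_pipeline(raw).read()
```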

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | Carpe post meridiam!                 |
Christchurch, New Zealand          | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz     +--------------------------------------+
