[Python-ideas] Add encoding attribute to bytes

Stephen J. Turnbull stephen at xemacs.org
Fri Nov 6 05:18:11 CET 2009


MRAB writes:

 > You said "give an immutable instance of a builtin class a mutable
 > attribute slot". Why would the slot be mutable?

I think the idea is that in many cases you won't know what the
encoding is until after you've read the bytes.
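
To make that concrete, here is a rough sketch of what "tag it later"
could look like.  TaggedBytes and its text() method are hypothetical
names invented for illustration; this is a plain subclass, not the
attribute-on-builtin-bytes that the proposal actually asks for.

```python
class TaggedBytes(bytes):
    """Hypothetical bytes subclass carrying an optional encoding tag."""

    # Unknown at read time; filled in once something identifies it.
    encoding = None

    def text(self):
        """Decode using the tag, refusing to guess if there is none."""
        if self.encoding is None:
            raise ValueError("encoding not known yet")
        return self.decode(self.encoding)


# Read first, with no idea what the bytes mean ...
data = TaggedBytes(b"caf\xc3\xa9")
assert data.encoding is None

# ... and tag afterwards.  The content stays immutable; only the
# annotation changes, which is why the slot would need to be mutable.
data.encoding = "utf-8"
decoded = data.text()
```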

But I don't really see this idea as that useful either way.  The
obvious use case for me would be in the email module.  So you read in
a message and create a bytes object, which you stash away for later
use as necessary.  The header and the body, each MIME part, each MIME
part header and payload, and so on recursively are identified as
slices of the BigBytesObject you read in at the beginning, which is
implicitly a binary blob and doesn't need an encoding (strike one).
Each header identifies the encoding (which here would have to refer
ambiguously to the Content-Type charset or to the
Content-Transfer-Encoding, strike two)
of the corresponding payload.  And you'll need to deal with cases
where Content-Type and Content-Transfer-Encoding are both relevant,
strike three.  You may as well keep the various layers of encoding
explicitly in email-specific objects, so use case: email strikes out.
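
A small illustration of the two layers, as a sketch only.  It uses the
standard library's email package as it exists in current Python 3
(message_from_bytes was added after this post was written), and the
raw message is made up for the example.

```python
import base64
from email import message_from_bytes

# Made-up message: Latin-1 text, wrapped in base64 for transport.
RAW = (
    b"From: someone@example.org\r\n"
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: text/plain; charset="iso-8859-1"\r\n'
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    + base64.b64encode("caf\u00e9".encode("iso-8859-1"))
    + b"\r\n"
)

msg = message_from_bytes(RAW)

# Two different things that could both be called "the encoding":
cte = msg["Content-Transfer-Encoding"]   # bytes -> bytes layer
charset = msg.get_content_charset()      # bytes -> text layer

# Undo the transfer encoding first, then apply the charset.
payload_bytes = msg.get_payload(decode=True)
text = payload_bytes.decode(charset)
```

A single encoding attribute on the payload bytes could hold only one of
cte and charset, and both are needed to get from the wire form to
text.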

That's only one use case, of course.  But we can see what a use case
would have to look like: you read in a bytes object, just enough to
enable you to accurately parse the rest of the stream in the same way
and tag each bytes part with an appropriate encoding.  What are the
use cases that fit that description?



