[Python-ideas] Adding 'bytes' as alias for 'latin_1' codec.

Ethan Furman ethan at stoneleaf.us
Wed Jun 1 19:58:12 CEST 2011


Terry Reedy wrote:
> On 6/1/2011 12:34 PM, Bill Janssen wrote:
> 
>> IMO, the thing that bit us on the fundament with the 2.x str/unicode
>> divide, and continues to bite us with the 3.x str/bytes divide is that
>> we don't carry the encoding as part of the 2.x 'str' value (or as part
>> of the 3.x 'bytes' value).  The key here is to store the encoding
>> internally in the string object, so that it's available to do automatic
>> coercion when necessary, rather than *requiring* all coercions to be
>> done manually by some program code.
> 
> Some time ago, I posted here a proposal to do just that -- add an 
> encoding field to byte strings (or, I believe, add a new class). It was 
> horribly shot down. Something like 'conceptually wrong, some bytes have 
> 0 or multiple encodings, can just use an attribute or tuple, don't need 
> it'.
> 
A byte stream with multiple encodings?  Now *that* seems wrong!

The no-encoding case could also be handled by setting the encoding
field to some special value indicating Unknown.
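
A rough sketch of what such a field might look like follows. The class
name TaggedBytes, the to_str() method, and the choice of None as the
Unknown marker are all invented here for illustration; none of them is
an existing or proposed API.

```python
class TaggedBytes(bytes):
    """Hypothetical bytes subclass that carries an encoding field."""

    UNKNOWN = None  # assumed special value meaning "encoding not known"

    def __new__(cls, data, encoding=UNKNOWN):
        # bytes is immutable, so the value is fixed in __new__;
        # the encoding is stored as an ordinary instance attribute.
        self = super().__new__(cls, data)
        self.encoding = encoding
        return self

    def to_str(self):
        # Only coerce when the encoding is known; otherwise make the
        # caller decode explicitly, as plain bytes require today.
        if self.encoding is self.UNKNOWN:
            raise ValueError("encoding unknown; decode explicitly")
        return self.decode(self.encoding)


known = TaggedBytes(b"caf\xe9", "latin_1")
text = known.to_str()          # decodes using the stored encoding
mystery = TaggedBytes(b"\xff\xfe")  # encoding left as UNKNOWN
```

With this approach the Unknown case behaves like ordinary bytes: nothing
is coerced automatically, and calling to_str() on it raises ValueError.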

~Ethan~


