[Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

Bob Ippolito bob at redivi.com
Fri Feb 17 10:50:15 CET 2006


On Feb 16, 2006, at 9:20 PM, Josiah Carlson wrote:

>
> Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>>
>> Josiah Carlson wrote:
>>
>>> They may not be encodings of _unicode_ data,
>>
>> But if they're not encodings of unicode data, what
>> business do they have being available through
>> someunicodestring.encode(...)?
>
> I had always presumed that bytes objects are going to be able to be a
> source for encode AND decode, like current non-unicode strings are able
> to be today.  In that sense, if I have a bytes object which is an
> encoding of rot13, hex, uu, etc., or I have a bytes object which I would
> like to be in one of those encodings, I should be able to do
> b.encode(...) or b.decode(...), given that 'b' is a bytes object.
>
> Are 'encodings' going to become a mechanism to encode and decode
> _unicode_ strings, rather than a mechanism to encode and decode _text
> and data_ strings?  That would seem like a backwards step to me, as the
> email package would need to package their own base-64 encode/decode API
> and implementation, and similarly for any other package which uses any
> one of the encodings already available.

It would be VERY useful to separate the two concepts.  bytes<->bytes  
transforms should be one function pair, and bytes<->text transforms  
should be another.  The current situation is totally insane:
	
	str.decode(codec) -> str or unicode or UnicodeDecodeError or ZlibError or TypeError... who knows what else
	str.encode(codec) -> str or unicode or UnicodeDecodeError or TypeError... probably other exceptions

Granted, unicode.encode(codec) and unicode.decode(codec) are actually  
somewhat sane in that the return type is always a str and the  
exceptions are either UnicodeEncodeError or UnicodeDecodeError.
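
To make the proposed separation concrete, here is a minimal sketch of two distinct function pairs, one for bytes<->bytes and one for bytes<->text. It is written against Python 3's `bytes` and `str` types, which postdate this message, and the names `transform`, `untransform`, `encode_text`, `decode_text` and the `_TRANSFORMS` table are hypothetical, chosen only for illustration. It is not a proposed API.

```python
import binascii
import zlib

# Hypothetical registry of bytes<->bytes transforms: name -> (forward, inverse).
_TRANSFORMS = {
    "hex": (binascii.hexlify, binascii.unhexlify),
    "base64": (binascii.b2a_base64, binascii.a2b_base64),
    "zlib": (zlib.compress, zlib.decompress),
}


def transform(data, name):
    """bytes -> bytes. Rejects anything that is not bytes."""
    if not isinstance(data, bytes):
        raise TypeError("transform() expects bytes")
    forward, _inverse = _TRANSFORMS[name]
    return forward(data)


def untransform(data, name):
    """bytes -> bytes, inverse of transform()."""
    if not isinstance(data, bytes):
        raise TypeError("untransform() expects bytes")
    _forward, inverse = _TRANSFORMS[name]
    return inverse(data)


def encode_text(text, encoding):
    """text -> bytes. Rejects anything that is not text."""
    if not isinstance(text, str):
        raise TypeError("encode_text() expects text")
    return text.encode(encoding)


def decode_text(data, encoding):
    """bytes -> text, inverse of encode_text()."""
    if not isinstance(data, bytes):
        raise TypeError("decode_text() expects bytes")
    return data.decode(encoding)
```

With this split, each function has exactly one input type and one return type, so a caller of `transform(b"hi", "hex")` can rely on getting bytes back, and passing text to a bytes<->bytes transform fails immediately with a TypeError instead of going through an implicit conversion.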

I think that rot13 is the only conceptually text<->text transform
(though the current implementation is really bytes<->bytes);
everything else is either bytes<->text or bytes<->bytes.
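
As an illustration of rot13 treated as a text<->text transform, here is a short sketch assuming Python 3 (which postdates this message), where the standard `codecs` module exposes a `rot_13` codec that maps text to text. The helper name `rot13` is hypothetical.

```python
import codecs


def rot13(text):
    """text -> text: rotate ASCII letters by 13 places, leave the rest alone."""
    if not isinstance(text, str):
        raise TypeError("rot13() expects text")
    return codecs.encode(text, "rot_13")


scrambled = rot13("Hello, world")
restored = rot13(scrambled)  # rot13 is its own inverse
```

Because the transform is its own inverse, a single function covers both directions, and the result is always text.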

-bob


