[Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

Ron Adam rrr at ronadam.com
Tue Feb 21 00:40:19 CET 2006


Bengt Richter wrote:
> On Sat, 18 Feb 2006 23:33:15 +0100, Thomas Wouters <thomas at xs4all.net> wrote:

> note what base64 really is for. It's essence is to create a _character_ sequence
> which can succeed in being encoded as ascii. The concept of base64 going str->str
> is really a mental shortcut for s_str.decode('base64').encode('ascii'), where
> 3 octets are decoded as code for 4 characters modulo padding logic.

Wouldn't it be...

    obj.encode('base64').encode('ascii')

This would probably also work...

    obj.encode('base64').decode('ascii')  ->  ascii alphabet in unicode

Where the underlying sequence might be ...

    obj -> bytes -> bytes:base64 -> base64 ascii character set

The point is to have the data in a safe to transmit form that can 
survive being encoded and decoded into different forms along the 
transmission path and still be restored at the final destination.

    base64 ascii character set -> bytes:base64 -> original bytes -> obj




* a related note, derived from this and your other post in this thread.

If the str type constructor had an encode argument like the unicode type 
does, along with a str.encoded_with attribute.  Then it might be 
possible to depreciate the .decode() and .encode() methods and remove 
them form P3k entirely or use them as data coders/decoders instead of 
char type encoders.

It could also create a clear separation between character encodings and 
data coding.  The following should give an exception.

   str(str, 'rot13'))

Rot13 isn't a character encoding, but a data coding method.

   data_str.encode('rot13')   # could be ok

But this wouldn't...

   new_str = data_str.encode('latin_1')    # could cause an exception

We'd have to use...

   new_str = str(data_str, 'latin_1')      # New string sub type...


Cheers,
    Ronald Adam



More information about the Python-Dev mailing list