[Python-Dev] bytes.from_hex()

"Martin v. Löwis" martin at v.loewis.de
Sun Feb 19 19:55:49 CET 2006


Stephen J. Turnbull wrote:
> BTW, what use cases do you have in mind for Unicode -> Unicode
> decoding?

I think "rot13" falls into that category: it is a transformation
on text, not on bytes.
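
To make the text-to-text case concrete, here is a minimal sketch. It assumes a Python 3 interpreter, where codecs.encode() and codecs.decode() accept the "rot_13" codec with str on both sides; the variable names are illustrative only.

```python
import codecs

# rot_13 maps text to text: str goes in, str comes out.
text = "Hello, World"
scrambled = codecs.encode(text, "rot_13")
restored = codecs.decode(scrambled, "rot_13")

print(scrambled)  # Uryyb, Jbeyq
print(restored)   # Hello, World
```

Neither direction ever produces bytes, which is why rot13 does not fit a strict "encode means text to bytes" model.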

For other "odd" cases: "base64" goes Unicode->bytes in the *decode*
direction, not in the encode direction. Some may argue that base64
is bytes, not text, but in many applications, you can combine base64
(or uuencode) with arbitrary other text in a single stream. Of course,
it could be required that you go u.encode("ascii").decode("base64").
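
As a rough illustration of that two-step route, the sketch below assumes Python 3 and uses the standard base64 module in place of a "base64" codec name; the sample string is made up for the example.

```python
import base64

# Base64 data that arrived as part of a text (Unicode) stream.
u = "SGVsbG8="

# Step 1: text -> ASCII bytes. Step 2: base64 *decode* -> raw bytes.
ascii_bytes = u.encode("ascii")
raw = base64.b64decode(ascii_bytes)

print(raw)  # b'Hello'
```

The direction that yields the payload bytes is the decode step, which is the oddity described above.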

>     def encode-mime-body (string, codec-list):
>         if codec-list[0] not in charset-codec-list:
>             raise NotCharsetCodecException
>         if len (codec-list) > 1 and codec-list[-1] not in transfer-codec-list:
>             raise NotTransferCodecException
>         for codec in codec-list:
>             string = string.encode (codec)
>         return string
> 
>     mime-body = encode-mime-body ("This is a pen.",
>                                   [ 'shift_jis', 'zip', 'base64' ])

I think this is an example where you *should* use the codec API,
as designed. As that apparently requires streams for stacking (i.e.
there is no direct support for codec stacking), you would have to write

import codecs
import cStringIO

def encode_mime_body(string, codec_list):
    # output collects the final bytes; stack ends up as the outermost writer.
    stack = output = cStringIO.StringIO()
    for codec in reversed(codec_list):
        stack = codecs.getwriter(codec)(stack)
    stack.write(string)
    stack.reset()
    return output.getvalue()

Notice that you have to start the stacking with the last codec,
and you have to keep a reference to the StringIO object where
the actual bytes end up.
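
For comparison, here is a non-streaming sketch of the same pipeline that simply applies each codec in turn. It is only an assumption-laden illustration, not the stream-based API discussed above: it assumes Python 3, splits the charset from the transfer codecs (a hypothetical signature), and substitutes the real "zlib_codec" for the 'zip' name in the quoted example.

```python
import codecs

def encode_mime_body(text, charset, transfer_codecs):
    # First step: text -> bytes using the charset codec.
    data = text.encode(charset)
    # Remaining steps: bytes -> bytes, applied in list order.
    for name in transfer_codecs:
        data = codecs.encode(data, name)
    return data

body = encode_mime_body("This is a pen.", "shift_jis",
                        ["zlib_codec", "base64_codec"])

# Undo the steps in reverse order to recover the original text.
unpacked = codecs.decode(codecs.decode(body, "base64_codec"), "zlib_codec")
print(unpacked.decode("shift_jis"))  # This is a pen.
```

Here the codecs are applied first to last, whereas the stream version has to be built starting from the last codec.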

Regards,
Martin

P.S. Some LISP shows through in your Python code :-)

