"Martin v. Löwis"
martin at v.loewis.de
Sun Feb 19 19:55:49 CET 2006
Stephen J. Turnbull wrote:
> BTW, what use cases do you have in mind for Unicode -> Unicode
I think "rot13" falls into that category: it is a transformation
on text, not on bytes.
For other "odd" cases: "base64" goes Unicode->bytes in the *decode*
direction, not in the encode direction. Some may argue that base64
is bytes, not text, but in many applications, you can combine base64
(or uuencode) with arbitrary other text in a single stream. Of course,
it could be required that you go u.encode("ascii").decode("base64").
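
To make the two directions concrete, here is a minimal sketch. It assumes a current Python 3, where str.encode() no longer accepts these codecs and the module-level codecs.encode()/codecs.decode() functions are used instead; those calls and the sample strings are illustrative, not part of the original discussion.

```python
import codecs

# rot13 maps text to text: both directions take and return a string.
scrambled = codecs.encode("Hello", "rot_13")
restored = codecs.decode(scrambled, "rot_13")

# base64 produces arbitrary bytes in the *decode* direction. Text that
# carries base64 data is first turned into ASCII bytes, then decoded.
u = "aGVsbG8="
raw = codecs.decode(u.encode("ascii"), "base64")

print(scrambled, restored, raw)
```
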
> def encode-mime-body (string, codec-list):
>     if codec-list not in charset-codec-list:
>         raise NotCharsetCodecException
>     if len (codec-list) > 1 and codec-list[-1] not in transfer-codec-list:
>         raise NotTransferCodecException
>     for codec in codec-list:
>         string = string.encode (codec)
>     return string
>
> mime-body = encode-mime-body ("This is a pen.",
>                               [ 'shift_jis', 'zip', 'base64' ])
I think this is an example where you *should* use the codec API,
as designed. As that apparently requires streams for stacking (i.e.
there is no direct support for stacking codecs), you would have to write
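import codecs, cStringIO

def encode_mime_body(string, codec_list):
    stack = output = cStringIO.StringIO()  # output: where the bytes end up
    for codec in reversed(codec_list):     # last codec wraps the buffer first
        stack = codecs.getwriter(codec)(stack)
    stack.write(string)                    # text enters at the first codec
    return output.getvalue()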
Notice that you have to start the stacking with the last codec,
and you have to keep a reference to the StringIO object where
the actual bytes end up.
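
For readers on a current interpreter, the same stacking idea can be sketched as follows. This is an assumption-laden translation, not the code from this thread: it uses io.BytesIO in place of cStringIO, the codec name "zlib" in place of "zip", and writes the whole body in one call (with several writes, the zlib stream writer would compress each chunk separately).

```python
import codecs
import io


def encode_mime_body(text, codec_list):
    # Keep a reference to the buffer that receives the final bytes.
    output = io.BytesIO()
    stack = output
    # Wrap starting with the last codec, so that the first codec in the
    # list ends up outermost and sees the text first.
    for codec in reversed(codec_list):
        stack = codecs.getwriter(codec)(stack)
    stack.write(text)
    return output.getvalue()


mime_body = encode_mime_body("This is a pen.",
                             ["shift_jis", "zlib", "base64"])
```

Undoing the steps in reverse order (base64, then zlib, then shift_jis) should give back the original text.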
P.S. There is some LISP showing through in your Python code :-)