Quickest marshal.loads from unicode?

Alex Martelli aleax at aleax.it
Fri Feb 21 16:42:41 EST 2003


Giles Brown wrote:

> giles_brown at hotmail.com (Giles Brown) wrote in message
> news:<57de9986.0302210118.4288f436 at posting.google.com>...
>> Problem
>> -------
>> How can I take the result of a marshal.dumps call and decode
>> (encode?) it
>> into a unicode string so that the conversion back to a marshal.loads
>> compatible string is as quick as possible?
> 
> Alex's post has prompted me to give a more accurate description of
> my problem, which is that with encoding of unicode, for instance,
> utf-8 cannot cope with the binary data that you get in a marshalled piece
> of code.  Is there an encoding that will always work
> for any binary data "string" that a marshal.dumps call might generate?

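Latin-1 (iso-8859-1) has this property: it maps each byte value
0-255 to the Unicode code point with the same number, so decoding
never fails and encoding gives the original bytes back.  I think
latin-1 is just a fast implementation of iso-8859-1.  Several other
iso-8859-* codecs also cover all 256 byte values, but not every one
of them does, so latin-1 is the safe choice.  utf-16 would give you
shorter Unicode strings, but it would need the string to have an
even length to start with, and I don't think marshal.dumps can
ensure that, so you'd have to pad and unpad.  Arbitrary byte pairs
can also form unpaired surrogates, which the utf-16 decoder may
reject.  Unless there is a serious bottleneck saving and recovering
your Unicode strings that depends on their length, the padding and
unpadding may well eat any gain that halving the Unicode strings'
length would give.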
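
A minimal sketch of the latin-1 round trip, written for current
Python 3 (where marshal.dumps returns bytes and the Unicode type is
str) rather than the Python 2 of this thread.  The helper names
to_text and from_text are made up for illustration; only
marshal.dumps, marshal.loads and the "latin-1" codec come from the
standard library.

```python
import marshal


def to_text(obj):
    """Marshal obj and decode the bytes as latin-1 (hypothetical helper)."""
    return marshal.dumps(obj).decode("latin-1")


def from_text(text):
    """Encode the text back to bytes with latin-1 and unmarshal it."""
    return marshal.loads(text.encode("latin-1"))


# latin-1 maps every byte value to the code point with the same number,
# so any byte string survives the decode/encode round trip unchanged.
all_bytes = bytes(range(256))
assert all_bytes.decode("latin-1").encode("latin-1") == all_bytes

# utf-8, by contrast, rejects byte sequences that are not valid utf-8.
try:
    b"\xff\xfe".decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Round-trip a marshalled code object through a Unicode string.
code = compile("x = 6 * 7", "<example>", "exec")
text = to_text(code)
namespace = {}
exec(from_text(text), namespace)
```

The Unicode string produced this way has one character per marshalled
byte, so it is no shorter than the original data, but the conversion
in each direction is a single codec call with no padding step.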


Alex




