Quickest marshal.loads from unicode?

Fri Feb 21 13:19:12 EST 2003

giles_brown at hotmail.com (Giles Brown) wrote in message news:<57de9986.0302210118.4288f436 at posting.google.com>...
> Problem
> -------
> How can I take the result of a marshal.dumps call and decode
> (encode?) it into a unicode string so that the conversion back to a
> marshal.loads compatible string is as quick as possible?

Alex's post has prompted me to describe my problem more accurately.
Some unicode encodings, utf-8 for instance, cannot cope with the
binary data that you get in a marshalled piece of code.  Is there an
encoding that will always work for any binary data "string" that a
marshal.dumps call might generate?

For example, in the following code, is it just a fluke that latin-1
et al. appear to work? If it is, how can I achieve the desired effect?

import marshal

# Names available to the evaluated expressions.
names = { 'a' : 3.0, 'b' : 4.5 }

for expression in ["-2.0", "abs(-2.0)"]:
    for codec in 'utf-8', 'latin-1', 'iso-8859-1', 'raw-unicode-escape':
        try:
            # Compile and marshal the expression into a byte string.
            code = compile(expression, "<string>", "eval")
            dump = marshal.dumps(code)
            # Round-trip the bytes through a unicode string.
            decoded = dump.decode(codec)
            encoded = decoded.encode(codec)
            # Rebuild the code object and evaluate it.
            rebuilt = marshal.loads(encoded)
            result = eval(rebuilt, names, {})
            print "%s, %s -> %s" % (codec, expression, result)
        except Exception, e:
            print "%s, %s -> %s" % (codec, expression, e)

Produces the following output:

utf-8, -2.0 -> -2.0
latin-1, -2.0 -> -2.0
iso-8859-1, -2.0 -> -2.0
raw-unicode-escape, -2.0 -> -2.0
utf-8, abs(-2.0) -> UTF-8 decoding error: unexpected code byte
latin-1, abs(-2.0) -> 2.0
iso-8859-1, abs(-2.0) -> 2.0
raw-unicode-escape, abs(-2.0) -> 2.0
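
One way to check whether this is a fluke is to round-trip every
possible byte value through each codec. The following is only a
minimal sketch, and it assumes a Python 3 interpreter where
marshal.dumps returns bytes (the code above is Python 2). The helper
name round_trips is made up for this illustration. latin-1 maps each
byte value 0x00-0xFF to the code point with the same number, so it
should preserve any byte string, whereas raw-unicode-escape interprets
\uXXXX sequences and so may not.

```python
import marshal

# Every possible byte value, 0x00 through 0xFF.
all_bytes = bytes(bytearray(range(256)))

def round_trips(data, codec):
    """Return True if data survives decode followed by encode with codec."""
    try:
        return data.decode(codec).encode(codec) == data
    except UnicodeError:
        return False

print(round_trips(all_bytes, "latin-1"))               # True
print(round_trips(all_bytes, "utf-8"))                 # False
# A literal backslash-u sequence is rewritten by raw-unicode-escape.
print(round_trips(b"\\u0041", "raw-unicode-escape"))   # False

# The same round trip applied to a marshalled code object.
code = compile("abs(-2.0)", "<string>", "eval")
dump = marshal.dumps(code)
rebuilt = marshal.loads(dump.decode("latin-1").encode("latin-1"))
print(eval(rebuilt, {}, {}))                           # 2.0
```

If this holds, latin-1 (iso-8859-1 is the same codec under another
name) working here is not an accident, while raw-unicode-escape only
works as long as the marshalled data happens to contain no \uXXXX
sequence.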

Sorry for missing the target on first asking.
Giles