[Tutor] pickle in unicode format
Danny Yoo
dyoo at hkn.eecs.berkeley.edu
Tue Apr 5 21:02:01 CEST 2005
> Are you trying to send it off to someone else as a part of an XML
> document? If you are including some byte string into an XML document,
> you can encode those bytes as base64:
>
> ######
> >>> bytes = 'Fran\xe7ois'
> >>> encodedBytes = bytes.encode('base64')
> >>> encodedBytes
> 'RnJhbudvaXM=\n'
> ######
[note: this is an example of exploratory programming with Python.]
As a followup to this: this does appear to be a standard technique for
encoding binary data in XML. Apple does this in their property list
implementation.
For example, in Apple's reference documentation on plists:
http://developer.apple.com/documentation/CoreFoundation/Conceptual/CFPropertyLists/index.html
they use an example where they encode the following bytes:
/******/
// Fake data to stand in for a picture of John Doe.
const unsigned char pic[kNumBytesInPic] = {0x3c, 0x42, 0x81,
0xa5, 0x81, 0xa5, 0x99, 0x81, 0x42, 0x3c};
/******/
into an ASCII string. That string looks like this:
######
<data>
PEKBpYGlmYFCPA==
</data>
######
and although they don't explicitly say so, we can infer that the data
went through a base64 encoding, because when we decode that string as
base64:
######
>>> mysteryText = " PEKBpYGlmYFCPA=="
>>> mysteryText.decode("base64")
'<B\x81\xa5\x81\xa5\x99\x81B<'
>>>
>>>
>>> [hex(ord(byte)) for byte in mysteryText.decode('base64')]
['0x3c', '0x42', '0x81', '0xa5', '0x81', '0xa5', '0x99', '0x81', '0x42',
'0x3c']
######
we get the same bytes back.
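The sessions above lean on the 'base64' string codec, which str objects in Python 3 no longer offer. As a rough sketch, assuming Python 3 and the standard base64 module, the same round trip might look like the following (the variable names are only for illustration):

```python
import base64

# The ten bytes from Apple's example, written as a Python 3 bytes object.
pic = bytes([0x3c, 0x42, 0x81, 0xa5, 0x81, 0xa5, 0x99, 0x81, 0x42, 0x3c])

# b64encode returns bytes; decode to ASCII text so it can sit in an XML document.
encoded = base64.b64encode(pic).decode("ascii")

# b64decode takes the text back to the original bytes.
decoded = base64.b64decode(encoded)

# The byte string from the earlier message, encoded the same way.
# Note that b64encode does not append the trailing newline the old codec added.
name_encoded = base64.b64encode(b"Fran\xe7ois").decode("ascii")

print(encoded)
print([hex(b) for b in decoded])
print(name_encoded)
```

Iterating over a bytes object in Python 3 yields integers, so the ord() call from the session above is not needed here.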
(Actually, Apple's documentation does briefly mention that they do use
base64 by default, in:
http://developer.apple.com/documentation/WebObjects/Reference/API5.2.2/com/webobjects/foundation/xml/NSXMLObjectOutput.html#setUseBase64ForBinaryData(boolean)
but that's really obscure. *grin*)
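For anyone who wants to see the <data> element produced from Python rather than from Apple's C code, here is a small sketch, assuming Python 3 and its standard plistlib module. The dictionary key "pic" is made up for this example and is not taken from Apple's documentation:

```python
import plistlib

# Same ten stand-in bytes as in Apple's example.
pic = bytes([0x3c, 0x42, 0x81, 0xa5, 0x81, 0xa5, 0x99, 0x81, 0x42, 0x3c])

# dumps() writes an XML property list by default; bytes values become
# <data> elements whose text is base64.
# "pic" is a hypothetical key chosen for this sketch.
xml_bytes = plistlib.dumps({"pic": pic})
print(xml_bytes.decode("utf-8"))

# loads() reverses the process and hands back the original bytes.
restored = plistlib.loads(xml_bytes)["pic"]
print(restored == pic)
```

The printed XML should contain the same PEKBpYGlmYFCPA== text that appears in Apple's example.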
Anyway, hope that was interesting to folks!