[Tutor] Writing to XML file with minidom

Danny Yoo dyoo at hkn.eecs.berkeley.edu
Wed Aug 31 21:00:31 CEST 2005


> > One snag that I found is that the DES encryption that I used for the
> > data that is written back is not parsed correctly when the file is
> > read again with the new data in it. There are non-printable or
> > non-ASCII characters in it that cause errors from expat when the
> > contents are parsed. I had to use a different encryption algorithm.
> > I am going to do some tests on it now.
>
> Put the cyphertext in a CDATA section, so the parser knows to ignore
> its contents:
>
> <?xml version="1.0"?>
> <root>
>     <cyphertext><![CDATA[
>         ^KI^[?+?6?
>     ]]>
>     </cyphertext>
> </root>


Hi Travis,


Putting raw binary bytes in an XML file has a flaw: the bytes themselves
might contain sequences that get interpreted as XML markup!  Even if we
wrap the content in CDATA, there's nothing that stops the bytes from
containing the characters "]]>", which would close off the CDATA section
prematurely.  On top of that, XML 1.0 forbids most control characters
everywhere in a document, CDATA sections included.
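
Here's a small sketch of that failure, using minidom (which sits on top of
expat).  The helper name parses_cleanly and the sample payloads are made up
for illustration; they are not from Travis's program.  It wraps a payload
in a CDATA section and reports whether the parser accepts the result:

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def parses_cleanly(payload):
    """Wrap payload in a CDATA section and report whether expat accepts it.

    Hypothetical helper, for illustration only.
    """
    document = (
        "<root><cyphertext><![CDATA["
        + payload
        + "]]></cyphertext></root>"
    )
    try:
        minidom.parseString(document)
        return True
    except ExpatError:
        return False


# Ordinary text inside CDATA is fine.
harmless = parses_cleanly("abc")

# A payload that happens to contain "]]>" ends the CDATA section early,
# and whatever follows is then read as (broken) markup.
broken = parses_cleanly("abc]]><oops")

# A control character such as \x0b is rejected even inside CDATA.
control_char = parses_cleanly("\x0b")

print(harmless, broken, control_char)
```

So CDATA only helps when we already know the payload is well-behaved text,
which is exactly what we can't promise for cyphertext.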

To get around this, we can use a technique called "ASCII armor" to wrap
protection around the troublesome binary data.

    http://en.wikipedia.org/wiki/ASCII_armor

Python comes with a common ASCII-armoring algorithm called "base64", and
it's actually very easy to use.  Let's do a quick example.


Let's say we have some unprintable binary bytes, like this:

######
>>> someBytes = open('/usr/bin/ls', 'rb').read(8)
>>> someBytes
'\x7fELF\x01\x01\x01\x00'
######

(Hey, look, an Elf!  *grin*)

Anyway, this ELF will probably pass poorly through email (or through an
XML parser), because the bytes surrounding it are pretty weird.  But we
can apply base64 encoding to those bytes:

######
>>> encodedBytes = someBytes.encode('base64')
>>> encodedBytes
'f0VMRgEBAQA=\n'
######

And now it's in a form that should pass cleanly through.  Decoding it is
also a fairly easy task:

######
>>> encodedBytes.decode('base64')
'\x7fELF\x01\x01\x01\x00'
######

And now we've got our ELF back.
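
The 'base64' string codec used above only exists in Python 2.  In Python 3
the same idea goes through the standard base64 module.  Here's a rough
sketch of the whole round trip through minidom, under some assumptions:
the element names follow the <root>/<cyphertext> example quoted earlier,
and the function names write_armored and read_armored, as well as the
sample bytes, are invented for illustration.

```python
import base64
from xml.dom import minidom

# Hypothetical sample "cyphertext": the ELF header bytes from above, plus
# a "]]>" and a control character that would break a raw CDATA section.
cipher_bytes = b"\x7fELF\x01\x01\x01\x00]]>\x0b"


def write_armored(raw):
    """Build a small XML document holding raw as base64 text."""
    doc = minidom.Document()
    root = doc.createElement("root")
    doc.appendChild(root)
    node = doc.createElement("cyphertext")
    root.appendChild(node)
    # b64encode returns bytes; decode to str so it can go in a text node.
    encoded = base64.b64encode(raw).decode("ascii")
    node.appendChild(doc.createTextNode(encoded))
    return doc.toxml()


def read_armored(xml_text):
    """Parse the document and recover the original bytes."""
    doc = minidom.parseString(xml_text)
    node = doc.getElementsByTagName("cyphertext")[0]
    return base64.b64decode(node.firstChild.data)


xml_text = write_armored(cipher_bytes)
recovered = read_armored(xml_text)
print(xml_text)
print(recovered == cipher_bytes)
```

Since base64 output uses only letters, digits, "+", "/" and "=", nothing in
it can be mistaken for markup, so no CDATA section is needed at all.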


Hope this helps!


