elementtree.ElementTree barfs on my Safari Cookies file
Fredrik Lundh
fredrik at pythonware.com
Sun Nov 13 04:14:11 EST 2005
skip at pobox.com wrote:
> Safari stores its cookies in XML format. Looking to try and add support for
> it to cookielib I started by first trying to parse it with Fredrik Lundh's
> elementtree package. It complained about an invalid token. Looking at the
> spot it indicated in the file, I found a non-ASCII, but (as far as I can
> tell) perfectly valid utf-8 string.
xml.dom.minidom gives the same error, so it's not a problem with
elementtree in itself.
the problematic element contains the bytes:
'\xc3\x804\x01@`'
which, decoded as UTF-8, give
'\xc04\x01@`'
which contains chr(1) (U+0001), which is not a valid XML character
(at least in XML 1.0).
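The diagnosis above can be reproduced with a short sketch. It is written for Python 3 (the original post predates it), and the `<string>` wrapper is only a stand-in for the real Cookies file, whose structure is not shown here. The regular expression is the complement of the `Char` production in the XML 1.0 specification; the names `NOT_XML10_CHAR` and `rejected_by` are made up for illustration.

```python
import re
import xml.dom.minidom
import xml.etree.ElementTree as ET
from xml.parsers.expat import ExpatError

# The bytes quoted in the post, decoded as UTF-8.
raw = b"\xc3\x804\x01@`"
text = raw.decode("utf-8")

# Anything NOT matched by the XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
NOT_XML10_CHAR = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

# Positions and code points of the offending characters.
bad = [(m.start(), hex(ord(m.group()))) for m in NOT_XML10_CHAR.finditer(text)]

# Assumption: a minimal wrapper element instead of the real file.
doc = b"<string>" + raw + b"</string>"

def rejected_by(parse):
    """Return True if the given parse function refuses the document."""
    try:
        parse(doc)
    except (ET.ParseError, ExpatError):
        return True
    return False

et_rejects = rejected_by(ET.fromstring)
minidom_rejects = rejected_by(xml.dom.minidom.parseString)
```

Here `bad` should report a single U+0001 at index 2, and both parsers should reject the document, since both sit on top of expat.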
that Apple's tools are able to generate bogus XML is a known problem; for
a discussion and some workarounds, see the "Status of XML 1.1 processing
in Python" thread over at the xml-sig mailing list:
http://aspn.activestate.com/ASPN/Mail/Message/xml-sig/2792071
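One possible workaround, which is not necessarily one of those discussed in that thread, is to drop the characters XML 1.0 forbids before handing the data to the parser. The sketch below is Python 3 and assumes the file is UTF-8. `parse_lenient` is a hypothetical helper, not part of elementtree. Note that it silently discards data, which may or may not be acceptable for cookie values.

```python
import re
import xml.etree.ElementTree as ET

# Complement of the XML 1.0 Char production.
_NOT_XML10_CHAR = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

def parse_lenient(data, encoding="utf-8"):
    """Parse XML bytes after removing characters that are illegal in XML 1.0.

    Hypothetical helper: assumes `data` is encoded in `encoding`.
    """
    text = data.decode(encoding)
    cleaned = _NOT_XML10_CHAR.sub("", text)
    # Re-encode so any encoding declaration in the document still applies.
    return ET.fromstring(cleaned.encode(encoding))

# Stand-in element carrying the bytes quoted in the post.
elem = parse_lenient(b"<string>\xc3\x804\x01@`</string>")
```

With the U+0001 removed, `elem.text` should be the remaining four characters, `'\xc04@`'`.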
</F>