parsing an xml document with funky non-ascii characters

Jason Orendorff jason at jorendorff.com
Mon Feb 4 05:11:58 CET 2002


andrew writes:
> How do I deal with xml documents with characters like 'ä'?
>
> I have tried:
> 	- setting encoding="ISO-8859-1" in the xml doc itself

I put this in mine:

<?xml version="1.0" encoding="ISO-8859-1"?>

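and it works.  It should work for you too, unless there's
a bug in the XML parser.  Note that the <?xml declaration must
be the very first thing in the file, with no whitespace or blank
lines before it, and the file must actually be saved as
ISO-8859-1, though.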
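
Here is a rough sketch of that in Python, in case it helps.  It
assumes a current Python 3 and the standard xml.etree.ElementTree
module (not necessarily the parser you are using); the document
and the <name> element are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document.  The bytes really are ISO-8859-1, which is
# what the declaration claims, so 'ä' is the single byte 0xE4.
xml_text = '<?xml version="1.0" encoding="ISO-8859-1"?>\n<name>J\u00e4rvinen</name>'
xml_bytes = xml_text.encode("iso-8859-1")

# Parse from bytes so the parser reads the encoding from the declaration.
root = ET.fromstring(xml_bytes)
name = root.text
print(repr(name))

# The declaration has to come first: a single leading newline is
# enough to make the parser reject the document.
declaration_error = None
try:
    ET.fromstring(b"\n" + xml_bytes)
except ET.ParseError as exc:
    declaration_error = exc
print(declaration_error)
```

If the declared encoding and the real encoding of the file disagree,
you will typically get either a parse error or the wrong characters.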


> 	- escaping the character in the doc: ('\x84')

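This isn't a meaningful escape in XML, just 4 ordinary
text characters.  You could use a numeric character reference
instead, but those name Unicode code points, not bytes, so 'ä'
is &#xE4; (or &#228;).  &#x84; refers to U+0084, a control
character; 0x84 is 'ä' only in the old DOS code page 437.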
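
A small sketch of the difference, again assuming Python 3 and
xml.etree.ElementTree, with a made-up <name> element.  Note that
a numeric character reference names a Unicode code point, so the
reference for 'ä' is &#xE4;.

```python
import xml.etree.ElementTree as ET

# A backslash escape means nothing to XML: the parser hands back
# the four characters \, x, 8, 4 unchanged.
literal = ET.fromstring(r"<name>J\x84rvinen</name>").text
print(repr(literal))

# A numeric character reference is resolved by the parser to the
# Unicode code point it names, whatever the document encoding is.
reference = ET.fromstring("<name>J&#xE4;rvinen</name>").text
print(repr(reference))

# 0x84 maps to 'ä' only as a byte in DOS code page 437.
dos_byte = b"\x84".decode("cp437")
print(repr(dos_byte), hex(ord(dos_byte)))
```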

## Jason Orendorff    http://www.jorendorff.com/




