[XML-SIG] problems with encoding and SAX

Daniel Clerc clerc at uni-bremen.de
Thu Feb 9 14:09:23 CET 2006

Hi everybody!

I have some trouble with SAX and encodings.

When I try to parse the following XML-code:

<?xml version="1.0" encoding="WINDOWS-1252" ?>
<TRANSACTION TIME="03.04.2003 01:52:15" TIME_CODED="37714.0779513889"

I get this error message:

  File "C:\Python24\Lib\site-packages\_xmlplus\sax\handler.py", line 38, in fatalError
    raise exception
SAXParseException: xml_temp.xml:3766:13: not well-formed (invalid token)
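
The position in the message (line 3766, column 13) should point close to the offending input. A minimal sketch for looking at the raw bytes there, assuming a current Python 3, a file small enough to read into memory, and that the reported column is a 0-based offset into the line (which I have not verified for every parser version). The name context_at is hypothetical:

```python
def context_at(data, line_no, column, width=20):
    """Return the raw bytes around a reported parse position.

    data    -- the whole document as bytes
    line_no -- 1-based line number from the error message
    column  -- column from the error message, assumed to be 0-based
    width   -- how many bytes to show on each side
    """
    lines = data.split(b"\n")
    if not 1 <= line_no <= len(lines):
        return None  # the position is outside the document
    line = lines[line_no - 1]
    start = max(column - width, 0)
    return line[start:column + width]
```

Printing the result with repr() makes control characters and undefined bytes visible as escapes such as \x01 or \x81.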

Here you can find the Python code I use:


Maybe the content between the XML elements is not in the encoding
that the declaration specifies. As I have to parse quite a lot of log
files (~1GB zipped), and there are only a handful of such errors, I
would be very happy if I could find a way to tell SAX not to worry
and to write the string anyway.
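
One possible workaround, sketched under assumptions rather than tested on the real logs: clean the bytes before SAX sees them. The sketch assumes a current Python 3, that each file fits in memory, and that the errors come from bytes undefined in Windows-1252 or from control characters that XML 1.0 forbids. The names sanitize_xml and parse_lenient are hypothetical:

```python
import re
import xml.sax

# Anything outside the character ranges that XML 1.0 allows.
_ILLEGAL_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)
# The encoding pseudo-attribute of the XML declaration.
_ENCODING_DECL = re.compile(r"""(<\?xml[^>]*?encoding\s*=\s*)(["'])[^"']*\2""")


def sanitize_xml(raw, encoding="cp1252"):
    """Return UTF-8 bytes that no longer contain undecodable input.

    Bytes that are undefined in `encoding` become U+FFFD, characters
    that XML 1.0 forbids become "?", and the declaration is rewritten
    to say utf-8 so that it matches the returned bytes.
    """
    text = raw.decode(encoding, errors="replace")
    text = _ILLEGAL_XML_CHARS.sub("?", text)
    text = _ENCODING_DECL.sub(r'\1"utf-8"', text, count=1)
    return text.encode("utf-8")


def parse_lenient(raw, handler, encoding="cp1252"):
    """Feed the sanitized document to an ordinary SAX content handler."""
    xml.sax.parseString(sanitize_xml(raw, encoding), handler)
```

This only helps with bad characters. Markup that is broken in other ways (an unescaped "<" or "&", for example) would still raise SAXParseException, and the replaced characters are lost, so it may be worth logging which files needed cleaning.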

Parsing the XML code with the MS-XML-DOM or a Java-based parser is
not a problem, but I would prefer a solution in Python.


