[XML-SIG] checking syntax with xmllib

Brian Slesinsky bslesins@best.com
Tue, 27 Apr 1999 18:26:04 -0700 (PDT)


Hi, I tried using xmllib to check if an XML document is well-formed and
found some bugs.

If I use xmllib from Python 1.5.2, it complains about invalid characters.
However, I'm fairly sure the document is correctly encoded as UTF-8 (it
contains European characters and was converted from ISO-8859-1). It looks
like the 'illegal' regular expression in xmllib is incorrect.
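
To illustrate how a byte-oriented character check can reject valid UTF-8,
here is a minimal sketch. The pattern below is a hypothetical stand-in (an
assumption, not a quote of xmllib's source) that accepts only tab, CR, LF,
printable ASCII and the bytes 0xA0-0xFF. It is written for Python 3, where
xmllib is no longer available.

```python
import re

# Hypothetical stand-in for a byte-oriented "illegal character" pattern.
# It accepts tab, CR, LF, printable ASCII and bytes 0xA0-0xFF only.
# This is an assumption for illustration, not xmllib's actual pattern.
ILLEGAL = re.compile(rb"[^\t\r\n\x20-\x7e\xa0-\xff]")

def first_illegal_byte(data):
    """Return the first byte the pattern rejects, or None if all pass."""
    match = ILLEGAL.search(data)
    return match.group(0) if match else None

text = "Ü"                           # U+00DC, present in ISO-8859-1
latin1 = text.encode("iso-8859-1")   # b'\xdc'
utf8 = text.encode("utf-8")          # b'\xc3\x9c'

print(first_illegal_byte(latin1))    # the single Latin-1 byte is accepted
print(first_illegal_byte(utf8))      # the UTF-8 continuation byte is rejected
```

Under that assumption, the same character passes as ISO-8859-1 but fails as
UTF-8, because UTF-8 continuation bytes fall in the range 0x80-0xBF and part
of that range lies outside what the pattern allows.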

I also tried xml.parsers.xmllib from Python/XML 0.5.1, but it doesn't seem
to do any syntax checking at all: I tried a file with one close tag and it
didn't complain.

Here's the script I'm using to do the tests:

#!/nuvo/bin/python

import sys
from xml.parsers.xmllib import XMLParser

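def check_xml(file):
    x = XMLParser()
    f = open(file)

    # Feed the document to the parser one line at a time.
    while 1:
        line = f.readline()
        if line=="": break
        x.feed(line)

    # Tell the parser the input is complete so that errors which are
    # only detectable at end of document get reported.
    x.close()
    f.close()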

check_xml(sys.argv[1])
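
For comparison, the same well-formedness check can be sketched with the
expat bindings in the Python standard library. This is an alternative
technique, not the xmllib approach used above, and it assumes Python 3.
The helper name is_well_formed is made up for this example.

```python
import xml.parsers.expat

def is_well_formed(data):
    """Return (True, None) if data parses, else (False, error message).

    is_well_formed is a hypothetical helper name. Expat is a
    non-validating parser, so this checks well-formedness only.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this as the final chunk of input,
        # so errors visible only at end of document are raised too.
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as err:
        return False, str(err)
    return True, None

if __name__ == "__main__":
    samples = [
        "<doc>Ü</doc>".encode("utf-8"),  # non-ASCII text encoded as UTF-8
        b"</doc>",                       # a lone close tag
        b"<doc><item></doc>",            # mismatched tags
    ]
    for sample in samples:
        ok, message = is_well_formed(sample)
        print(sample, ok, message)
```

The lone close tag sample is one reading of the test file described above.
The expat parser reports it as an error, along with the mismatched tags,
while the UTF-8 sample is accepted.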


- Brian Slesinsky