xml.parsers.expat questions

Steve Wilson swwilso1 at home.com
Tue Nov 14 13:04:38 EST 2000


I've been tinkering with the expat XML parser for the last couple of
days and have noticed that the expat.ParseFile() method does not raise
expat errors in the same way as Parse().  I think it will raise some
StandardError subclasses such as TypeError and MemoryError.  However, if
the parser encounters an XML syntax error, ParseFile() quietly continues
without notifying the calling routine.

I can create a simple XML string:

<?xml version="1.0"?>
<!-- comment to test comment handler -->
<!DOCTYPE testdocument SYSTEM "testdocument.dtd" [
<!ELEMENT car (#PCDATA)>
<!NOTATION note "bub">
<!ENTITY house "HELLO WORLD">
]>
<root attr1="first" attr2="last">
<sns:element1 xmlns:sns="MYNAMESPACE">
  List of Name Spaces
</sns:element1>
<sns:element2 xmlns:wow="ANOTHERNAMESPACE" a1="a1value1">Another
interesting thing</sns:element2>
Car, Game off 
</root>

This string has an intentional error in the NOTATION declaration.  If I
parse the string with Parse(), the parser raises an error that a
try/except block will catch.  If I instead save the string to a file and
call ParseFile(), then try/except will not catch the error, although I
can still check the ErrorCode attribute of the parser to detect the
problem.
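
The ErrorCode check described above can be sketched roughly as follows.
This is only an illustration, written for a current Python 3 interpreter
rather than the Python 2.0 build discussed here.  The names
parse_file_checked and BAD_XML are hypothetical, and BAD_XML is a
trimmed-down stand-in for the document above that keeps only the faulty
NOTATION declaration.  On current interpreters ParseFile() raises
ExpatError itself, so the explicit ErrorCode test is just a defensive
fallback for a build that behaves as described in this message.

```python
import io
import xml.parsers.expat as expat

# Trimmed-down test document (an assumption, not the full original):
# the NOTATION declaration lacks the SYSTEM or PUBLIC keyword.
BAD_XML = b"""<?xml version="1.0"?>
<!DOCTYPE testdocument [
<!NOTATION note "bub">
]>
<root/>
"""


def parse_file_checked(fileobj):
    """Parse fileobj with ParseFile() and report any expat error.

    Returns None on success, or an (error_code, message) tuple.
    """
    parser = expat.ParserCreate()
    try:
        parser.ParseFile(fileobj)
    except expat.ExpatError as err:
        # Normal path: the error is raised as an exception.
        return err.code, expat.ErrorString(err.code)
    # Fallback: no exception was raised, so inspect ErrorCode by hand.
    code = parser.ErrorCode
    if code:
        return code, expat.ErrorString(code)
    return None


# ParseFile() only needs an object with a read() method, so an
# in-memory byte stream stands in for a file on disk.
result = parse_file_checked(io.BytesIO(BAD_XML))
print("parse result:", result)
```

Returning the code and message instead of re-raising keeps the two
detection paths (exception and ErrorCode) behind a single interface, so
the caller does not need to know which one fired.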

I think I have isolated the problem in:

static PyObject *
xmlparse_ParseFile(xmlparseobject *self, PyObject *args)

of Python-2.0/Modules/pyexpat.c.

My questions are:

1.  Is it reasonable to expect ParseFile() to generate exceptional
conditions in the same manner as Parse() with respect to problems in the
xml stream?

2.  If so, who maintains the pyexpat module and how should I best
communicate my suggested fix for the problem?

Feel free to respond directly to my email:  stevew at wolfram.com

Thanks,

Steve Wilson


