[XML-SIG] broken expat module in PyXML-0.6.3

Radestock, Guenter guenter.radestock@sap.com
Thu, 15 Feb 2001 15:34:35 +0100


I have tried to find the problem in the expat parser module that
comes with PyXML-0.6.3. It makes Python crash on Windows when an
exception is thrown while parsing malformed input such as:

from xml.parsers import expat
import sys
po = expat.ParserCreate('ISO-8859-1')
po.Parse(u'<?xml encoding="iso-8859-1" ?><test></test>', 1)

(The version is missing from the XML declaration in the above example.)
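
For comparison, a working expat build should reject this input with
an ordinary Python exception rather than crash. The following is only
a minimal sketch, assuming a current Python 3 interpreter and its
bundled pyexpat (not the PyXML-0.6.3 build discussed here); the helper
name try_parse is made up for illustration, and the input is passed as
bytes rather than as a unicode string.

```python
from xml.parsers import expat


def try_parse(data):
    """Return None if data parses cleanly, else the ExpatError raised.

    try_parse is a hypothetical helper, not part of expat or PyXML.
    """
    parser = expat.ParserCreate('ISO-8859-1')
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
    except expat.ExpatError as err:
        return err
    return None


# Same document as above: the XML declaration lacks the version.
bad = b'<?xml encoding="iso-8859-1" ?><test></test>'
# Same document with the version added.
good = b'<?xml version="1.0" encoding="iso-8859-1" ?><test></test>'

err = try_parse(bad)
if err is not None:
    # ErrorString maps the numeric error code to expat's message.
    print("parse failed:", expat.ErrorString(err.code))
```

If the interpreter survives this and reports a parse error for the
first document only, the extension in use handles the error path
correctly.
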
I have found the following:

1. The problems go away if you remove the
_xmlplus/parsers/pyexpat.pyd extension.  Then the extension
supplied with Python 2.0 will be used.  Because this has fewer
features, things like SAX2 will probably not work any more,
but xml.parsers.expat will be usable, as well as features of
the XML package that do not require expat.

2. in the file pyexpat.c, the variable "ErrorObject" is not
initialized (there is a test for null in the init method
of the module).  This is clearly a bug, but unfortunately
not the (only) source of the problem.  ErrorObject should
be declared as:

static PyObject *ErrorObject = NULL;

3. Inserting debug prints into the function xmlparse_Parse()
shows that the pointer ErrorObject gets destroyed while
parsing the incorrect XML.  It does not get destroyed when
correct XML is parsed.

4. If I put the line

static int *willNotBeUsed;

immediately after the declaration of ErrorObject, the module
becomes more stable: it no longer crashed in my tests.
This cannot be the solution, though.  I have no idea right now
how to get this straight and have little experience in debugging,
so I would appreciate it a lot if somebody else could look into
this.  Could this be a problem with expat itself and not the
module?
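
When testing the workaround from point 1, it helps to confirm which
pyexpat extension the interpreter actually picked up. The sketch below
is an assumption-laden illustration: it is written for a current
Python 3, the helper name loaded_pyexpat_modules is made up, and the
module names and paths it reports will differ on a Python 2.0 plus
PyXML installation.

```python
import sys
from xml.parsers import expat  # importing this pulls in pyexpat


def loaded_pyexpat_modules():
    """Map each loaded module named 'pyexpat' to the file it came from.

    Hypothetical helper for illustration only.
    """
    found = {}
    for name, module in list(sys.modules.items()):
        if module is not None and name.split('.')[-1] == 'pyexpat':
            # Statically linked modules have no __file__ attribute.
            found[name] = getattr(module, '__file__', '(built in)')
    return found


for name, path in loaded_pyexpat_modules().items():
    print(name, "loaded from", path)
# Version string of the expat library the extension was built against.
print("expat library version:", expat.EXPAT_VERSION)
```
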

- Guenter