[XML-SIG] More on Python DTD parser?

lpd at major2nd.com lpd at major2nd.com
Mon Apr 5 04:48:18 CEST 2010


Dear XML-SIG,

I'm sorry to impose on you, but I've had a very frustrating afternoon trying
to report a couple of Python expat bugs on the PSF bug tracker.  SourceForge
appears to have lost all of my account data (for the second time), and when
I tried to register separately on the bug tracker site, the registration
process said "An unexpected error occurred during the processing of your
message" and failed to complete.

I'm using the Ubuntu Linux 8.04 distribution, which includes Python 2.5.2.
The libexpat1 version is 2.0.1-0ubuntu1.1 (hardy-updates), but I don't know
whether Python uses this or includes its own copy of expat.
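
For anyone trying to answer the same question, a minimal sketch: the pyexpat module exposes the version of the expat library it is actually running against. This only reports the version. It does not by itself say whether that library is a bundled copy or the system libexpat1, though comparing it with the package version is a reasonable hint.

```python
import xml.parsers.expat as expat

# Version string reported by the expat library that pyexpat is using.
print(expat.EXPAT_VERSION)

# The same version as a (major, minor, micro) tuple of integers.
print(expat.version_info)
```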

The smaller problem -- but one that still led me to waste a fair bit of time
-- is that the SetParamEntityParsing method of xmlparser objects is simply
missing from the documentation of xml.parsers.expat.  Unfortunately, the
default is to not parse parameter entities, even when reading external DTDs,
so calling this method is required for DTDs that use parameter entities.  I
finally discovered this method by going to the expat Web site and looking at
the C API.

I checked the Python doc for 2.6.5, and this method is still missing.
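
As a rough illustration of the call described above, here is a minimal sketch. The document text and the helper name element_decls are made up for this example, and it uses a parameter entity in the internal DTD subset so that it stays self-contained. Reading an external DTD would, as far as I understand it, additionally need an ExternalEntityRefHandler, which is not shown here.

```python
import xml.parsers.expat as expat

# Hypothetical test document: the only element declaration is hidden
# inside the parameter entity %decls; in the internal DTD subset.
DOC = """<!DOCTYPE doc [
  <!ENTITY % decls "<!ELEMENT doc (#PCDATA)>">
  %decls;
]>
<doc>hello</doc>
"""

def element_decls(text, parse_param_entities):
    """Return the element names reported through ElementDeclHandler."""
    names = []
    parser = expat.ParserCreate()
    parser.ElementDeclHandler = lambda name, model: names.append(name)
    if parse_param_entities:
        # The undocumented method: ask expat to expand parameter entities.
        parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)
    parser.Parse(text, True)
    return names

# Compare the default behaviour with parameter entity parsing enabled.
print(element_decls(DOC, False))
print(element_decls(DOC, True))
```

The other flag values exposed by the module are XML_PARAM_ENTITY_PARSING_NEVER and XML_PARAM_ENTITY_PARSING_UNLESS_STANDALONE.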

The larger problem is that often (but not always) when the Parse() method of
xmlparser returns after completely parsing a file, something happens at the
implementation level that results in a completely bogus Python error
"TypeError: An integer is required."  The error may occur a few Python
statements later, which suggests to me that it is a memory bookkeeping
problem of some kind, but I have no idea how to track it down.  However, it
is totally repeatable, and I can provide a very simple example (a 220-line
Python driver, most of which isn't executed, and a 7-line DTD) that triggers
the problem.

I checked the PSF bug tracker, and I thought that this might be the same bug
as #6676, but my test case doesn't call ParseFile more than once on the
same parser instance.

I would really prefer not to upgrade to a later Python version, especially
not to 2.6 or later, but if this bug has been fixed, I'm willing to consider
it.

As soon as the registration issue gets cleared up, I'll report these issues
properly, but meanwhile, I was wondering if either of them (especially the
execution error, which has me stalled right now) rings a bell with anyone.

		Thanks -

						L Peter Deutsch

