[XML-SIG] turning off DTD checker

Radovan Chytracek Radovan.Chytracek at cern.ch
Mon Oct 27 05:15:15 EST 2003


> Paul Tremblay <phthenry at earthlink.net> writes:
> 
> > Is there a way to run SAX if the DTD in the document points to a URL,
> > and the URL cannot be retrieved?
> 
> You need to turn off processing of external general entities:
> 
> p.setFeature("http://xml.org/sax/features/external-general-entities", False)
> 
> In that case, the parser won't attempt to resolve the 
> external DTD subset, or references to external entities 
> defined in the internal DTD subset.
> 
> Regards,
> Martin
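
As a rough illustration of the quoted advice, here is a minimal sketch using the xml.sax module from the Python standard library rather than PyXML itself. The handler class, the function name and the unreachable DTD URL are made up for the example; feature_external_ges is the standard library constant for the feature URI quoted above.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges


class ElementNames(ContentHandler):
    """Hypothetical handler that records element names in document order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parse_ignoring_external_entities(xml_bytes):
    """Parse xml_bytes without fetching the external DTD subset."""
    parser = xml.sax.make_parser()
    # Same feature as "http://xml.org/sax/features/external-general-entities".
    parser.setFeature(feature_external_ges, False)
    handler = ElementNames()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.names


# Placeholder document whose DTD URL cannot be retrieved (assumption for
# the example, not taken from the original thread).
SAMPLE = (
    b'<!DOCTYPE doc SYSTEM "http://unreachable.invalid/doc.dtd">\n'
    b"<doc><item/></doc>"
)

names = parse_ignoring_external_entities(SAMPLE)
```

With the feature switched off, the parse should complete without any network access, at the cost of losing whatever the DTD would have declared.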

Well, this solves only the problem of parsing an XML document
successfully whether or not its DTD is accessible. It might not be the
correct approach, because even when parsing in non-validating mode the
DTD might still define entities that are important to the document
structure.

On the other hand, nobody has answered my question, where I needed to
guarantee that parsing an XML document succeeds, including references to
all external entities, even when the DTD is not physically accessible.
This is required for a disconnected mode of operation, where the user
has no network connection available, on a laptop for example. The
solution is to use an EntityResolver, which for some reason does not
really work as expected. It does work if one enables it, but my feeling
is that it has been intentionally disabled for some reason, so even if
one implements and registers an EntityResolver it is not called, and
rather unintuitive exceptions are raised instead.
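
To make the intended setup concrete, here is a minimal sketch of an offline EntityResolver, again written against the standard library xml.sax module rather than PyXML, so it does not reproduce the PyXML behaviour described above. The class names, the LOCAL_COPIES mapping, the DTD URL and the greeting entity are all hypothetical. Note that external general entity processing is enabled explicitly here, since the resolver is not consulted when that feature is off.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource

# Hypothetical system identifier and local copy of the DTD it names.
DTD_URL = "http://unreachable.invalid/doc.dtd"
LOCAL_COPIES = {DTD_URL: b'<!ENTITY greeting "hello">'}


class OfflineResolver(EntityResolver):
    """Serve external entities from an in-memory mapping, never the network."""

    def __init__(self, local_copies):
        self.local_copies = local_copies
        self.requested = []  # system identifiers the parser asked for

    def resolveEntity(self, publicId, systemId):
        self.requested.append(systemId)
        source = InputSource(systemId)
        # Raises KeyError for identifiers without a local copy.
        source.setByteStream(io.BytesIO(self.local_copies[systemId]))
        return source


class TextCollector(ContentHandler):
    """Hypothetical handler that gathers character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_offline(xml_bytes, resolver):
    parser = xml.sax.make_parser()
    # The resolver is only consulted when this feature is enabled.
    parser.setFeature(feature_external_ges, True)
    parser.setEntityResolver(resolver)
    handler = TextCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(handler.chunks)


SAMPLE = (
    b'<!DOCTYPE doc SYSTEM "' + DTD_URL.encode("ascii") + b'">\n'
    b"<doc>&greeting;</doc>"
)

resolver = OfflineResolver(LOCAL_COPIES)
text = parse_offline(SAMPLE, resolver)
```

The resolver records every system identifier it is asked for, which gives a simple way to check whether the parser calls it at all.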

I would like to know what the showstopper is that keeps EntityResolver
from working properly. I suspect some infrastructure is missing between
the SAX API and the parsers in PyXML.

Cheers

            Radovan


