expat and entity references

Martin v. Löwis loewis at informatik.hu-berlin.de
Mon Apr 29 07:44:50 EDT 2002


clgonsal at alumni.uwaterloo.invalid (C. Laurence Gonsalves) writes:

> I'm working on a Python application that uses xml.parsers.expat to
> parse XML files.  These XML files are generally hand-written. I'd like
> to be able to support more than just amp, lt, gt, apos and quot.
> 
> What's the best way to do this without requiring that every XML file
> contain a huge header declaring all of the entities I want to support? 

If you are using expat directly, you need to implement an
ExternalEntityRef handler. In that handler, you can either put the
entity's replacement text straight into your processing results, or
call ExternalEntityParserCreate to create a "nested parser"; the
latter is necessary if the external entity may contain markup.
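
A minimal sketch of the nested-parser variant, written against the
current Python 3 pyexpat API. It assumes that each document carries a
one-line DOCTYPE pointing at a shared DTD; the names SHARED_DTD,
shared.dtd and parse_shared are made up for illustration, and the
entity declarations are kept in memory rather than read from a file.

```python
import xml.parsers.expat as expat

# Hypothetical shared entity declarations (would normally live in a file).
SHARED_DTD = b'<!ENTITY nbsp "&#160;">\n<!ENTITY copy "&#169;">\n'


def parse_shared(data):
    """Parse data and return its character data with entities expanded."""
    chunks = []
    parser = expat.ParserCreate()
    parser.CharacterDataHandler = chunks.append

    def external_entity_ref(context, base, system_id, public_id):
        # Nested parser: it shares the DTD with the outer parser, so the
        # entities declared here become known to the main document.
        sub = parser.ExternalEntityParserCreate(context)
        sub.Parse(SHARED_DTD, True)
        return 1  # non-zero tells expat the entity was handled

    parser.ExternalEntityRefHandler = external_entity_ref
    # Without this, expat does not ask for the external DTD subset.
    parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)
    parser.Parse(data, True)
    return "".join(chunks)


doc = b'<!DOCTYPE doc SYSTEM "shared.dtd">\n<doc>A&nbsp;B &copy; 2002</doc>'
print(repr(parse_shared(doc)))
```

The handler ignores system_id here and always serves the same
declarations; a real application would probably map system_id to a
file. If even the DOCTYPE line is too much, later pyexpat versions
offer a UseForeignDTD method which, as far as I know, makes expat call
the handler even for documents without a DOCTYPE.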

If you are using SAX, you need to set an EntityResolver with the
parser, and return an InputSource in response to the resolveEntity
callback.
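
The SAX route might look roughly like this in current Python 3. The
resolver and handler class names, SHARED_DTD and shared.dtd are
assumptions for the example. Note that recent Python versions leave
feature_external_ges switched off by default, so the sketch enables it
explicitly.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource

# Hypothetical shared entity declarations, kept in memory for the example.
SHARED_DTD = b'<!ENTITY nbsp "&#160;">'


class SharedDTDResolver(EntityResolver):
    def resolveEntity(self, publicId, systemId):
        # Serve the shared declarations regardless of the requested ID.
        source = InputSource(systemId)
        source.setByteStream(io.BytesIO(SHARED_DTD))
        return source


class TextCollector(ContentHandler):
    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_with_sax(data):
    parser = xml.sax.make_parser()
    parser.setFeature(feature_external_ges, True)
    parser.setEntityResolver(SharedDTDResolver())
    collector = TextCollector()
    parser.setContentHandler(collector)
    parser.parse(io.BytesIO(data))
    return "".join(collector.chunks)


DOC = b'<!DOCTYPE doc SYSTEM "shared.dtd"><doc>A&nbsp;B</doc>'
print(repr(parse_with_sax(DOC)))
```

Because the resolver returns an in-memory stream, nothing is fetched
from disk or the network for the "shared.dtd" system ID.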

If you are using minidom.parse, you need to first create an
appropriately configured SAX parser, and pass it to minidom.parse as
the parser argument, so that it is used to build the minidom tree.
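
For minidom, a sketch along the same lines, again with made-up names
(SharedDTDResolver, SHARED_DTD, parse_to_dom) and assuming the
documents carry a short DOCTYPE line:

```python
import io
import xml.sax
from xml.dom import minidom
from xml.sax.handler import EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource

# Hypothetical shared entity declarations, kept in memory for the example.
SHARED_DTD = b'<!ENTITY nbsp "&#160;">'


class SharedDTDResolver(EntityResolver):
    def resolveEntity(self, publicId, systemId):
        source = InputSource(systemId)
        source.setByteStream(io.BytesIO(SHARED_DTD))
        return source


def parse_to_dom(stream):
    # Configure the SAX parser first, then let minidom build the tree with it.
    parser = xml.sax.make_parser()
    parser.setFeature(feature_external_ges, True)
    parser.setEntityResolver(SharedDTDResolver())
    return minidom.parse(stream, parser)


dom = parse_to_dom(
    io.BytesIO(b'<!DOCTYPE doc SYSTEM "shared.dtd"><doc>A&nbsp;B</doc>')
)
# The text may arrive as several adjacent text nodes, so join them.
text = "".join(
    node.data
    for node in dom.documentElement.childNodes
    if node.nodeType == node.TEXT_NODE
)
print(repr(text))
```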

HTH,
Martin


