[XML-SIG] Writing SAX-drivers

Alexandre Fayolle Alexandre.Fayolle@logilab.fr
Fri, 9 Nov 2001 11:09:32 +0100 (CET)


On Fri, 9 Nov 2001, M.-A. Lemburg wrote:

> I haven't been able to find any documentation on writing a SAX driver
> for PyXML -- is there any? (I've looked at the code in PyXML and 
> the PyXML docs, but the code is undocumented and the docs only 
> mention *using* SAX interface compatible parsers.)
> 
> Or does PyXML simply use a well-defined standard for drivers
> which I can look up on some website?
> 
> Thanks for any pointers,

For very simple parsers (they parse non-XML data, but that is beside the
point), you may want to take a look at pypasax (a SAX parser for Python code,
http://www.logilab.org/pypasax) and vcalsax (a SAX parser for VCAL files,
http://www.logilab.org/vcalsax).

At the very least, the parser should inherit from the xml.sax.saxlib.XMLReader
class (actually defined in xml.sax.xmlreader). It should provide an
implementation of the parse(source) method. Other methods of note are the
setProperty() and getProperty() methods.

The parse method should call self._cont_handler.setDocumentLocator()
before doing anything else (or omit the call altogether if it does not want
to, or cannot, provide a locator object). Then, during the parse, it can call
the callbacks of the various registered handlers, most notably
startDocument(), endDocument(), startElementNS(), endElementNS() and
characters().
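
As a rough sketch of the above: the following assumes the xml.sax package
from the Python standard library rather than PyXML's xml.sax.saxlib, and
the KeyValueReader class, its "key=value" input format, the Recorder
handler and the demo() helper are all made up for illustration. It does
not provide a locator, so the setDocumentLocator() call is omitted.

```python
import io
from xml.sax import xmlreader
from xml.sax.handler import ContentHandler


class KeyValueReader(xmlreader.XMLReader):
    """Hypothetical driver: turns 'key=value' lines into SAX events."""

    def parse(self, source):
        # Assumption: source is a file-like object yielding text lines,
        # or an InputSource wrapping such a character stream.
        if isinstance(source, xmlreader.InputSource):
            stream = source.getCharacterStream()
        else:
            stream = source

        handler = self._cont_handler  # set by setContentHandler()
        no_attrs = xmlreader.AttributesNSImpl({}, {})

        # No locator is available here, so setDocumentLocator() is skipped.
        handler.startDocument()
        handler.startElementNS((None, "config"), "config", no_attrs)
        for line in stream:
            line = line.strip()
            if not line or "=" not in line:
                continue  # silently ignore lines that are not key=value
            key, value = (part.strip() for part in line.split("=", 1))
            handler.startElementNS((None, key), key, no_attrs)
            handler.characters(value)
            handler.endElementNS((None, key), key)
        handler.endElementNS((None, "config"), "config")
        handler.endDocument()


class Recorder(ContentHandler):
    """Hypothetical handler that records the callbacks it receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append(("startDocument",))

    def endDocument(self):
        self.events.append(("endDocument",))

    def startElementNS(self, name, qname, attrs):
        self.events.append(("startElementNS", qname))

    def endElementNS(self, name, qname):
        self.events.append(("endElementNS", qname))

    def characters(self, content):
        self.events.append(("characters", content))


def demo(text="a=1\nb=2\n"):
    """Run the driver over some text and return the recorded events."""
    reader = KeyValueReader()
    recorder = Recorder()
    reader.setContentHandler(recorder)
    reader.parse(io.StringIO(text))
    return recorder.events
```

Calling demo() returns the list of recorded callbacks in call order,
which is a quick way to check that startDocument() comes first,
endDocument() comes last, and every start event has a matching end event.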

For comprehensive documentation on the order of calls, you may want to
take a look at the javadoc documentation of SAX2, available at
http://sax.sourceforge.net/apidoc/org/xml/sax/XMLReader.html. You'll need
to double-check for pythonisms in the xml.sax.saxlib module, which provides
nice default implementations for a number of interfaces required by
XMLReader, such as Attributes and InputSource.

Alexandre Fayolle
-- 
LOGILAB, Paris (France).
http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org
Narval, the first software agent available as free software (GPL).