[lxml-dev] Custom element class lookup mechanisms

July 24, 2006

Hi all,

as I was working on the C-API anyway (capi branch), I decided to add a little
external module with different ways of determining the Python element class
for a libxml2 node. The "lxml.elements.classlookup" module currently
implements three different ways of doing this:

* ElementDefaultClassLookup always uses the default class
* ElementNamespaceClassLookup is the default namespace lookup mechanism
* AttributeBasedElementClassLookup determines the class by looking up the
value of a specific attribute in a dict. It falls back to the default classes.

Other ways are of course possible, so if anyone has an idea what to add, I'm
open to suggestions.

An example usage is this:

    from lxml.elements import classlookup
    classlookup.setElementClassLookup(
        classlookup.ElementDefaultClassLookup())

It registers the mechanism that always uses the default class for elements,
comments and PIs (yes, I implemented that, too). This disables the namespace
class lookup and thus speeds up the plain element object creation by up to 10%.

Example usage for attribute based lookup:

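    from lxml import etree
    from lxml.elements import classlookup

    # IntElement and StrElement are custom element classes defined elsewhere
    mydict = {'int' : IntElement, 'str' : StrElement}
    classlookup.setElementClassLookup(
        classlookup.AttributeBasedElementClassLookup('pytype', mydict))

    root = etree.XML('<x><a pytype="int">5</a><b pytype="str">test</b></x>')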

Internally, the lookup function is registered through the public C-API function
"setElementClassLookupFunction()" and must be implemented in Pyrex (or C). The
lookup function takes two arguments: a Python object and the xmlNode*. The
object can hold state, such as the attribute name and class dict in the
AttributeBasedElementClassLookup case. It is registered together with the
lookup function, passed as the first argument on each call, and otherwise
ignored by lxml.

The return value of the lookup function is a callable Python object (typically
a subtype of _Element) that returns an element instance.
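
To make the calling convention a bit more concrete, here is a rough pure-Python
model of it. This is only a sketch and not lxml code: the real lookup function
lives in Pyrex or C and receives an xmlNode*, whereas this model uses the
standard library's xml.etree.ElementTree nodes. The names LookupRegistry,
DefaultElement, attribute_lookup and the "value" property are all made up for
illustration.

```python
import xml.etree.ElementTree as ET


class DefaultElement:
    """Hypothetical stand-in for the default element class."""

    def __init__(self, node):
        self.node = node


class IntElement(DefaultElement):
    @property
    def value(self):
        return int(self.node.text)


class StrElement(DefaultElement):
    @property
    def value(self):
        return self.node.text


def attribute_lookup(state, node):
    """Lookup function: gets the registered state object and the node,
    returns a callable (here a class) that creates the element object."""
    attribute_name, class_map = state
    # Fall back to the default class if the attribute is missing or unknown.
    return class_map.get(node.get(attribute_name), DefaultElement)


class LookupRegistry:
    """Hypothetical model of the registration side of the mechanism."""

    def __init__(self):
        # Until something is registered, always use the default class.
        self.function = lambda state, node: DefaultElement
        self.state = None

    def register(self, function, state):
        # The state object is stored as-is and never inspected here.
        self.function = function
        self.state = state

    def wrap(self, node):
        # State goes in as the first argument on every call.
        element_class = self.function(self.state, node)
        return element_class(node)


registry = LookupRegistry()
registry.register(attribute_lookup,
                  ('pytype', {'int': IntElement, 'str': StrElement}))

root = ET.fromstring('<x><a pytype="int">5</a><b pytype="str">test</b></x>')
wrapped = [registry.wrap(child) for child in root]
```

The registry only passes the state tuple through, so a different lookup
strategy can use a completely different kind of state object without any
change on the registration side.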

The C API itself is briefly described here:
http://codespeak.net/svn/lxml/branch/capi/doc/capi.txt

Hope this is useful,
Stefan
