lxml: traverse xml tree and retrieve element based on an attribute

Stefan Behnel stefan_ml at behnel.de
Sat May 30 04:00:22 EDT 2009


byron wrote:
> I am using the lxml.etree library to validate an xml instance file
> with a specified schema that contains the data types of each element.
> This is some of the internals of a function that extracts the
> elements:
> 
>         schema_doc = etree.parse(schema_fn)
>         schema = etree.XMLSchema(schema_doc)
> 
>         context = etree.iterparse(xml_fn, events=('start', 'end'),
>                                   schema=schema)
> 
>         # get root
>         event, root = context.next()
> 
>         for event, elem in context:
>             if event == 'end' and elem.tag == self.tag:
>                 yield elem
>             root.clear()

Note that you should not modify the root element while iterparse() is
still running in lxml.etree. The root.clear() call seems to work for you
here, but it's not safe. Here's a better way to do this:

http://www.ibm.com/developerworks/xml/library/x-hiperfparse/#N100FF
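
For readers who cannot reach the link, the general idea can be sketched
roughly as follows: instead of clearing the root, clear each element once
you are done with it and drop the already-processed siblings that precede
it. This is only a minimal sketch, not a verbatim copy of the article's
code. The function name iter_matching and its parameters are made up for
illustration, and it assumes the caller is finished with each element
before asking for the next one.

```python
from lxml import etree

def iter_matching(source, tag, schema=None):
    """Yield each element named `tag`, freeing memory along the way.

    `source` is a filename or file-like object. `schema` is an optional
    etree.XMLSchema used to validate while parsing. (Hypothetical helper,
    names chosen for illustration only.)
    """
    # Only 'end' events for the wanted tag are needed; the root is never touched.
    context = etree.iterparse(source, events=('end',), tag=tag, schema=schema)
    for event, elem in context:
        yield elem
        # Execution resumes here when the caller asks for the next element,
        # so the current one is assumed to be no longer needed.
        elem.clear()
        parent = elem.getparent()
        if parent is not None:
            # Remove earlier siblings that the parent still references.
            while elem.getprevious() is not None:
                del parent[0]
    del context
```

Because each element is cleared as soon as the generator resumes, copy out
whatever you need (attributes, text) inside the loop body rather than
keeping the elements around in a list.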

Stefan


