[XML-SIG] Replicating DTD information using XMLFilterBase and XMLGenerator

Stefan Behnel stefan_ml at behnel.de
Tue Jul 29 07:42:03 CEST 2008


Hi,

James Sulak wrote:
> Thanks, Stefan, for pointing me to lxml.  It looks like a good
> alternative to SAX in this situation.  However, I'm a little confused
> as to the best way to remove elements from the tree while keeping
> their tail text.  This is what I have so far:
> 
> context = etree.iterparse("test.xml")
> 
> for event, element in context:
>     for title in element.xpath("child::title"):

It's likely faster to use

    for title in element.iterchildren("title"):

here.

>         element.remove(title)
> 
> Do I need to explicitly assign the tail text to either the parent or
> the preceding sibling?

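Yes. The tail text is part of the Element object, so removing the
element removes its tail text as well. To keep it, append it to the
tail of the preceding sibling, or to the text of the parent if there
is no preceding sibling, before you remove the element. Take a look
at the "drop_tree" and "drop_tag" methods in lxml.html.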

http://codespeak.net/svn/lxml/trunk/src/lxml/html/__init__.py
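
For illustration, here is a minimal sketch of that approach. It is not
the code from lxml.html, just the same general idea; the helper name
"remove_keeping_tail" and the sample document are made up for this
example.

```python
from io import BytesIO

from lxml import etree


def remove_keeping_tail(element):
    """Remove `element` from its parent but keep its tail text in the tree.

    Hypothetical helper for illustration, not part of lxml.
    """
    parent = element.getparent()
    if parent is None:
        return  # nothing to detach from
    if element.tail:
        previous = element.getprevious()
        if previous is not None:
            # Append the tail to the preceding sibling's tail.
            previous.tail = (previous.tail or "") + element.tail
        else:
            # No preceding sibling: the tail becomes part of the parent's text.
            parent.text = (parent.text or "") + element.tail
    parent.remove(element)


# Made-up sample input standing in for "test.xml".
xml = b"<doc><section><title>Heading</title>Body text.</section></doc>"

context = etree.iterparse(BytesIO(xml))
for event, element in context:
    # Copy the children into a list first, since we modify the tree
    # while looping over it.
    for title in list(element.iterchildren("title")):
        remove_keeping_tail(title)

result = etree.tostring(context.root)
```

With this input, "Body text." ends up as the text of <section> once
the <title> is gone, rather than disappearing with it.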

Stefan

