lxml removing tag, keeping text order
Stefan Behnel
stefan_ml at behnel.de
Sat Oct 25 05:21:39 EDT 2008
Tim Arnold wrote:
> Hi,
> I'm using lxml to clean up auto-generated XML so that it validates against a DTD; I need
> to remove an element tag but keep the text in order. For example:
> s0 = '''
> <option>
> <optional> first text
> <someelement>ladida</someelement>
> <emphasis>emphasized text</emphasis>
> middle text
> <anotherelement/>
> last text
> </optional>
> </option>'''
>
> I want to get rid of the <emphasis> tag but keep everything else as it is;
> that is, I need this result:
>
> <option>
> <optional> first text
> <someelement>ladida</someelement>
> emphasized text
> middle text
> <anotherelement/>
> last text
> </optional>
> </option>
There's a drop_tag() method in lxml.html (lxml/html/__init__.py) that does
what you want. Just copy the code over to your code base and adapt it as needed.
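
As a rough illustration, an adaptation for plain lxml.etree elements might look
like the sketch below. It follows the logic of drop_tag() in lxml.html but is
not a verbatim copy; the standalone function form and the helper name
drop_all() are assumptions made for this example.

```python
from lxml import etree


def drop_tag(el):
    """Remove el's tag but keep its text, children, and tail in document order.

    Sketch modelled on lxml.html's drop_tag(); not the library's own code.
    """
    parent = el.getparent()
    if parent is None:
        raise ValueError("cannot drop the root element")
    previous = el.getprevious()

    # The element's leading text joins whatever precedes it: the previous
    # sibling's tail, or the parent's text if there is no previous sibling.
    # Comments and processing instructions have a non-string tag; skip them.
    if el.text and isinstance(el.tag, str):
        if previous is None:
            parent.text = (parent.text or "") + el.text
        else:
            previous.tail = (previous.tail or "") + el.text

    # The element's tail follows its last child if it has children,
    # otherwise it follows the text that was just moved.
    if el.tail:
        if len(el):
            last = el[-1]
            last.tail = (last.tail or "") + el.tail
        elif previous is None:
            parent.text = (parent.text or "") + el.tail
        else:
            previous.tail = (previous.tail or "") + el.tail

    # Replace the element with its own children (possibly none).
    index = parent.index(el)
    parent[index:index + 1] = el[:]


def drop_all(root, tag):
    """Hypothetical helper: drop every non-root element named `tag`."""
    # Materialise the iterator first, since the tree is modified in the loop.
    for el in list(root.iter(tag)):
        if el.getparent() is not None:
            drop_tag(el)


# The document from the original question.
s0 = '''
<option>
<optional> first text
<someelement>ladida</someelement>
<emphasis>emphasized text</emphasis>
middle text
<anotherelement/>
last text
</optional>
</option>'''

root = etree.fromstring(s0)
drop_all(root, "emphasis")
result = etree.tostring(root, encoding="unicode")
print(result)
```

Later lxml releases also provide etree.strip_tags(root, 'emphasis'), which
merges text and children into the parent in a similar way, so it is worth
checking whether your installed version has it before copying any code.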
Stefan