Dec. 2, 2008
11:23 a.m.
Ian Bicking wrote:
for el in list(doc.iter()):
    if el.tag not in ['a']:
        el.drop_tag()
I'm not 100% sure what happens if you modify the tree in place like this, though I think list() will make it work.
It will at least refuse to drop the root element. Running through list(root.iterdescendants()) instead should work, although the above code will definitely not result in a valid HTML document. If you are really only interested in a couple of tags without a meaningful structure, you should collect them in a list rather than cutting everything else out of the document (which is quite costly).

Stefan
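For illustration, collecting the elements of interest might look roughly like the sketch below. The sample markup and the helper name collect_links are made up for this example, and the stdlib fallback import is only there so the snippet runs without lxml installed; it assumes a well-formed document and relies only on iter(tag), which both libraries provide.

```python
# Sketch: gather the <a> elements instead of dropping every other tag.
# SAMPLE and collect_links are hypothetical names used for illustration.
try:
    from lxml import etree  # the library discussed in this thread
except ImportError:
    import xml.etree.ElementTree as etree  # stdlib fallback, also has iter(tag)

SAMPLE = (
    "<html><body>"
    "<p>See <a href='http://example.com/one'>one</a> and "
    "<b><a href='http://example.com/two'>two</a></b>.</p>"
    "</body></html>"
)


def collect_links(root):
    """Return all <a> elements in document order; the tree is not modified."""
    return list(root.iter("a"))


root = etree.fromstring(SAMPLE)
links = collect_links(root)
hrefs = [a.get("href") for a in links]
texts = [a.text for a in links]
```

Since nothing is removed, there is no need to snapshot the tree with list() before looping, and the original document stays intact for any later processing.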