replacing xml elements with other elements using lxml

Ultrus owntheweb at
Thu Aug 30 01:04:15 CEST 2007

I'm honored by your response.

You are correct about the bad XML. I tried to shorten the XML for
this example, since the real data contains other tags unrelated to
this issue. Based on your feedback, I was able to write the following
fully functional code using some different techniques:

from lxml import etree
from StringIO import StringIO
import random

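# <theroot> is the root element named in the question below; the
# <item>/<random> wrappers discussed there are not shown here.
sourceXml = "\
<theroot>\
 <contents>Stefan's fortune cookie:</contents>\
     <contents>You will always know love.</contents>\
     <contents>You will spend it all in one place.</contents>\
   <contents>Your life comes with a lifetime warranty.</contents>\
 <contents>The end.</contents>\
</theroot>"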

parser = etree.XMLParser(ns_clean=True, recover=True,
                         remove_blank_text=True, remove_comments=True)
tree = etree.parse(StringIO(sourceXml), parser)
xml = tree.getroot()

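def reduceRandoms(xml):
    # Iterate over a copy of the child list, since replace() mutates it.
    for elem in list(xml):
        if elem.tag == "random":
            # Swap <random> for the first child of one randomly chosen option.
            elem.getparent().replace(elem, random.choice(elem)[0])

reduceRandoms(xml)

for elem in xml:
    print elem.tag, ":", elem.text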

One challenge I face now is that I can only replace an element with a
single element. This isn't a problem if the chosen <item> element has
only one <contents> element, or just one <random> element (that case
works above). However, if an <item> element has more than one child,
such as a <contents> element followed by a <random> element (like the
children of <theroot>), only the first child is used.

Any thoughts on how to replace+append after the replaced element, or
clear+append multiple elements to the cleared position?
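
One possible direction, offered as a rough sketch rather than a
definitive answer: element children support slice assignment, so
parent[i:i+1] = [a, b, c] replaces the single element at index i with
several elements in one step. The sample XML, the function name
splice_randoms, and the rng parameter below are hypothetical, and the
<item>/<random> nesting is assumed from the description above. The
import falls back to the standard library's ElementTree, which offers
the same calls used here.

```python
import random

try:
    from lxml import etree  # the library used in this thread
except ImportError:
    # Standard-library fallback; every call below exists in both APIs.
    import xml.etree.ElementTree as etree

# Hypothetical input: the nesting is an assumption, not the real data.
SOURCE = (
    "<theroot>"
    "<item><contents>Intro</contents></item>"
    "<random>"
    "<item><contents>A1</contents><contents>A2</contents></item>"
    "<item><contents>B1</contents></item>"
    "</random>"
    "<item><contents>The end.</contents></item>"
    "</theroot>"
)


def splice_randoms(parent, rng=random):
    """Replace each <random> under parent with ALL children of one
    randomly chosen option, recursing through the whole subtree."""
    i = 0
    while i < len(parent):
        child = parent[i]
        if child.tag == "random":
            options = list(child)
            chosen = list(rng.choice(options)) if options else []
            # Slice assignment: one element out, any number in.
            parent[i:i + 1] = chosen
            # Do not advance i, so a spliced-in <random> is handled too.
        else:
            splice_randoms(child, rng)
            i += 1


root = etree.fromstring(SOURCE)
splice_randoms(root)
for elem in root.iter("contents"):
    print("%s : %s" % (elem.tag, elem.text))
```

Working by index instead of calling replace() avoids needing a single
replacement element, and an empty <random> is simply removed.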

Thanks again :)
