ElementTree oddities
Mark Thomas
mark at thomaszone.com
Mon Sep 15 14:32:16 EDT 2008
Fredrik is correct: an element's text attribute only contains the text
before its first child element. Text that follows a child's closing tag
is stored in that child's tail attribute. It is indeed rather odd. For
comparison, here's how you would do it in lxml
(http://codespeak.net/lxml/index.html), a library which supports XPath:
from lxml import etree

tree = etree.fromstring('<highlight><ref>Bar</ref>:</highlight>')
# //text() selects every text node in document order, tails included
print(' '.join(tree.xpath('//text()')))
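
For reference, here is a rough sketch of where the same pieces end up in the standard library's ElementTree. The variable names (root, ref, joined) are only illustrative, and itertext() assumes Python 2.7 or later:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<highlight><ref>Bar</ref>:</highlight>')
ref = root[0]  # the <ref> child

# root.text is None: nothing sits between <highlight> and <ref>.
# ref.text is the text inside <ref>; ref.tail is the text after </ref>.
print(repr(root.text), repr(ref.text), repr(ref.tail))

# itertext() yields text and tail values in document order.
joined = ''.join(root.itertext())
print(joined)
```

So the ":" is not lost. It just lives on the child element rather than on the parent.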