On Wed, 7 Oct 2009, Clif Swiggett wrote:

+--
| If I run the code:
|
|     test = etree.XML('<root><a/>text</root>')
|     for x in test.iter():
|         print("%s - %s"%(x.tag, x.text))
|
| I get the output:
|
|     root - None
|     a - None
|
| I expected that root.text would have been 'text' rather than none.
+--

lxml does not represent mixed content (text intermingled with
elements) in the same way that most other XML tools do.  An
element's .text holds only the text that precedes its first child;
text that follows a child's end tag is stored in that child's .tail
attribute.  I have attempted to explain this here:

    http://www.nmt.edu/tcc/help/pubs/pylxml/

The relevant section is here:

    http://www.nmt.edu/tcc/help/pubs/pylxml/etree-view.html

Here's your interactive example with the .tail attribute shown.

================================================================
Python 2.5.1 (r251:54863, Jun 15 2008, 18:24:51)
[GCC 4.3.0 20080428 (Red Hat 4.3.0-8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml import etree as et
>>> test=et.XML('<root><a/>text</root>')
>>> for x in test.iter():
...     print ( "tag='%s' text='%s' tail='%s'" %
...             (x.tag, x.text, x.tail) )
...
tag='root' text='None' tail='None'
tag='a' text='None' tail='text'
================================================================
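
If what you want is the text that sits directly inside an element,
wherever it falls relative to the children, one possible approach is
to join the element's own .text with the .tail of each child.  The
following is only a sketch: direct_text() is a hypothetical helper
name, not part of lxml, and the fallback import assumes that the
standard library's xml.etree.ElementTree, which uses the same
.text/.tail model, is an acceptable stand-in when lxml is not
installed.

```python
# Prefer lxml; fall back to the standard library, which shares the
# same .text/.tail model (assumption: either one is acceptable here).
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree


def direct_text(elem):
    """Return the text directly inside elem (hypothetical helper).

    This is elem.text (the text before the first child) followed by
    the .tail of every child (the text after each child's end tag).
    Text nested inside the children themselves is not included.
    """
    parts = [elem.text or ""]
    for child in elem:
        parts.append(child.tail or "")
    return "".join(parts)


root = etree.XML('<root><a/>text</root>')
print(direct_text(root))          # text
print("".join(root.itertext()))   # text
```

The itertext() method also walks into descendants, so it returns
more than direct_text() whenever the children contain text of their
own.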

Best regards,
John Shipman (john@nmt.edu), Applications Specialist
NM Tech Computer Center, Speare 119, Socorro, NM 87801
(505) 835-5950, http://www.nmt.edu/~john
  ``Let's go outside and commiserate with nature.''  --Dave Farber