[lxml-dev] Text attribute is None when element has text
If I run the code:

    test = etree.XML('<root><a/>text</root>')
    for x in test.iter():
        print("%s - %s"%(x.tag, x.text))

I get the output:

    root - None
    a - None

I expected that root.text would have been 'text' rather than none. However, if I flip the text and <a/> tag, then it works. E.g.

    test = etree.XML('<root>text<a/></root>')
    for x in test.iter():
        print("%s - %s"%(x.tag, x.text))

Output:

    root - text
    a - None

Anyone know why this is, or how to work around it?

Thanks!
- Clif
Clif Swiggett wrote:
Anyone know why this is, or how to work around it?
Sure, read the docs:

http://codespeak.net/lxml/tutorial.html#elements-contain-text

Stefan
On Wed, 2009-10-07 at 10:45 -0700, Clif Swiggett wrote:
Anyone know why this is, or how to work around it? Thanks! - Clif
Take a look at the tutorial, especially the "Elements contain text" section near here:

http://codespeak.net/lxml/tutorial.html#the-element-class

Hopefully that will explain how .text and .tail work for accessing text. The short answer is that your text is in test.find('a').tail, though. The itertext method may also be useful for you, depending on your use case.

--
John Krukoff <jkrukoff@ltgc.com>
Land Title Guarantee Company
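As an illustration of the short answer above, here is a minimal sketch of where the string ends up for the document from the original question. The variable names are made up for this example, and the fallback import assumes that the standard library's ElementTree exposes the same .text, .tail and itertext behaviour as lxml for this case.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the stdlib ElementTree behaves the same for .text/.tail/itertext
    import xml.etree.ElementTree as etree

root = etree.XML('<root><a/>text</root>')
a = root.find('a')

# .text only holds the text between the start tag and the first child
leading_text = root.text

# text that follows a child's end tag is stored on that child as .tail
trailing_text = a.tail

# itertext() walks .text and .tail values in document order
all_text = ''.join(root.itertext())

print(repr(leading_text), repr(trailing_text), repr(all_text))
```

If only the concatenated text matters, itertext is probably the simpler route; .tail is the one to reach for when the position of the text relative to the child elements matters.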
On Wed, 7 Oct 2009, Clif Swiggett wrote:

+--
| If I run the code:
|
| test = etree.XML('<root><a/>text</root>')
| for x in test.iter():
|     print("%s - %s"%(x.tag, x.text))
|
| I get the output:
|
| root - None
| a - None
|
| I expected that root.text would have been 'text' rather than none.
+--

lxml does not represent mixed content (text intermingled with elements) in the same way that most other XML tools do. I have attempted to explain this here:

http://www.nmt.edu/tcc/help/pubs/pylxml/

The relevant section is here:

http://www.nmt.edu/tcc/help/pubs/pylxml/etree-view.html

Here's your interactive example with the .tail attribute shown.

================================================================
Python 2.5.1 (r251:54863, Jun 15 2008, 18:24:51)
[GCC 4.3.0 20080428 (Red Hat 4.3.0-8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml import etree as et
>>> test=et.XML('<root><a/>text</root>')
>>> for x in test.iter():
...     print ( "tag='%s' text='%s' tail='%s'" %
...             (x.tag, x.text, x.tail) )
...
tag='root' text='None' tail='None'
tag='a' text='None' tail='text'
Best regards, John Shipman (john@nmt.edu), Applications Specialist, NM Tech Computer Center, Speare 119, Socorro, NM 87801, (505) 835-5950, http://www.nmt.edu/~john ``Let's go outside and commiserate with nature.'' --Dave Farber
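For readers who want an element's mixed content back in document order, the .text/.tail model described above can be walked by hand. The following is only a sketch under that model: mixed_content and own_text are hypothetical helper names invented for this example, not part of lxml, and the fallback import assumes the standard library's ElementTree stores text the same way.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the stdlib ElementTree uses the same .text/.tail model
    import xml.etree.ElementTree as etree


def mixed_content(element):
    """Return the element's direct content in document order.

    The result interleaves text strings and child elements.
    (Hypothetical helper, not an lxml API.)
    """
    parts = []
    if element.text:
        parts.append(element.text)      # text before the first child
    for child in element:
        parts.append(child)             # the child element itself
        if child.tail:
            parts.append(child.tail)    # text following that child
    return parts


def own_text(element):
    """Join only the text that belongs directly to this element."""
    return ''.join(p for p in mixed_content(element) if isinstance(p, str))


root = etree.XML('<root>one<a>inner</a>two<b/>three</root>')
layout = [p if isinstance(p, str) else p.tag for p in mixed_content(root)]
print(layout)
print(own_text(root))
```

Unlike itertext, own_text skips text nested inside the children ('inner' in this example), which may or may not be what a given use case needs.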
participants (4)
- Clif Swiggett
- John Krukoff
- John W. Shipman
- Stefan Behnel