lxml/ElementTree and .tail

Stefan Behnel stefan.behnel-n05pAM at web.de
Thu Nov 16 02:51:35 EST 2006


Hi,

Chas Emerick wrote:
> I looked around for an ElementTree-specific mailing list, but found none
> -- my apologies if this is too broad a forum for this question.

The lxml mailing list is always happy to receive feedback, but it's fine to
ask here if it's not lxml specific.


> I've been using the lxml variant of the ElementTree API.
> It shares the use of a .tail attribute.  I
> ran headlong into this aspect of the API while doing some DOM
> manipulations, and it's got me pretty confused.
> 
> Example:
> 
>>>> from lxml import etree as ET
>>>> frag = ET.XML('<a>head<b>inside</b>tail</a>')
>>>> b = frag.xpath('//b')[0]
>>>> b
> <Element b at 71cbe8>
>>>> b.text
> 'inside'
>>>> b.tail
> 'tail'
>>>> frag.remove(b)
>>>> ET.tostring(frag)
> '<a>head</a>'
> 
> As you can see, the .tail text is removed as part of the <b> element --
> but it IS NOT part of the <b> element.

Yes, it is. Just look at the API. It's an attribute of an Element, isn't it?
What other API do you know where removing an element from a data structure
leaves part of the element behind?

If you want to copy part of the removed element back into the tree, feel free
to do so.
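
A minimal sketch of what that could look like, assuming the stdlib
xml.etree.ElementTree (which treats .tail the same way in this case, so it
should carry over to lxml.etree). The helper name remove_keep_tail is made up
for illustration and is not part of either library:

```python
import xml.etree.ElementTree as ET


def remove_keep_tail(parent, child):
    """Remove child from parent, but leave child.tail behind in the tree.

    Hypothetical helper: the tail text is appended to whatever precedes the
    child, i.e. the previous sibling's .tail or the parent's .text.
    """
    tail = child.tail
    if tail:
        children = list(parent)
        index = children.index(child)
        if index == 0:
            # No previous sibling: the text before child lives in parent.text
            parent.text = (parent.text or "") + tail
        else:
            previous = children[index - 1]
            previous.tail = (previous.tail or "") + tail
    parent.remove(child)
    # The removed element no longer carries the text we kept in the tree
    child.tail = None


frag = ET.XML("<a>head<b>inside</b>tail</a>")
b = frag.find("b")
remove_keep_tail(frag, b)
result = ET.tostring(frag, encoding="unicode")
print(result)
```

With the fragment from the example above, this should leave "headtail" as the
text of <a> instead of dropping "tail" together with <b>.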


> Performing the same operations with the Java DOM api
> (Sorry for the Java comparison, but that's where I first cut my teeth on
> XML, and that's where my expectations were formed.)
> 
> That's a pretty significant mismatch in functionality.

IMHO, DOM has a pretty significant mismatch with Python.


> I ran this issue past a few people I know who've worked with and written
> about ElementTree, and their response to this apparent divergence
> between the ET DOM API and "standard" DOM APIs was roughly: "that's just
> the way it is".

It's just a matter of understanding (or getting used to) the API. You might
want to stop thinking in terms of '<' and '>' and rather embrace the API
itself as a way to work with the XML Infoset (rather than the XML DOM).
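
To illustrate the model, here is a small sketch (again assuming the stdlib
xml.etree.ElementTree rather than lxml) that lists what each Element in the
fragment from the example actually holds. The variable names are only for
illustration:

```python
import xml.etree.ElementTree as ET

frag = ET.XML("<a>head<b>inside</b>tail</a>")

# Each Element owns its tag, the text before its first child (.text)
# and the text that follows its end tag (.tail).
rows = [(el.tag, el.text, el.tail) for el in frag.iter()]
for tag, text, tail in rows:
    print(tag, repr(text), repr(tail))

# itertext() walks .text and .tail in document order, so joining it
# recovers the complete character data of the tree.
full_text = "".join(frag.itertext())
print(full_text)
```

There are no separate text nodes here: "tail" is stored on <b>, which is why it
goes away when <b> is removed.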

Stefan



More information about the Python-list mailing list