lxml/ElementTree and .tail

Fredrik Lundh fredrik at pythonware.com
Thu Nov 16 13:25:00 CET 2006

Chas Emerick wrote:

 > might be represented as:
 > <Element a: head='', text='last'>
 >      <Element b: head='first', text='middle'>

sure, and you could use a text subtype instead that kept track of the 
elements above it, and let the elements be sequences of their siblings 
instead of their children, and perhaps stuff everything in a dictionary.
such a construct would also be able to hold the same data, and be very 
hard to use in most normal situations.

> If I'm wrong, just chalk it up to the fact that this is the first  
> time I've ever looked at the Infoset spec, and I'm simply confused.   

the Infoset spec *is* the essence of XML; if you don't realize that an 
XML document is just a serialization of a very simple data model, you're 
bound to be fighting with XML all the time.
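
(a small sketch of the "serialization of a data model" point, assuming the standard library's xml.etree.ElementTree; the two documents and the names doc_one, doc_two and same_model are made up for illustration.  the byte sequences differ in quoting, attribute order, whitespace and empty-element syntax, but they parse to the same element.)

```python
import xml.etree.ElementTree as ET

# two different serializations (hypothetical sample data)
doc_one = '<item id="1" kind="x"/>'
doc_two = "<item  kind='x'\n id='1'></item>"

one = ET.fromstring(doc_one)
two = ET.fromstring(doc_two)

# the parsed model ignores quote style, attribute order,
# whitespace inside the tag, and <x/> versus <x></x>
same_model = (one.tag == two.tag) and (one.attrib == two.attrib)
print(same_model)
```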

but ET doesn't implement the Infoset spec as it is, of course: it uses a 
*simplified* model, carefully optimized for the large percentage of all 
XML formats that simply don't use mixed content.  if you're doing 
document-style processing, you sometimes need to add an extra assignment 
or two, but unless you're doing *only* document-style processing, ET's 
API gives you a net win.  (and even if you're doing only document-style 
processing, ET's speed and memory footprint give you a net win over 
most competing technologies).
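
(a minimal sketch of what the simplified model looks like with mixed content, assuming the standard library's xml.etree.ElementTree and a made-up sample document.  text before the first child lives in the parent's .text, and text after an element lives in that element's .tail.  the remove_keep_tail helper is hypothetical, not part of ET; it just shows the kind of "extra assignment or two" that document-style processing can need.)

```python
import xml.etree.ElementTree as ET

# hypothetical mixed-content document
root = ET.fromstring("<a>first<b>middle</b>last</a>")
b = root[0]

# where each piece of character data ends up
pieces = (root.text, b.text, b.tail)   # 'first', 'middle', 'last'


def remove_keep_tail(parent, child):
    """remove child from parent, but keep the text that followed it."""
    index = list(parent).index(child)
    if child.tail:
        if index == 0:
            # no earlier sibling: the tail joins the parent's leading text
            parent.text = (parent.text or "") + child.tail
        else:
            # otherwise it joins the tail of the previous sibling
            prev = parent[index - 1]
            prev.tail = (prev.tail or "") + child.tail
    parent.remove(child)


remove_keep_tail(root, b)
result = ET.tostring(root, encoding="unicode")
print(result)
```

(a plain root.remove(b) would have dropped "last" along with the element, since the tail travels with the element it follows.)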

