lxml/ElementTree and .tail
fredrik at pythonware.com
Thu Nov 16 13:25:00 CET 2006
Chas Emerick wrote:
> might be represented as:
> <Element a: head='', text='last'>
> <Element b: head='first', text='middle'>
sure, and you could use a text subtype instead that kept track of the
elements above it, and let the elements be sequences of their siblings
instead of their children, and perhaps stuff everything in a dictionary.
such a construct would also be able to hold the same data, and be very
hard to use in most normal situations.
> If I'm wrong, just chalk it up to the fact that this is the first
> time I've ever looked at the Infoset spec, and I'm simply confused.
the Infoset spec *is* the essence of XML; if you don't realize that an
XML document is just a serialization of a very simple data model, you're
bound to be fighting with XML all the time.
but ET doesn't implement the Infoset spec as it is, of course: it uses a
*simplified* model, carefully optimized for the large percentage of all
XML formats that simply don't use mixed content. if you're doing
document-style processing, you sometimes need to add an extra assignment
or two, but unless you're doing *only* document-style processing, ET's
API gives you a net win. (and even if you're doing only document-style
processing, ET's speed and memory footprint give you a net win over
most competing technologies).
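
as a rough illustration of what the "extra assignment or two" looks like in practice, here is a minimal sketch using the standard library's xml.etree.ElementTree. the sample document and the variable names are assumptions made up for this example; they are not taken from the thread.

```python
import xml.etree.ElementTree as ET

# hypothetical mixed-content sample: text before, inside, and after <b>
root = ET.fromstring("<a>first<b>middle</b>last</a>")
b = root[0]

# text before the first child lives on the parent's .text,
# text inside <b> on b.text, and text following </b> on b.tail
print(repr(root.text), repr(b.text), repr(b.tail))

# building the same tree by hand: the .tail line is the extra
# assignment that document-style (mixed content) data needs
a = ET.Element("a")
a.text = "first"
b2 = ET.SubElement(a, "b")
b2.text = "middle"
b2.tail = "last"

serialized = ET.tostring(a, encoding="unicode")
print(serialized)
```

for formats without mixed content, the .tail assignment simply never comes up, and the element tree reads like a plain nested data structure.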