lxml/ElementTree and .tail

Fredrik Lundh fredrik at pythonware.com
Sat Nov 18 11:09:07 CET 2006

Uche Ogbuji wrote:

> I certainly have never liked the aspects of the ElementTree API under
> present discussion.  But that's not as important as the fact that I
> think the above statement is misleading.  There has always been a
> battle in XML between the people who think the serialization is
> preeminent, and those who believe some data model is preeminent, but
> the reality is that XML 1.0 (and 1.1) is a spec *defined* by its
> serialization.

sure, the computing world is and has always been full of people who want 
the simplest thing to look a lot harder than it actually is.  after all, 
*they* spent lots of time reading all the specifications, they've bought 
all the books, and went to all the seminars, so it's simply not fair 
when others are cheating.

in reality, *all* interchange formats are easier to understand and use 
if you focus on a (complete or intentionally simplified) data model of 
the things being interchanged, and treat various artifacts of the 
byte-stream used by the wire format as artifacts, historical accidents 
based on what specification happened to be written before the other, or 
what some guy did or did not do in the seventies, as accidents, and 
esoteric arcana disseminated on limited-distribution mailing lists as 
about as relevant for your customer as last week's episode of American Idol.

(XML is a bit unusual in this respect, but that's probably just some 
variation of the bikeshed effect.  it's just text, and everyone with
a keyboard knows what that is, so we don't need to use established 
software engineering practices, or think about security *at all* 
(Billion laughs? XXE?) or, for that matter, learn from people who've
been doing data interchange in other domains since the dawn of time. 
and when they do appear anyway, and mess with our technology in ways 
that we haven't authorized, without reading our books or going to our 
seminars or subscribing to our mailing lists, we can write them off as 
"clueless muppet teenage genius code-jockeys", and keep patting ourselves
on the back, while the rest of the world is busy routing around 
us, switching to well-understood XML subsets or other serialization 
formats, simpler and more flexible data models, simpler APIs, and
more robust code.  and Python ;-)

