XML

Alan Kennedy alanmk at hotmail.com
Tue Jun 24 13:05:27 CEST 2003


"A.M. Kuchling" wrote:

> I've come to the conclusion that the initial concern for supporting
> APIs such as SAX and DOM in Python was a mistake; many bugs stem
> from trying to support interfaces that don't map to Python very well.
> Instead we should have made nice Pythonic interfaces such as effbot's
> ElementTree which are simpler to implement and to use, and ignore the
> W3C's APIs.

I must say that I agree, at least partly, with these statements.

I'm writing a tree processor at the moment, which works on DOM Level 2
trees. The DOM 2 API is painful, especially for managing attributes (and
especially once you throw namespaces into the mix). It is almost
"anti-pythonic", in that it forces a foreign style on you for
discovering and iterating over tree nodes.
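
To make the complaint concrete, here is a small sketch using the standard
library's xml.dom.minidom. The sample document, the "urn:example:x"
namespace and the variable names are made up for illustration; only the
DOM calls themselves (attributes, length, item, getAttributeNS) come from
the DOM API.

```python
from xml.dom import minidom

# Hypothetical sample document with one plain and one namespaced attribute.
XML = '<doc xmlns:x="urn:example:x" id="a1" x:role="demo"/>'

doc = minidom.parseString(XML)
elem = doc.documentElement

# DOM style: attributes live in a NamedNodeMap that is walked by index,
# and each Attr node carries its namespace URI and local name separately.
attrs = elem.attributes
dom_pairs = []
for i in range(attrs.length):
    attr = attrs.item(i)
    dom_pairs.append((attr.namespaceURI, attr.localName, attr.value))

# Namespaced lookup needs the full URI plus the local name.
role = elem.getAttributeNS("urn:example:x", "role")
plain_id = elem.getAttribute("id")

print(dom_pairs)
print(role, plain_id)
```

Compare the index-based loop and the (namespace URI, local name) pairs
with what a dict-like attribute mapping would look like in ordinary
Python code.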

But the other side of the coin is that, when my processor is finished,
it'll work with every DOM implementation there is: cDomlette, minidom,
MSXML DOM + win32all, pDomlette (does it still exist?), 4DOM, libxml,
etc. And it will work with all the Java DOMs, which gives an enormous
range of choice. So I'll be able to run my code under Python or
Jython, using any one of dozens of DOM implementations, which I can
choose based on features, or performance, or whatever.
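
As a rough illustration of what "works with every DOM" means in
practice, the sketch below restricts itself to DOM Core members
(nodeType, childNodes, length, item). The function name count_elements
and the sample document are hypothetical, and it is only exercised
against minidom here; running it unchanged on other implementations is
the hoped-for property, not something this snippet demonstrates.

```python
from xml.dom import Node, minidom


def count_elements(node):
    """Count element nodes at or below `node` using only DOM Core calls."""
    total = 1 if node.nodeType == Node.ELEMENT_NODE else 0
    children = node.childNodes
    # Index-based access (length/item) rather than Python iteration,
    # so nothing here depends on minidom-specific conveniences.
    for i in range(children.length):
        total += count_elements(children.item(i))
    return total


doc = minidom.parseString("<a><b/><c><d/></c>text</a>")
print(count_elements(doc))
```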

The latter is an important point, because it is very often the case
that more than just node manipulation is required. For example, I
might need to access my DOM with XPath expressions, or XPointer. When
selecting a DOM implementation, I just need to pick one that supports
XPath. (Indeed, the standard nature of XML object models is the basis
for such excellent products as the Jaxen engine, which is a
DOM-independent XPath processor for Java.)

Although pythonic APIs like ElementTree are obviously an excellent
solution for processing XML in Python, what I would prefer to see is
some form of interface API where you could make pythonic calls against
a standard DOM, which then get turned into actual standard DOM method
calls. However, I can't see how this would work without a lot of data
copying and inefficiency (thus piling more time inefficiency on top of
the DOM's existing time and memory inefficiencies).
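
For what it's worth, one possible shape for such an interface is a thin
wrapper that forwards pythonic calls to the underlying DOM node on
demand. This is only a sketch: the class name PyElement and its
dict-style attribute access are invented for illustration, and it is
tried against minidom only. It avoids copying the tree, but it still
allocates a wrapper object for every node visited, so it does not make
the efficiency worry go away.

```python
from xml.dom import Node, minidom


class PyElement:
    """Hypothetical thin wrapper: pythonic calls forwarded to DOM methods."""

    __slots__ = ("_node",)

    def __init__(self, node):
        self._node = node  # the real DOM element; nothing is copied

    @property
    def tag(self):
        return self._node.tagName

    def __iter__(self):
        # Wrap child elements lazily, one at a time, as they are visited.
        children = self._node.childNodes
        for i in range(children.length):
            child = children.item(i)
            if child.nodeType == Node.ELEMENT_NODE:
                yield PyElement(child)

    def __getitem__(self, name):
        if not self._node.hasAttribute(name):
            raise KeyError(name)
        return self._node.getAttribute(name)

    def __setitem__(self, name, value):
        self._node.setAttribute(name, value)


doc = minidom.parseString('<root><item id="1"/><item id="2"/></root>')
root = PyElement(doc.documentElement)
ids = [child["id"] for child in root]
root["checked"] = "yes"  # becomes a setAttribute call on the DOM node
print(root.tag, ids)
```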

just my 0,02 euro.

--
alan kennedy
-----------------------------------------------------
check http headers here: http://xhaus.com/headers
email alan:              http://xhaus.com/mailto/alan




More information about the Python-list mailing list