fredrik at pythonware.com
Tue Jun 24 16:03:58 CEST 2003
Alan Kennedy wrote:
> Although pythonic APIs like elementtree are obviously an excellent
> solution for processing xml in python, what I would prefer to see is
> some form of interface API where you could make pythonic calls against
> a standard DOM, which then get turned into actual DOM standard
> method calls.
the elementtree API is really minimal; you can easily create an
elementtree-style binding for almost any hierarchial data structure
in a couple of hours.
I've done this for HDF5, Tkinter, low-overhead XML storage models
(allowing you to store in-memory trees in much less space than you
need to store them on disk), and applications specific data models
(using simple "schemas" to mutate elements into element subclasses,
so you can add arbitrary methods to your XML nodes, depending on
what they represent, but still serialize everything to XML with a
single generic operation).
> However, I can't see how this would work without a lot of data copying
> and inefficiency (thus piling time inefficiency on top of DOM time and
> memory inefficiencies).
You'll always pay a penalty when you need to convert from a non-
Python representation to a Python representation. You can improve
things by only converting data when you need it, caching converted
data in the element objects, and even caching common data values
centrally (e.g. tags and attribute names), but if you want to access
a string of characters as a Python string, you need to store those
characters in a Python string at some point.
But except from that, I don't really see a problem here, neither in
theory nor in practice (see above)
Besides, nothing is more inefficient than the current crop of Python
DOMs. Whatever you do, your program can only run faster.
More information about the Python-list