XML

Paul Boddie paul at boddie.net
Wed Jun 25 05:45:40 EDT 2003


"A.M. Kuchling" <amk at amk.ca> wrote in message news:<vu-cnZQgo5841mWjXTWcrg at speakeasy.net>...
> On 24 Jun 2003 01:52:27 -0700, 
> 	Paul Boddie <paul at boddie.net> wrote:
> > The thing is that Python and its developers quite often have to live
> > (and work) alongside other technologies; having a set of common APIs
> > is important if you consider them in that context.
> 
> In practice, there's no way to access those other technologies from Python.
> There's a Python wrapper for the Xerces DOM implementation, but I never hear
> about anyone using it; there's a wrapper for libxml2, but it has its own 
> API that's somewhat similar to ElementTree (but not as nice to use --
> someone should fix that, because libxml2 is blazingly fast).

Yes, it's very tempting to write a PyXML-style DOM API for it. Then we
can use XPath (whether it be from PyXML or 4Suite) on our documents
without having to port our source code just because some underlying
implementation detail has changed.
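
To make that concrete, here is a rough sketch of code written only against
standard DOM names (childNodes, nodeType, getElementsByTagName). I'm using
xml.dom.minidom from the standard library as a stand-in implementation; the
helper names collect_text and titles are my own invention for illustration,
not part of any package mentioned above. In principle, any implementation
exposing the same DOM interface could be substituted for it.

```python
from xml.dom import minidom, Node


def collect_text(node):
    """Concatenate all text found beneath a node, using only DOM calls."""
    parts = []
    for child in node.childNodes:
        if child.nodeType == Node.TEXT_NODE:
            parts.append(child.data)
        elif child.nodeType == Node.ELEMENT_NODE:
            # Recurse into child elements so nested markup is included.
            parts.append(collect_text(child))
    return "".join(parts)


def titles(document):
    """Return the text of every <title> element in the document."""
    return [collect_text(e) for e in document.getElementsByTagName("title")]


# minidom is only one possible provider of the Document object here.
doc = minidom.parseString(
    "<books><book><title>Dive <em>In</em></title></book>"
    "<book><title>XML</title></book></books>"
)
found = titles(doc)
print(found)
```

Nothing in collect_text or titles refers to minidom itself, which is the
point: only the line that builds the document would need to change.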

> So any DOM implementation you use will likely have been built by the Python
> world, and could have been written to a standard interface. Jython users could
> use the Python interface or use the Jython mapping of Java interfaces.

There is/was a certain amount of Java API support in PyXML that I
wouldn't mind looking more closely at with Jython. PyXML's DOM API
isn't radically different from the JAXP API, but there are advantages
in being able to play with both standards, and the marginal benefits
of PyXML's API are good enough for me at least.

> When you think about it: how useful is it that the Python DOM interface uses
> the same method names as the Java or Perl interface?  What's gained by this?

Experience that is portable between the languages/environments.
Otherwise, one could easily be left guessing about certain mechanisms.
How does one represent node names in the canonical "Pythonic" API?
Simple strings with or without prefixes which are registered elsewhere
(cf. XPath), tuples with such prefixes and local-names, tuples with
namespaces and local-names? And so on...
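
As a small illustration of how many spellings are on offer, here is a
sketch that pulls three of them out of one element using the standard DOM
attributes nodeName, prefix, localName and namespaceURI, as provided by
xml.dom.minidom. The name_forms helper, its dictionary keys and the
example.org namespace are all made up for the example.

```python
from xml.dom import minidom

# A tiny document with one namespaced element (the namespace is hypothetical).
SOURCE = '<doc xmlns:x="http://example.org/ns"><x:item/></doc>'


def name_forms(node):
    """Return three candidate representations of an element's name."""
    return {
        # A plain string carrying the prefix, as it appears in the source.
        "qualified": node.nodeName,
        # A tuple of prefix and local-name.
        "prefix_local": (node.prefix, node.localName),
        # A tuple of namespace URI and local-name.
        "namespace_local": (node.namespaceURI, node.localName),
    }


document = minidom.parseString(SOURCE)
item = document.documentElement.firstChild
forms = name_forms(item)
for label, value in forms.items():
    print(label, value)
```

Only the last form survives a change of prefix in the source document,
which is one reason the choice isn't obvious.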

[...]

> In this context, I find the existence of JDOM, a Java-centric DOM-like API, 
> to support this view.  There's even a JDOM JSR, the Java world's equivalent
> of a PEP.

Yes, there's JDOM, DOM4J, XOM and loads of other Java packages that
all have subtle incompatibilities and are only interchangeable thanks
to packages like Jaxen. It doesn't help that even Sun Microsystems mix
some of them together in their more recent, bizarre APIs (where they
could have just used the DOM in many respects anyway), meaning that
all of a sudden you're dealing with DOM4J objects which have method
names which are just slightly different from the official DOM method
names, and you find yourself asking, "Why TF did they see it as
absolutely necessary to make their interfaces different even though
the methods do practically the same thing? Is it just to be
fashionably different?"

In my opinion, one of the things that has made Java more attractive
for XML processing has been the increasing standardisation. Once JAXP
broke through, it arguably became less likely that one would come up
against code that was specifically written for a particular special
API. Moreover, one would need a very good reason to specifically
target something like DOM4J these days, in my opinion.

Anyway, those are my perspectives, but I guess they have evolved in
their own way because I unfortunately have to spend pretty much all of
my working hours not using Python.

Paul



