XML vs Python?
mikeb at mitre.org
Sun Jan 19 13:57:09 CET 2003
I agree on some of the points, but disagree on two points.
Point 1. Python structures have development and execution speed.
I agree. Python can get even better than a list
of dictionaries, using a structure which is either
a LIst or a diCTionary at each level (a "lict").
Just recursively get the children, and
notice whether the attributes are unique to
decide whether to make a list or a dictionary.
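
A minimal sketch of the "lict" idea, under stated assumptions: it uses the
standard library's xml.etree.ElementTree, the function name to_lict and the
sample document are hypothetical, and "unique" is read here as "no two sibling
children share a tag" (one possible reading of the rule above). XML attributes
are ignored to keep the sketch short.

```python
import xml.etree.ElementTree as ET

def to_lict(element):
    """Recursively convert an element into nested dicts and lists."""
    children = list(element)
    if not children:
        # Leaf element: keep its text content (empty string if none).
        return (element.text or "").strip()
    tags = [child.tag for child in children]
    if len(set(tags)) == len(tags):
        # Every child tag is unique among its siblings: use a dictionary.
        return {child.tag: to_lict(child) for child in children}
    # Some tags repeat: keep the children in document order as a list.
    return [to_lict(child) for child in children]

# Hypothetical sample document.
doc = ET.fromstring(
    "<library><name>Main</name><books>"
    "<book><title>A</title><year>2001</year></book>"
    "<book><title>B</title><year>2002</year></book>"
    "</books></library>"
)
lict = to_lict(doc)
print(lict["books"][1]["title"])
```

With this reading, the repeated book elements become a list, while the
uniquely named children of each book become dictionary keys.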
Point 2. Objects are more obvious than python dictionaries.
I disagree. The one thing in mathematics one needs
to learn in order to do topology, category theory,
abstract algebra, homology, cohomology, braids, and K-Theory
is: the arrow. An arrow from A to B means that
"for each a in set A, there is exactly one b in set B."
From this all more advanced mathematical concepts form.
The two most important of these mathematical concepts
which are based on the arrow diagram are:
- the SQL dependency model
- the Python dictionary model
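
To make the arrow definition concrete, here is a small sketch that treats a
Python dictionary as an arrow and composes two of them. The data and the helper
names is_arrow and compose are hypothetical, chosen only for illustration.

```python
# Hypothetical arrows, each represented as a dictionary.
employee_to_dept = {"ann": "sales", "bob": "sales", "cy": "ops"}
dept_to_city = {"sales": "Boston", "ops": "Denver"}

def is_arrow(mapping, domain, codomain):
    """True if mapping gives each element of domain exactly one element of codomain."""
    return set(mapping) == set(domain) and all(
        value in codomain for value in mapping.values()
    )

def compose(f, g):
    """Arrow A -> C obtained by following f: A -> B and then g: B -> C."""
    return {a: g[b] for a, b in f.items()}

employee_to_city = compose(employee_to_dept, dept_to_city)
print(employee_to_city["cy"])
```

A dictionary cannot hold two values for one key, so the "exactly one b" part of
the definition is enforced by the data structure itself. Following one arrow
and then another is just two consecutive dictionary lookups.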
In my opinion, the consecutive python dictionary references:
is easier to work with than:
And very similar abbreviations can be used for efficiency in both cases.
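
The snippets originally compared here are not preserved in this copy. As a
stand-in, and not a reconstruction of them, the sketch below contrasts the two
forms that appear in the quoted message further down: a method call on a DOM
element and consecutive dictionary references. It uses the standard library's
xml.dom.minidom; the namespace URI, the sample element, and the attrs
dictionary layout are assumptions made for illustration.

```python
from xml.dom import minidom

NS = "http://example.org/ns"  # hypothetical namespace URI
dom = minidom.parseString(
    '<item xmlns:ex="http://example.org/ns" ex:color="red"/>'
)
element = dom.documentElement

# Object-model style: ask the element through a DOM method.
via_method = element.getAttributeNS(NS, "color")

# Dictionary style: index the attributes as attrs[namespace][name].
attrs = {}
for i in range(element.attributes.length):
    attribute = element.attributes.item(i)
    attrs.setdefault(attribute.namespaceURI, {})[attribute.localName] = attribute.value
via_dict = attrs[NS]["color"]

print(via_method, via_dict)
```

Both forms can be abbreviated in the same way, for example by binding
attrs[NS] to a short local name before a loop.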
Point 3. A "standard" object model might be better.
I disagree. For example, in Java, the new JDOM
model makes it one level less deep to access the
nodes in an XML document. Since Java came up with
a new "standard" model that is easier to work with
and also faster at runtime than DOM, it follows
that current standards are not perfect. JDOM is
just one step better than DOM. There are many steps
better to be gotten.
The same in Python. We are nowhere near a good
object model, nor do RDF, SVG, or any of the
other complex applications based on XML come
close to the level of the model we will shortly
We should always be looking to improve our data
models, and an object model is just one aspect
of our data model. Even objects themselves
should be reevaluated and thrown away when
something better is discovered.
For example, bottom-up object-oriented design
replaced top-down design just about everywhere
except at the requirements level where the
objects come together to actually solve a problem.
That happened because we need to test software
to make it work, but people don't like to test,
and bottom-up requires less testing. Therefore,
the replacement for object-oriented design will
be something that requires even less testing than
Of course, the answer is arrows. The replacement
for objects will be dataflow diagrams with
their constraints expressed as mathematical
dependencies between sets (that is, arrow
diagrams). UML use cases, dataflow diagrams,
hierarchy diagrams, finite state machines,
and Sequence Diagrams are just the beginning.
Someday someone will modify UML and Object
Oriented Design by:
- adding "arrow-dependencies" strong
enough to express geometric and
- changing the finite state machines
to colored Petri nets
- combining the sequence diagrams with
- adding sprite graphics to SVG
- adding XLINK, XSQL, etc., to browsers
- permitting browsers to print multi-page diagrams
- enhancing activities to include the whole
2D hierarchy of processes-activities-tasks
annotating them with the tools used
- adding a default web rendering of data
as relational 2D tables like spreadsheets
with a SQL query in each cell, with
mouse adjustable-width cells, floating
headers, and updating just like a spreadsheet
only over the web (any other rendering
would require a few minutes of rendering
after which objects will be obsolete. This could happen
as early as 5 years from now.
> Before the XML heavyweights get in on this, I would make the following points:
> * A lists and dictionaries approach certainly has its merits:
> speed (at least according to the PyRXP people), familiarity,
> interoperability with other Python stuff.
> * Once you start to look into more advanced features, it would
> seem to me that the lists and dictionaries model approaches
> such a level of complexity that a "proper" object model would
> be better. That is because you would need to find more
> efficient ways (in terms of expression) of requesting an
> attribute with a given namespace, for example, than examining
> raw dictionaries. Certainly, it appears to me that...
> attr = element.getAttributeNS(some_namespace, some_name)
> ...is more obvious than...
> attr = element[some_namespace][some_name]
> ...especially if it's hidden in lots of heavy XML processing
> * You could write your own object model. Fredrik Lundh's
> ElementTree is like this. I'm not 100% convinced that it's
> beneficial to drop a standard object model that is widely
> understood for another which is more Pythonic, although this
> is a tradeoff that might be appropriate for some situations.