[XML-SIG] Repost: DOM notes

Stefane Fermigier fermigie@math.jussieu.fr
Tue, 21 Apr 1998 14:57:28 +0200

On Mon, Apr 20, 1998 at 12:35:19PM -0400, Andrew Kuchling wrote:
> I've written a first cut at a marshal module that converts a simple
> Python data structure to and from a simple XML representation, using
> the DOM implementation.  The code's available in the SIG archive.
> Some notes:
>         * There's one problem with xml.marshal at the moment; you
> can't pickle multiple objects to the same stream because, when you
> read the data again, the parser doesn't read one data item and stop,
> but reads them all.
>         For example, None is converted to a <none/> tag; if you pickle
> None to the same file object twice, you get <none/><none/>.  But when
> you parse this, the parser builds a tree containing both tags.  If an
> XML document must contain a single top-level element, then I think
> parsers should recognize when that top-level element has been
> completed and stop.  
>         Any thoughts on this question?  What's the correct behaviour?

I believe the parser should either parse only one element, or raise an
exception, since the standard says that there must be only one top-level
element in a document.
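
As a rough illustration of the "raise an exception" option, here is a
minimal sketch using the expat-based minidom from today's Python standard
library (an assumption: this is not the 1998 xml package discussed in this
thread). The helper parse_single() and the <stream> wrapper element are
hypothetical names made up for the example.

```python
# Sketch only: modern standard library, not the 1998 xml package.
from xml.dom import minidom
from xml.parsers.expat import ExpatError

def parse_single(text):
    """Return the root element, or None if text is not one well-formed document."""
    try:
        return minidom.parseString(text).documentElement
    except ExpatError:
        # expat rejects anything after the first top-level element
        return None

two_items = "<none/><none/>"
rejected = parse_single(two_items)   # two top-level elements: not a document

# Hypothetical workaround: wrap the whole stream in a single root element.
wrapped = parse_single("<stream>" + two_items + "</stream>")
item_names = [n.tagName for n in wrapped.childNodes
              if n.nodeType == n.ELEMENT_NODE]
```

With a wrapper element, several marshalled items can share one stream and
still form a single document; whether xml.marshal should do this is a
separate design question.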

>         * The Walker class's walk1() method isn't consistent in
> returning values.  walk() does "return self.walk1()", but walk1() 
> never returns anything; this should probably be fixed.  For
> xml.marshal, I therefore overrode the walk1() method, but I'm not sure
> that's how Walker is intended to be used.

You should probably have written your walker from scratch (see below).

>         On the other hand, unmarshalling using just startElement(),
> endElement(), and doText() would have been more complicated, so
> overriding was the easiest thing to do.

You're right. walker.py was an attempt to write a generic walker class,
but a walker can have several goals (the one in walker.py just dispatches
events, but you could also call a function for each visited node, or
for each visited node for which some condition holds, or modify the
tree, or filter nodes, ...), so this should be designed more carefully.
Unfortunately, I'm not familiar enough with the walker design pattern to
do that properly.
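
To make two of those goals concrete (calling a function for each visited
node, optionally only for nodes where some condition holds), here is one
possible sketch. It is not the design of walker.py: the function walk() and
its visit/condition parameters are hypothetical, and it uses minidom from
today's standard library rather than the DOM implementation in the SIG
archive.

```python
from xml.dom import minidom

def walk(node, visit, condition=None):
    """Depth-first walk that always returns a list of visit() results.

    visit(node) is called for every node when condition is None,
    otherwise only for nodes where condition(node) is true.
    """
    results = []
    if condition is None or condition(node):
        results.append(visit(node))
    for child in node.childNodes:
        results.extend(walk(child, visit, condition))
    return results

doc = minidom.parseString("<list><none/><int>1</int><none/></list>")

def is_element(node):
    return node.nodeType == node.ELEMENT_NODE

# Collect the tag name of every element, skipping text nodes.
names = walk(doc.documentElement, lambda n: n.tagName, is_element)
```

Because walk() always returns a list, callers get a consistent return value
without having to override anything.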



Stéfane Fermigier, Associate Professor at Université Paris 7. Tel: (office).
Mathematician, hacker, bassist.  http://www.math.jussieu.fr/~fermigie/
"In its pure form, Pascal is a toy language, suitable for teaching but not
for real programming."  Brian Kernighan.