[XML-SIG] Working with non-compliant XML utilities

Andrew Clover and-xml at doxdesk.com
Thu Dec 11 12:46:05 EST 2003


Lowell Alleman <lalleman at mfps.com> wrote:

> Certain things that the XML spec says the parser shouldn't
> care about, this utility cares about.  Things like the order of attributes

Urgh. Nasty.

Well, you could try pxdom:

  http://www.doxdesk.com/software/py/pxdom.html

A special feature of this DOM implementation is that it will maintain a
fixed order of attributes, so you can rely on the output being in the order
you want.

> and whether an empty element is written as "<a></a>" or "<a/>" need to be
> presented in a specific way.

Is it always one way or always the other, or a mix?

pxdom will use the short form where possible, unless you ask it to do
canonicalisation (using the DOM Level 3 'canonical-form' parameter).
Unfortunately if you did canonicalisation, the attribute order would be
changed. I might add a separate option as a non-standard extension to turn
off short-forms in 1.0 if anyone else would find it useful - alternatively,
hack line 4193 in version 0.9.

If you need to output short forms in some cases but not in others, that's a
bit more work. What you could do to fool the serialiser is put a Text node
of an empty string inside every element that you want to be output in the
longer form, eg.:

  element.appendChild(element.ownerDocument.createTextNode(''))

Just don't normalise it before you serialise or the empty text nodes will
disappear!

Actually, it looks like this trick works in minidom, too.
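
A minimal sketch of the trick in minidom, assuming Python 3's standard
xml.dom.minidom. The helper name force_long_form and the sample document
are made up for illustration and are not part of pxdom or minidom:

```python
from xml.dom import minidom


def force_long_form(element):
    """Append an empty Text node so the element is no longer childless.

    The serialiser only uses the short form for elements with no child
    nodes, so this makes it write <x></x> instead of <x/>.
    """
    element.appendChild(element.ownerDocument.createTextNode(''))


# Hypothetical sample document: two empty elements.
doc = minidom.parseString('<root><a/><b/></root>')
root = doc.documentElement

before = root.toxml()       # both elements in short form

# Force the long form on <b> only; <a> is left alone.
force_long_form(doc.getElementsByTagName('b')[0])
after = root.toxml()        # <b> is now written with a separate end tag

# normalize() discards empty Text nodes, which undoes the trick.
doc.normalize()
normalised = root.toxml()   # <b> is back to the short form

print(before)
print(after)
print(normalised)
```

Serialise before calling normalize(), as the last step above shows why.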

-- 
Andrew Clover
mailto:and at doxdesk.com
http://www.doxdesk.com/


