[XML-SIG] Documenting DOM bugs
Andrew Clover
and-xml at doxdesk.com
Fri Jun 13 12:17:35 EDT 2003
Reading the Python manual and XML HOWTO can lead new users into believing
xml.dom will Just Work to spec. However, this isn't really the case, and
there's no proper documentation of all the gotchas DOM applications can run
into. The PyXML bug tracker lists some of the problems but doesn't say which
versions of which packages have them; the changelog lists some bugs, but
usually rather vaguely.
Add this to the difficulty in determining exactly what version of the XML
tools is in use (most modules in the xml package do not provide any kind of
version number; those that do can change without their version number being
updated), and you've got a tricky situation for authors, especially authors
of library modules that need to be compatible across different setups.
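
Where a DOMImplementation object is available, one partial workaround is to
ask it what it claims to support rather than sniffing version numbers. A
minimal sketch, assuming a present-day Python whose xml.dom.minidom provides
getDOMImplementation(); the helper name dom_claims is hypothetical, and
hasFeature() only reports what an implementation claims, which is not the
same thing as working to spec:

```python
from xml.dom import minidom


def dom_claims(feature, version):
    """Return True if the minidom implementation claims the given DOM feature.

    Hypothetical helper: it only wraps DOMImplementation.hasFeature().
    """
    impl = minidom.getDOMImplementation()
    return bool(impl.hasFeature(feature, version))


if __name__ == "__main__":
    # Feature names and versions as used by DOM Level 2 hasFeature().
    for feature, version in [("Core", "2.0"), ("XML", "2.0")]:
        print(feature, version, dom_claims(feature, version))
```

A claim of support says nothing about the individual bugs tabulated in this
post, so it is at best a first filter before testing actual behaviour.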
For this reason I've compiled a table of support and bugs for DOM Level 2
Core/XML features that a typical DOM application might want to use. I've
tested the minidoms supplied with common Python versions and PyXML packages,
along with PyXML 4DOMs and - somewhat less thoroughly - the 4Suite Domlettes.
I believe it would be helpful to put this sort of thing (with updates)
somewhere fairly visible, to stop authors getting confused when their DOM
applications don't work.
(I haven't included any bugs to do with document types, internal subsets,
validation, attribute normalisation or defaulting, which most simple DOM
applications don't need, or any non-Core/XML feature.)
You'll notice there are still a few problems listed here in current versions
of PyXML DOMs: not mistakes as bad as those in some of the older releases,
but compliance issues which seem to be caused by deliberate design decisions.
Is there any interest in patching [clmpu] (notes c, l, m, p and u in the key
below)?
Package:                    Python      PyXML                   4Suite
Implementation:             minidom     minidom     4DOM        cD    pD FtM
Package versions:           201   222   066   080   066   080   0111  0111
                               212   23a2  071   082   071   082   10a1  10a1
Features:
namespace declarations      a  a  *  *  *  *  *  *  bc c  c  c  *  *  *  *
built-in xml namespace      de de ef *  de *  *  *  *  *  *  *  *  *  *  *
Element.namespaceURI        g  *  *  *  *  *  *  *  g  *  *  *  g  *  g  *
Element.prefix              gh gh h  h  gh h  h  h  g  *  *  *  g  *  g  *
Attr.namespaceURI           g  g  *  *  g  *  *  *  g  *  *  *  g  *  g  *
Attr.prefix                 gh gh h  h  hi h  h  h  g  *  *  *  g  *  g  *
Attr.childNodes             j  k  k  l  k  k  l  l  *  *  *  *  l  k  l  k
NodeList.item, length       -  -  *  *  -  m  m  m  *  *  *  *  -  -  -  -
NamedNodeMap.item, length   *  *  *  *  *  *  *  *  *  *  *  *  -  -  *  -
NamedNodeMap.NamedItem[NS]  -  *  *  *  *  *  *  *  *  *  *  *  -  -  -  -
Node.[previous|next]Sibling n  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *
Node.insertBefore           o  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *
createElement/Attribute     p  p  p  p  p  p  p  p  *  *  *  *  -  -  p  -
createElementNS/AttributeNS *  *  *  *  *  *  *  *  *  *  *  *  -  *  *  *
hasAttribute                -  *  *  *  *  *  *  *  q  *  *  *  r  -  *  -
getAttribute                s  *  *  *  *  *  *  *  q  *  *  *  r  -  *  -
getAttributeNode            *  *  *  *  *  *  *  *  q  *  *  *  r  -  *  -
setAttribute                s  *  *  *  *  *  *  *  q  *  *  *  -  -  -  -
setAttributeNode            *  *  *  *  *  *  *  *  *  *  *  *  -  -  -  -
removeAttribute             s  *  *  *  *  *  *  *  q  *  *  *  -  -  -  -
getAttributeNS              s  *  *  *  *  *  *  *  *  *  *  *  t  t  t  *
getAttributeNodeNS          s  *  *  *  *  *  *  *  *  *  *  *  t  t  t  *
setAttributeNS              -  *  *  *  *  *  *  *  *  *  *  *  -  *  *  *
setAttributeNodeNS          -  *  *  *  *  *  *  *  *  *  *  *  -  *  *  *
removeAttributeNS           s  *  *  *  *  *  *  *  *  *  *  *  -  t  t  *
getElementsByTagName        u  u  u  u  u  u  u  u  u  u  u  u  -  -  -  -
importNode                  -  -  -  *  -  -  *  *  *  *  *  *  -  *  -  *
cloneNode                   v  *  *  *  *  *  w  *  *  *  *  *  -  *  *  *
DocumentFragment            -  x  *  *  x  *  *  *  *  *  *  *  -  *  *  *
DOMImplementation           -  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *
EntityReference             -  -  -  -  -  -  -  -  y  y  y  y  -  -  -  -
Comment                     z  z  z  *  z  z  z  *  *  *  *  *  *  *  *  *
Key: * = ok; - = no support; a-z = bugs, described below (module to blame in
parentheses).
a. namespace declarations not included in resultant DOM.
(dom.pulldom.PullDOM.StartPrefixMapping)
b. for non-default namespace declarations, localName and prefix are the wrong
way round.
(dom.ext.__init__.SplitQName)
c. for default namespace declarations, localName and prefix are the wrong way
round.
(dom.ext.__init__.SplitQName)
d. does not split xml:... into a namespace attribute on non-root elements.
(pyexpat)
e. generates a KeyError on parsing xml:... in the documentElement
(dom.pulldom.PullDOM._current_context)
f. generates an IndexError on parsing xml:... in any element
(sax.expatreader.ExpatParser._ns_stack)
g. yields '' instead of None for null values.
h. prefix is guessed from namespaceURI, not necessarily original prefix
(consequence of using pyexpat with namespace processing on)
i. yields '' instead of None for default namespace declarations.
j. is None
(xml.dom.minidom.Attr)
k. always an empty list
(xml.dom.minidom.Attr)
l. Only read access works. In 4Suite 0.11.1's cDomlette the childNodes are
at least readonly; in other DOMs changing the children puts the DOM into
an inconsistent state.
m. only provided when running under Python 2.2 or later
(xml.dom.minidom.NodeList)
n. Changes to the hierarchy caused by nodes being moved out when inserted
into other places are not reflected in sibling pointers.
(xml.dom.minidom.insertBefore etc.)
o. insertBefore(..., None) doesn't work
(xml.dom.minidom.insertBefore)
p. Returns a node with localName set to its nodeName - localName should be
null
q. fails on documents with namespaces, you have to pass in (uri, local).
(xml.dom.minidom)
r. Never gets the attribute, always returns 0/''/None.
(Ft.Lib.cDomlettec)
s. exception raised on nonexistent attribute instead of returning a null.
(xml.dom.minidom.Node)
t. Default namespace declarations are indexed as having a localName of
''/None even though their localName is, correctly, 'xmlns'. Because
None cannot be passed as a localName in cDomlette 1.0a1, default namespace
declarations become inaccessible.
u. NodeLists returned by getElementsByTagName[NS] are not 'live'
(xml.dom.minidom.NodeList etc.)
v. raises AttributeError due to attempt to copy() a None or AttributeList
(xml.dom.minidom.Node.cloneNode)
w. fails with NameError when node is Attr, ProcessingInstruction, Comment,
DocumentFragment or Document, or, when deep==True, when childNodes tree
contains a ProcessingInstruction or Comment.
(xml.dom.minidom._clone_node)
x. insertion/replacement with fragments fails due to destructive list
iteration.
(xml.dom.minidom.Node.insertBefore, xml.dom.minidom.Node.appendChild)
y. Document.createEntityReference returns EntityReference objects without the
childNodes filled in. Entity references parsed from text are always
replaced with their content; undefined entity references always cause an
error.
z. Comment objects exist but are not included in DOMs parsed from text.
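
Several of the notes above can be checked at run time instead of being
inferred from a version number. The following is a rough sketch only,
assuming a present-day Python with xml.dom.minidom; the probe function names
and the tiny test documents are made up for illustration, and each probe
covers just one narrow case of notes g, s, o, u, x and z. A probe returns
True when the behaviour matches the DOM spec and False when the bug (or a
failure) is seen:

```python
from xml.dom import Node, minidom


def probe_null_namespace():
    """Note g: namespaceURI of an un-namespaced element should be None."""
    doc = minidom.parseString("<a/>")
    return doc.documentElement.namespaceURI is None


def probe_missing_attribute():
    """Note s: getAttribute on an absent attribute should not raise."""
    doc = minidom.parseString("<a/>")
    try:
        doc.documentElement.getAttribute("missing")
    except Exception:
        return False
    return True


def probe_insert_before_none():
    """Note o: insertBefore(node, None) should append the node."""
    doc = minidom.parseString("<a><b/></a>")
    new = doc.createElement("c")
    try:
        doc.documentElement.insertBefore(new, None)
    except Exception:
        return False
    return doc.documentElement.lastChild is new


def probe_live_nodelist():
    """Note u: a getElementsByTagName result should track later changes."""
    doc = minidom.parseString("<a><b/></a>")
    found = doc.getElementsByTagName("b")
    before = found.length
    doc.documentElement.appendChild(doc.createElement("b"))
    return found.length == before + 1


def probe_fragment_append():
    """Note x: appending a DocumentFragment should move all its children."""
    doc = minidom.parseString("<a/>")
    frag = doc.createDocumentFragment()
    frag.appendChild(doc.createElement("b"))
    frag.appendChild(doc.createElement("c"))
    try:
        doc.documentElement.appendChild(frag)
    except Exception:
        return False
    names = [child.nodeName for child in doc.documentElement.childNodes]
    return names == ["b", "c"] and not frag.hasChildNodes()


def probe_comments_parsed():
    """Note z: comments in the source text should appear in the DOM."""
    doc = minidom.parseString("<a><!--note--></a>")
    first = doc.documentElement.firstChild
    return first is not None and first.nodeType == Node.COMMENT_NODE


# Keys are the note letters from the key above.
PROBES = [
    ("g", probe_null_namespace),
    ("s", probe_missing_attribute),
    ("o", probe_insert_before_none),
    ("u", probe_live_nodelist),
    ("x", probe_fragment_append),
    ("z", probe_comments_parsed),
]


def run_probes():
    """Run every probe and return a dict mapping note letter to result."""
    return {letter: probe() for letter, probe in PROBES}


if __name__ == "__main__":
    for letter, ok in sorted(run_probes().items()):
        print(letter, "ok" if ok else "bug seen")
```

A library module could run probes like these once at import time and choose
a workaround accordingly, which sidesteps the problem of unreliable version
numbers. The same functions should work against another DOM only insofar as
it offers a minidom-compatible parseString().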
--
Andrew Clover
mailto:and at doxdesk.com
http://www.doxdesk.com/