[XML-SIG] Documenting DOM bugs

Andrew Clover and-xml at doxdesk.com
Fri Jun 13 12:17:35 EDT 2003


Reading the Python manual and XML HOWTO can lead new users into believing
xml.dom will Just Work to spec. However, this isn't really the case, and
there's no proper documentation of all the gotchas DOM applications can run
into. The PyXML bug tracker lists some of the problems but doesn't say which
versions of which packages have them; the changelog lists some bugs, but
usually rather vaguely.

Add this to the difficulty in determining exactly what version of the XML
tools is in use (most modules in the xml package do not provide any kind of
version number; those that do can change without their version number being
updated), and you've got a tricky situation for authors, especially authors
of library modules that need to be compatible across different setups.
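
Since version numbers can't be relied on, one option is to probe for
behaviour at runtime instead. What follows is only a minimal sketch under
stated assumptions: probe_dom and its result keys are hypothetical names,
not part of any xml package, and the DOM under test is assumed to offer a
minidom-style parseString(). It checks three behaviours: whether comments
survive parsing, whether getElementsByTagName returns a live NodeList, and
whether createElement leaves localName null.

```python
from xml.dom import minidom


def probe_dom(parse_string=minidom.parseString):
    """Return a dict of observed behaviours for the DOM behind parse_string.

    probe_dom and the keys of its result are hypothetical names used for
    illustration only.
    """
    doc = parse_string('<root><!-- note --><a/></root>')
    root = doc.documentElement
    results = {}

    # Are comments kept when a document is parsed from text?
    results['comments_parsed'] = any(
        child.nodeType == child.COMMENT_NODE for child in root.childNodes)

    # Does the NodeList from getElementsByTagName track later changes?
    found = root.getElementsByTagName('a')
    before = len(found)
    root.appendChild(doc.createElement('a'))
    results['live_nodelists'] = (len(found) == before + 1)

    # DOM Level 1 createElement should leave localName null (None).
    results['createelement_localname_null'] = (
        doc.createElement('b').localName is None)

    return results
```

A library module could call probe_dom() once at import time and branch on
the results, rather than guessing from whichever version number happens to
be available.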

For this reason I've compiled a table of support and bugs for DOM Level 2
Core/XML features that a typical DOM application might want to use. I've
tested the minidoms supplied with common Python versions and PyXML packages,
along with PyXML 4DOMs and - somewhat less thoroughly - the 4Suite Domlettes.
I believe it would be helpful to put this sort of thing (with updates)
somewhere fairly visible, to stop authors getting confused when their DOM
applications don't work.

(I haven't included any bugs to do with document types, internal subsets,
validation, attribute normalisation or defaulting, which most simple DOM
applications don't need, or any non-Core/XML feature.)

You'll notice there are still a few problems listed here in current versions
of PyXML DOMs: nothing as serious as the mistakes in some of the older
releases, but compliance issues that seem to stem from deliberate design
decisions. Is there any interest in patching [clmpu]?

                  Package:  Python      PyXML                   4Suite
           Implementation:  minidom     minidom     4DOM        cD    pD FtM
         Package versions:  201   222   066   080   066   080   0111  0111
                               212   23a2  071   082   071   082   10a1  10a1
Features:
namespace declarations      a  a  *  *  *  *  *  *  bc c  c  c  *  *  *  *
built-in xml namespace      de de ef *  de *  *  *  *  *  *  *  *  *  *  *
Element.namespaceURI        g  *  *  *  *  *  *  *  g  *  *  *  g  *  g  *
Element.prefix              gh gh h  h  gh h  h  h  g  *  *  *  g  *  g  *
Attr.namespaceURI           g  g  *  *  g  *  *  *  g  *  *  *  g  *  g  *
Attr.prefix                 gh gh h  h  hi h  h  h  g  *  *  *  g  *  g  *
Attr.childNodes             j  k  k  l  k  k  l  l  *  *  *  *  l  k  l  k
NodeList.item, length       -  -  *  *  -  m  m  m  *  *  *  *  -  -  -  -
NamedNodeMap.item, length   *  *  *  *  *  *  *  *  *  *  *  *  -  -  *  -
NamedNodeMap.NamedItem[NS]  -  *  *  *  *  *  *  *  *  *  *  *  -  -  -  -
Node.[previous|next]Sibling n  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *
Node.insertBefore           o  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *
createElement/Attribute     p  p  p  p  p  p  p  p  *  *  *  *  -  -  p  -
createElementNS/AttributeNS *  *  *  *  *  *  *  *  *  *  *  *  -  *  *  *
hasAttribute                -  *  *  *  *  *  *  *  q  *  *  *  r  -  *  -
getAttribute                s  *  *  *  *  *  *  *  q  *  *  *  r  -  *  -
getAttributeNode            *  *  *  *  *  *  *  *  q  *  *  *  r  -  *  -
setAttribute                s  *  *  *  *  *  *  *  q  *  *  *  -  -  -  -
setAttributeNode            *  *  *  *  *  *  *  *  *  *  *  *  -  -  -  -
removeAttribute             s  *  *  *  *  *  *  *  q  *  *  *  -  -  -  -
getAttributeNS              s  *  *  *  *  *  *  *  *  *  *  *  t  t  t  *
getAttributeNodeNS          s  *  *  *  *  *  *  *  *  *  *  *  t  t  t  *
setAttributeNS              -  *  *  *  *  *  *  *  *  *  *  *  -  *  *  *
setAttributeNodeNS          -  *  *  *  *  *  *  *  *  *  *  *  -  *  *  *
removeAttributeNS           s  *  *  *  *  *  *  *  *  *  *  *  -  t  t  *
getElementsByTagName        u  u  u  u  u  u  u  u  u  u  u  u  -  -  -  -
importNode                  -  -  -  *  -  -  *  *  *  *  *  *  -  *  -  *
cloneNode                   v  *  *  *  *  *  w  *  *  *  *  *  -  *  *  *
DocumentFragment            -  x  *  *  x  *  *  *  *  *  *  *  -  *  *  *
DOMImplementation           -  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *
EntityReference             -  -  -  -  -  -  -  -  y  y  y  y  -  -  -  -
Comment                     z  z  z  *  z  z  z  *  *  *  *  *  *  *  *  *

Key:  * = ok   - = no support   a-z = bugs, described below, with the code
      to blame in parentheses where known.

a. namespace declarations not included in resultant DOM.
   (dom.pulldom.PullDOM.StartPrefixMapping)
b. for non-default namespace declarations, localName and prefix are the wrong
   way round.
   (dom.ext.__init__.SplitQName)
c. for default namespace declarations, localName and prefix are the wrong way
   round.
   (dom.ext.__init__.SplitQName)
d. does not split xml:... into a namespace attribute on non-root elements.
   (pyexpat)
e. generates a KeyError on parsing xml:... in the documentElement
   (dom.pulldom.PullDOM._current_context)
f. generates an IndexError on parsing xml:... in any element
   (sax.expatreader.ExpatParser._ns_stack)
g. yields '' instead of None for null values.
h. prefix is guessed from namespaceURI, not necessarily original prefix
   (consequence of using pyexpat with namespace processing on)
i. yields '' instead of None for default namespace declarations.
j. is None
   (xml.dom.minidom.Attr)
k. always an empty list
   (xml.dom.minidom.Attr)
l. Only read access works. In 4Suite 0.11.1's cDomlette the childNodes are
   at least readonly; in other DOMs changing the children puts the DOM into
   an inconsistent state.
m. only provided when running under Python 2.2 or later
   (xml.dom.minidom.NodeList)
n. Changes to the hierarchy caused by nodes being moved out when inserted
   into other places are not reflected in sibling pointers.
   (xml.dom.minidom.insertBefore etc.)
o. insertBefore(..., None) doesn't work
   (xml.dom.minidom.insertBefore)
p. Returns a node with localName set to its nodeName - localName should be
   null
q. fails on documents with namespaces, you have to pass in (uri, local).
   (xml.dom.minidom)
r. Never gets the attribute, always returns 0/''/None.
   (Ft.Lib.cDomlettec)
s. exception raised on non-existent attribute instead of returning a null.
   (xml.dom.minidom.Node)
t. Default namespace declarations are indexed as having a localName of
   ''/None even though their localName is, correctly, 'xmlns'. Because
   None cannot be passed as a localName in cDomlette 1.0a1, default namespace
   declarations become inaccessible.
u. NodeLists returned by getElementsByTagName[NS] are not 'live'
   (xml.dom.minidom.NodeList etc.)
v. raises AttributeError due to attempt to copy() a None or AttributeList
   (xml.dom.minidom.Node.cloneNode)
w. fails with NameError when node is Attr, ProcessingInstruction, Comment,
   DocumentFragment or Document, or, when deep==True, when childNodes tree
   contains a ProcessingInstruction or Comment.
   (xml.dom.minidom._clone_node)
x. insertion/replacement with fragments fails due to destructive list
   iteration.
   (xml.dom.minidom.Node.insertBefore, xml.dom.minidom.Node.appendChild)
y. Document.createEntityReference returns EntityReference objects without the
   childNodes filled in. Entity references parsed from text are always
   replaced with their content; undefined entity references always cause an
   error.
z. Comment objects exist but are not included in DOMs parsed from text.
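
Some of the problems above can be worked around in application code. The
following is a minimal sketch of wrappers for notes g, o, s and x, not a
tested fix for every listed version. All four helper names are hypothetical,
and treating the exception in note s as a KeyError is an assumption; adjust
it to whatever the DOM in use actually raises.

```python
from xml.dom import minidom


def null_if_empty(value):
    """Note g: map '' to None so null namespaceURI/prefix compare equal."""
    if value == '':
        return None
    return value


def get_attribute(element, name):
    """Note s: return '' for a missing attribute instead of raising.

    Assumes the affected DOM raises KeyError; that is a guess.
    """
    try:
        return element.getAttribute(name)
    except KeyError:
        return ''


def insert_before(parent, new_child, ref_child):
    """Note o: a null ref_child means 'append at the end'."""
    if ref_child is None:
        return parent.appendChild(new_child)
    return parent.insertBefore(new_child, ref_child)


def append_fragment(parent, fragment):
    """Note x: move a fragment's children one by one.

    Iterating over a copy of childNodes avoids mutating the list that is
    being walked as each child is moved out of the fragment.
    """
    for child in list(fragment.childNodes):
        parent.appendChild(child)
```

For example, insert_before(parent, node, None) appends node as the last
child, and null_if_empty(element.namespaceURI) gives None for an element in
no namespace whichever null value the DOM reports.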

-- 
Andrew Clover
mailto:and at doxdesk.com
http://www.doxdesk.com/


