[XML-SIG] Issues with XMLTreeBuilder in cElementTree and ElementTree (Cross-post from comp.lang.python)

Michael Becker spammb at gmail.com
Thu Mar 20 23:27:09 CET 2008

An application I work with outputs XML files whose formatting does
not allow for easy editing by humans, so I was trying to write a short
Python app to pretty-print them. Most of the data in these XML
files is in the attributes, so I wanted each attribute on its own line.
I wrote a short app using xml.etree.ElementTree.XMLTreeBuilder(). To
my dismay, the attributes were getting reordered. I found that the
implementation of XMLTreeBuilder does not make proper use of the
ordered_attributes attribute of the expat parser, which it enables by
default: the constructor sets ordered_attributes = 1, but the
_start_list method then iterates through the ordered list of attributes
and stores them in a dictionary, which discards the order. This is
incredibly unintuitive and seems to me to be a bug. I would recommend
the following changes to XMLTreeBuilder and _ElementInterface:

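class XMLTreeBuilder:
    def _start_list(self, tag, attrib_in):
        fixname = self._fixname
        tag = fixname(tag)
        attrib = []
        if attrib_in:
            # attrib_in is a flat list: [name1, value1, name2, value2, ...]
            for i in range(0, len(attrib_in), 2):
                attrib.append((fixname(attrib_in[i]),
                               self._fixtext(attrib_in[i+1])))
        return self._target.start(tag, attrib)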

class _ElementInterface:

    def items(self):
        # attrib may be a dict or a list of (name, value) pairs
        try:
            return self.attrib.items()
        except AttributeError:
            return self.attrib

These changes would allow the user to take advantage of the
ordered_attributes attribute in the expat parser to use either ordered
or unordered attributes as desired. For backwards compatibility it might
be desirable to change XMLTreeBuilder to default to ordered_attributes
= 0. I've never submitted a bug fix to a Python library, so if this
seems like a real bug please let me know how to proceed.
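
To illustrate the ordering behaviour described above, here is a minimal
sketch that talks to expat directly instead of going through
XMLTreeBuilder. The helper names ordered_start_events and
format_start_tag are made up for this example and are not part of
ElementTree; the sketch only assumes the documented expat behaviour that
ordered_attributes makes the start handler receive a flat
[name, value, name, value, ...] list.

```python
import xml.parsers.expat
from xml.sax.saxutils import quoteattr


def ordered_start_events(xml_text):
    """Return (tag, [(name, value), ...]) for each start tag, in document order."""
    events = []

    def start(tag, attrs):
        # With ordered_attributes set, attrs is a flat list:
        # [name1, value1, name2, value2, ...]
        pairs = [(attrs[i], attrs[i + 1]) for i in range(0, len(attrs), 2)]
        events.append((tag, pairs))

    parser = xml.parsers.expat.ParserCreate()
    parser.ordered_attributes = 1
    parser.StartElementHandler = start
    parser.Parse(xml_text, True)
    return events


def format_start_tag(tag, pairs, indent="    "):
    """Render a start tag with each attribute on its own line (hypothetical helper)."""
    lines = ["<" + tag]
    for name, value in pairs:
        # quoteattr escapes the value and wraps it in quotes
        lines.append(indent + name + "=" + quoteattr(value))
    return "\n".join(lines) + ">"
```

Feeding each (tag, pairs) result from ordered_start_events into
format_start_tag gives one attribute per line in the order they appear
in the source, which is the output I was after.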

Secondly, I found a potential issue with the cElementTree module. My
understanding (which could be incorrect) of Python C modules is that
they should work the same as the pure-Python versions but be more
efficient. The XMLTreeBuilder class in cElementTree doesn't seem to
expose the same parser as the one in ElementTree. The following code
illustrates this issue:

>>> import xml.etree.cElementTree
>>> t1=xml.etree.cElementTree.XMLTreeBuilder()
>>> t1._parser.ordered_attributes = 1

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: _parser

>>> import xml.etree.ElementTree
>>> t1=xml.etree.ElementTree.XMLTreeBuilder()
>>> t1._parser.ordered_attributes = 1
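
Since _parser is a private attribute that only one of the two
implementations exposes, code that has to run against both could guard
the access rather than assume it. A minimal sketch of that guard follows;
enable_ordered_attributes is a hypothetical helper, not something either
module provides.

```python
def enable_ordered_attributes(builder):
    """Try to set ordered_attributes on the builder's expat parser.

    Returns True on success, False if the builder does not expose
    a `_parser` attribute (as in the cElementTree session above).
    """
    parser = getattr(builder, "_parser", None)
    if parser is None:
        return False
    parser.ordered_attributes = 1
    return True
```

A False return tells the caller that this builder offers no handle on the
underlying parser, so it can fall back to the pure-Python class.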

In case it is relevant, here is the version and environment:
tpadmin@osswlg1{/tpdata/ossgw/config} $ python -V
Python 2.5.1
tpadmin@osswlg1{/tpdata/ossgw/config} $ uname -a
SunOS localhost 5.10 Generic_118833-33 sun4u sparc SUNW,Netra-240

