[Python-Dev] Minidom and Unicode

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Sat, 1 Jul 2000 09:47:20 +0200


While trying the minidom parser from the current CVS, I found that
repr apparently does not work for nodes:

Python 2.0b1 (#29, Jun 30 2000, 10:48:11)  [GCC 2.95.2 19991024 (release)] on linux2
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
Copyright 1995-2000 Corporation for National Research Initiatives (CNRI)
>>> from xml.dom.minidom import parse
>>> d=parse("/usr/src/python/Doc/tools/sgmlconv/conversion.xml")
>>> d.childNodes
[Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: __repr__ returned non-string (type unicode)

The problem here is that __repr__ is computed as

    def __repr__( self ):
        return "<DOM Element:"+self.tagName+" at "+`id( self )` +" >"

and that self.tagName is u'conversion', so the resulting string is a
unicode string.

I'm not sure whose fault that is: either __repr__ should accept
unicode strings, or minidom.Element.__repr__ should be changed to
return a plain string, e.g. by encoding tagName as UTF-8. In any
case, I believe __repr__ should 'work' for these objects.
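
As a rough sketch of the second option, and not a proposed patch, the
conversion could look something like the following. The Element class
here is a hypothetical stand-in, not the real minidom.Element. The
isinstance check is an assumption chosen so that the snippet also runs
on interpreters where text strings are already plain str.

```python
class Element:
    """Hypothetical stand-in for minidom.Element, for illustration only."""

    def __init__(self, tagName):
        self.tagName = tagName

    def __repr__(self):
        tag = self.tagName
        # Where unicode and str are distinct types, a unicode tag name is
        # encoded to a UTF-8 byte string so that the result is a plain str.
        # Where the tag name is already a str, it is used unchanged.
        if not isinstance(tag, str):
            tag = tag.encode("utf-8")
        return "<DOM Element:" + tag + " at " + repr(id(self)) + " >"


node = Element(u"conversion")
# Printing a list calls __repr__ on each item, as d.childNodes did above.
print(repr([node]))
```

With something along these lines, displaying a list of nodes should no
longer raise the TypeError, at the cost of showing non-ASCII tag names
as raw UTF-8 bytes.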

Regards,
Martin