iterparse and unicode

George Sakkis george.sakkis at gmail.com
Thu Aug 21 00:36:10 CEST 2008


It seems xml.etree.cElementTree.iterparse() is not Unicode-aware:

>>> from StringIO import StringIO
>>> from xml.etree.cElementTree import iterparse
>>> s = u'<name>\u03a0\u03b1\u03bd\u03b1\u03b3\u03b9\u03ce\u03c4\u03b7\u03c2</name>'
>>> for event,elem in iterparse(StringIO(s)):
...     print elem.text
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 64, in __iter__
UnicodeEncodeError: 'ascii' codec can't encode characters in position 6-15: ordinal not in range(128)

Am I using it incorrectly, or does it not currently support unicode?
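
For what it's worth, here is a minimal sketch of a possible workaround. It rests on an assumption, not a confirmed diagnosis: that the error comes from the unicode string being implicitly encoded with the default ASCII codec when the parser reads it. Under that assumption, encoding the string to UTF-8 bytes before parsing avoids the implicit conversion. The sketch substitutes io.BytesIO and xml.etree.ElementTree for the StringIO and cElementTree used above, so that it also runs on newer Python versions. The variables name, data and texts are only illustrative.

```python
from io import BytesIO
from xml.etree.ElementTree import iterparse

# The same document as in the session above, built from a unicode string.
name = u'\u03a0\u03b1\u03bd\u03b1\u03b3\u03b9\u03ce\u03c4\u03b7\u03c2'
s = u'<name>' + name + u'</name>'

# Encode explicitly so the parser receives bytes rather than unicode.
# With no XML declaration present, the parser treats the input as UTF-8.
data = s.encode('utf-8')

# Collect the text of every element reported by iterparse
# (by default only "end" events are generated).
texts = [elem.text for event, elem in iterparse(BytesIO(data))]

# The decoded element text should match the original unicode name.
print(texts == [name])
```

If the document carried an XML declaration naming a different encoding, the bytes would have to be encoded to match that declaration rather than UTF-8.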

George


