[Tutor] UnicodeEncodeError

Albert-Jan Roskam fomcl at yahoo.com
Wed Nov 25 14:44:24 CET 2009

I'm parsing an XML file using ElementTree, but it seems to get stuck on certain non-ASCII characters (for example: "ê"). I'm using Python 2.4. Here's the relevant code fragment:
for element in doc.getiterator():
    try:
        m = re.match(search_text, str(element.text))
    except UnicodeEncodeError:
        raise  # I want to get rid of this exception.

    m = re.match(search_text, str(element.text))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xea' in position 4: ordinal not in range(128)
How can I get rid of this UnicodeEncodeError? I tried:
s = str(element.text)
(and then feeding it into the regex)
The xml file is in UTF-8. Somehow I need to tell the program not to use ascii but utf-8, right?
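
For illustration, the failure can probably be reproduced without the XML file. The following is only a minimal sketch: the sample string and the pattern are made up (they stand in for element.text and search_text), and it assumes the text coming out of the parser is a unicode object. In Python 2, str() on a unicode object encodes it with the default 'ascii' codec, which appears to be the step that fails.

```python
import re

# Hypothetical stand-in for element.text; u"\xea" is "ê" at position 4.
text = u"enqu\xeate"
# Hypothetical pattern, kept as a unicode string.
search_text = u"enqu.te"

# str(text) in Python 2 implicitly encodes with the 'ascii' codec;
# doing that encoding explicitly reproduces the same error.
ascii_failed = False
try:
    text.encode("ascii")
except UnicodeEncodeError:
    ascii_failed = True

# Option 1: skip str() and match against the unicode text directly.
m = re.match(search_text, text)

# Option 2: if a byte string is really needed, encode explicitly as UTF-8.
utf8_bytes = text.encode("utf-8")
```

If this sketch matches the real situation, dropping the str() call (option 1) or replacing it with an explicit .encode("utf-8") (option 2) might be enough to avoid the exception.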
Thanks in advance!


In the face of ambiguity, refuse the temptation to guess.
