[Tutor] Fw: unicode, utf-8 problem again

Dinesh B Vadhia dineshbvadhia at hotmail.com
Thu Jun 4 15:55:06 CEST 2009


I forgot to add that I'm using ElementTree to process the XML files and don't (usually) have any problems with that.  Also, the one workaround that does work is to explicitly encode each ElementTree output, i.e.:

thisxmlline = thisxmlline.encode('utf8')

But this seems odd to me: isn't the data already being processed as UTF-8?
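
A minimal sketch of why the explicit encode may be needed, assuming Python 3 and the standard-library xml.etree.ElementTree (the original setup was Python 2, where the same idea applies to unicode objects). The sample document and the names xml_bytes, text and encoded are hypothetical. The parser decodes the UTF-8 bytes into text while reading, so what comes back out is no longer UTF-8 data and has to be encoded again before it is written as bytes:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input: UTF-8 bytes containing curly quotes (U+201C, U+201D).
xml_bytes = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<doc>\u201cquoted\u201d</doc>'
).encode('utf-8')

root = ET.fromstring(xml_bytes)

# The parser has already decoded the bytes; this is text, not UTF-8 data.
text = root.text

# The workaround from the message: encode explicitly before writing bytes out.
encoded = text.encode('utf-8')
```

So the XML declaration describes how the file is stored on disk; it does not determine how the parsed strings are encoded when they are later printed or written.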

Dinesh



From: Dinesh B Vadhia 
Sent: Thursday, June 04, 2009 6:47 AM
To: tutor at python.org 
Subject: unicode, utf-8 problem again


Hi!  I'm processing a large number of XML files that are all declared as UTF-8 encoded in the header, i.e.:

<?xml version="1.0" encoding="UTF-8"?>

My Python environment has been set to 'utf-8' through site.py.  Additionally, the top of each program/module has the declaration:

# -*- coding: utf-8 -*-

But I still get this error:

Traceback (most recent call last):
...
UnicodeEncodeError: 'ascii' codec can't encode character u'\u201c' in position 76: ordinal not in range(128)

What am I missing?
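
For reference, a small sketch that reproduces this kind of error, assuming Python 3 (the traceback above is from Python 2, where the ASCII encode typically happens implicitly when a unicode object is printed or written). The names text, error and utf8_bytes are hypothetical. The coding declaration at the top of a module only tells Python how to read the source file itself; it does not change which codec is used when a string is encoded for output:

```python
# -*- coding: utf-8 -*-

# Text containing a left and right curly quote, as in the traceback (U+201C).
text = '\u201cquoted\u201d'

error = None
try:
    # Encoding with the ASCII codec fails: U+201C is outside range(128).
    text.encode('ascii')
except UnicodeEncodeError as exc:
    error = exc

# Naming the target encoding explicitly succeeds.
utf8_bytes = text.encode('utf-8')
```

Here error.encoding names the codec that failed and error.start gives the position of the offending character, which is the same information the traceback reports.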

Dinesh

