[Tutor] unicode, utf-8 problem again

Dinesh B Vadhia dineshbvadhia at hotmail.com
Thu Jun 4 18:29:29 CEST 2009


Okay, I get it now: reading and writing the files with the codecs module and the 'utf-8' encoding fixes it. Thanks!
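
For anyone finding this thread later, here is a minimal sketch of what that fix can look like. It is not the exact code from my program: the helper names (write_text, read_text) and the temporary file are made up for illustration, and it assumes the data really is UTF-8.

```python
import codecs
import os
import tempfile


def write_text(path, text):
    # codecs.open encodes the unicode text to UTF-8 bytes on write,
    # instead of falling back to the default 'ascii' codec.
    with codecs.open(path, "w", encoding="utf-8") as handle:
        handle.write(text)


def read_text(path):
    # On read, the UTF-8 bytes are decoded back into a unicode string.
    with codecs.open(path, "r", encoding="utf-8") as handle:
        return handle.read()


# Hypothetical sample containing the character from the traceback (U+201C).
sample = u"\u201cquoted\u201d"
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

write_text(path, sample)
round_tripped = read_text(path)
```

The point is that the encoding is named explicitly at the file boundary, so nothing depends on the interpreter's default.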



From: Christian Witts 
Sent: Thursday, June 04, 2009 7:05 AM
To: Dinesh B Vadhia 
Cc: tutor at python.org 
Subject: Re: [Tutor] unicode, utf-8 problem again


Dinesh B Vadhia wrote:
> Hi!  I'm processing a large number of xml files that are all declared 
> as utf-8 encoded in the header, i.e.:
>  
> <?xml version="1.0" encoding="UTF-8"?>
>  
> My Python environment has been set for 'utf-8' through site.py.  
> Additionally, the top of each program/module has the declaration:
>  
> # -*- coding: utf-8 -*-
>  
> But, I still get this error:
>  
> Traceback (most recent call last):
> ...
> UnicodeEncodeError: 'ascii' codec can't encode character u'\u201c' in 
> position 76: ordinal not in range(128)
>  
> What am I missing?
>  
> Dinesh
Hi,

Take a read through http://evanjones.ca/python-utf8.html which will give 
you insight as to how you should be reading and processing your files.
As for the encoding line "# -*- coding: utf-8 -*-", that only declares 
the character encoding of your script's source code. It says nothing 
about the encoding of the data the script will be working with.
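
As a rough illustration of where the error comes from (a sketch, not your actual code, with made-up variable names): the failure happens when a unicode string holding U+201C is encoded with the 'ascii' codec, and it goes away once the encoding is given explicitly.

```python
# The character from the traceback: LEFT DOUBLE QUOTATION MARK.
text = u"\u201c"

error = None
try:
    # This mirrors what happens implicitly when unicode text is
    # written out through the default 'ascii' codec.
    text.encode("ascii")
except UnicodeEncodeError as exc:
    error = exc

# Encoding explicitly as UTF-8 succeeds and yields three bytes.
encoded = text.encode("utf-8")
```

The coding line at the top of the script has no effect on either call; only the codec named at the point of encoding matters.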

-- 
Kind Regards,
Christian Witts

