Problem with sets and Unicode strings
Laurent Pointal
laurent.pointal at limsi.fr
Wed Jun 28 04:02:12 EDT 2006
Dennis Benzinger wrote:
> No, byte strings contain characters which are at least 8-bit wide
> <http://docs.python.org/ref/types.html>. But I don't understand what
> Python is trying to decode and why the exception says something about
> the ASCII codec, because my file is encoded with UTF-8.
[addendum to the other replies]
The file encoding directive is used by Python to convert u"xxx" literals
into unicode objects with the right conversion rules when compiling the code.
When a string is written simply as "xxx", it is an 8-bit string with NO
encoding information attached. When such a string has to be converted to
unicode, it is assumed to use sys.getdefaultencoding() [generally ascii;
forced to ascii in Python 2.5].
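
To make that concrete, here is a minimal sketch of what the implicit
conversion amounts to. The explicit decode("ascii") call stands in for what
Python 2 does silently when an 8-bit string meets a unicode one; the byte
value (the UTF-8 encoding of "é") and the variable names are just
illustrative assumptions. It is written so that it should behave the same on
Python 2.6+ and Python 3.

```python
# -*- coding: utf-8 -*-
# The raw UTF-8 bytes of "é", as a plain 8-bit string would hold them.
raw = b"\xc3\xa9"

# Stand-in for the implicit conversion with the ascii default codec:
# bytes above 0x7F cannot be decoded, so this raises UnicodeDecodeError.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Decoding explicitly with the encoding the bytes really use works.
text = raw.decode("utf-8")
same_as_unicode_literal = (text == u"\xe9")
```

This is why the exception mentions the ASCII codec even though the file
itself is UTF-8: the bytes carry no record of their encoding.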
So, the short answer: the utf8 directive has no effect on 8-bit strings;
use unicode strings to handle non-ASCII text correctly.
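
As a rough illustration of that advice applied to sets: decode the 8-bit
strings once, where they enter the program, and keep only unicode objects in
the set. The sample data and names below are hypothetical, and the sketch
assumes the input really is UTF-8.

```python
# -*- coding: utf-8 -*-
# Hypothetical input: UTF-8 encoded 8-bit strings, e.g. words read from a file.
raw_words = [b"caf\xc3\xa9", b"na\xc3\xafve", b"caf\xc3\xa9"]

# Decode once, at the boundary, with the encoding the data really uses.
words = set(w.decode("utf-8") for w in raw_words)

# Membership tests now compare unicode with unicode; no implicit
# ascii decoding is involved.
found = u"caf\xe9" in words
count = len(words)
```

With everything in the set already unicode, the lookup never has to guess
an encoding.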
See you,
Laurent.