[Tutor] unicode help

Sander Sweers sander.sweers at gmail.com
Wed Nov 14 23:09:40 CET 2012


Marilyn Davis wrote on Wed 14-11-2012 at 13:23 [-0800]:
> I found this site:
> http://hints.macworld.com/article.php?story=20100713130450549
> 
> and that fixes it.

Short answer: It is not a fix but a workaround. Try:
print symbol.encode('utf-8')

Longer answer: It is not really a fix, it is a workaround for the
implicit encode/decode Python does. Setting this environment variable
will make the errors go away for *you*, but what about the other people
running the code? What you are seeing is the *implicit* encode from a
unicode string to a byte string that print performs, using
sys.stdout.encoding (or sys.getdefaultencoding() when that is not set).
What you need to do is encode your *unicode* string to a *byte* string
yourself with its encode() method. You should watch/read
http://nedbatchelder.com/text/unipain.html and see how to fix this
properly.
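
As a rough sketch of the difference between the implicit and the
explicit encode, here is the same idea as a small script. The name
symbol comes from the short answer above; ascii_ok and utf8_bytes are
names made up for this illustration, and the sketch assumes the
character is the yen sign, unichr(165). It only encodes and does not
write the unicode string to the terminal, so it should not depend on
your terminal settings.

```python
symbol = u"\u00a5"  # YEN SIGN, the same character as unichr(165)

# Roughly what the implicit encode does when stdout is ASCII-only:
# the character has no ASCII representation, so encoding fails.
try:
    symbol.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# The explicit encode: we pick the codec ourselves instead of relying
# on whatever default the interpreter happens to use.
utf8_bytes = symbol.encode("utf-8")

print(ascii_ok)  # False
print(len(utf8_bytes))  # 2, the yen sign takes two bytes in UTF-8
```

The byte string in utf8_bytes is what you would hand to print or
write to a file, under the assumption that the receiving side expects
UTF-8.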

This quick interactive session shows how this error is properly solved.

$ python
Python 2.7.3 (default, Sep 22 2012, 18:13:33) 
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.stdout.encoding
'ascii'
>>> unicode_symbol = unichr(165)
>>> print unicode_symbol
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa5' in
position 0: ordinal not in range(128)
>>> byte_symbol = unicode_symbol.encode('utf-8')
>>> print byte_symbol
¥

I do not use Mac OS X, but I suspect its terminal supports UTF-8. If it
does not, you need to find an encoding that the terminal supports and
that contains your character.
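
If you do have to hunt for an encoding that contains your character, a
small helper along these lines might do. This is only a sketch:
pick_encoding is a hypothetical function, not something from the
standard library, and the candidate list is an assumption that you
would replace with whatever encodings your terminal actually supports.

```python
def pick_encoding(text, candidates=("ascii", "latin-1", "utf-8")):
    """Return the first candidate codec that can encode text, or None."""
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            # This codec has no representation for some character.
            continue
        return name
    return None


yen = u"\u00a5"   # YEN SIGN, present in latin-1 but not in ASCII
euro = u"\u20ac"  # EURO SIGN, absent from both ASCII and latin-1

print(pick_encoding(yen))                         # latin-1
print(pick_encoding(euro, ("ascii", "latin-1")))  # None
```

The order of the candidates matters here: the helper returns the first
codec that works, so put the encoding you prefer first.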

This can be quite confusing, so I strongly suggest watching Ned's
talk ;-).

Greets
Sander


