[Tutor] converting encoded symbols from rss feed?
Serdar Tumgoren
zstumgoren at gmail.com
Thu Jun 18 22:37:28 CEST 2009
Hey everyone,
I'm trying to get down to basics with this handy intro on Python encodings:
http://eric.themoritzfamily.com/2008/11/21/python-encodings-and-unicode/
But I'm running into some VERY strange results.
On the above link, the section on "Encoding Unicode Byte Streams" has
the following example:
>>> u = u"abc\u2013"
>>> print u
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 3: ordinal not in range(128)
>>> print u.encode("utf-8")
abc–
But when I try the same example on my Windows XP machine (with Python
2.5.4), I can't get the same results. Instead, it spits out the below
(hopefully it renders properly and we don't have encoding issues!!!):
$ python
Python 2.5.4 (r254:67916, Dec 23 2008, 15:10:54) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> x = u"abc\u2013"
>>> print x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Program Files\Python25\lib\encodings\cp437.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\u2013' in position 3: character maps to <undefined>
>>> x.encode("utf-8")
'abc\xe2\x80\x93'
>>> print x.encode("utf-8")
abcΓÇô
I get the above results in Python interpreters invoked from both the
Windows command line and a cygwin shell. HOWEVER -- the test code
works properly (i.e. I get the expected "abc–") when I run the code in
WingIDE 10.1 (version 3.1.8-1).
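The "abcΓÇô" output is consistent with the console taking the three UTF-8 bytes of the en dash and displaying each one as a separate cp437 character. A minimal sketch of that effect, assuming the console code page is cp437 (the codec named in the traceback) and written so that it also runs on Python 3:

```python
# Assumption: the console uses cp437, as the cp437.py frame in the traceback suggests.
text = u"abc\u2013"                 # "abc" followed by an en dash (U+2013)
utf8_bytes = text.encode("utf-8")   # the bytes 'abc\xe2\x80\x93'

# What a cp437 console would display when handed those UTF-8 bytes:
mojibake = utf8_bytes.decode("cp437")

# cp437 has no en dash, so encoding the text directly to cp437 fails.
# This mirrors the UnicodeEncodeError raised by "print x".
try:
    text.encode("cp437")
    direct_encode_failed = False
except UnicodeEncodeError:
    direct_encode_failed = True
```

Here mojibake holds the same three stray characters seen in the session, which would also explain why an environment that displays UTF-8 shows the dash correctly.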
In a related test, I was unable to change the default character encoding
for the Python interpreter from ascii to utf-8. In all cases (cygwin,
Wing IDE, Windows command line), the interpreter reported that my
"sys" module does not contain the "setdefaultencoding" function (even
though this should be part of the module from versions 2.x and above).
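On the missing function: in Python 2, site.py deletes sys.setdefaultencoding during startup, so it is normally absent from an interactive session (Python 3 has no such function at all). One possible alternative is to encode explicitly for whatever codec the console reports. A hedged sketch follows; the helper name console_safe is hypothetical, and the cp437 fallback is an assumption based on the traceback above:

```python
import sys

def console_safe(text, encoding=None):
    """Round-trip text through the console codec, replacing unencodable characters."""
    # Fall back to cp437 (an assumption) when stdout reports no encoding.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "cp437"
    return text.encode(encoding, "replace").decode(encoding)

# Check whether the function is available in this interpreter session.
has_setdefault = hasattr(sys, "setdefaultencoding")

# With cp437, the en dash has no mapping and is replaced by "?".
shown = console_safe(u"abc\u2013", "cp437")
```

Printing shown instead of the original string avoids the UnicodeEncodeError, at the cost of losing the characters the console cannot represent.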
Can anyone help me untangle this mess?
I'd be indebted!