[Tutor] Assistance with UnicodeDecodeError

James Chapman james at uplinkzero.com
Wed Feb 4 18:14:00 CET 2015


Actually, it's more likely that the character you are grabbing is encoded as
UTF-16 rather than UTF-8, which takes you into double-byte territory.
* This is an assumption based on the following output:

>>> u = u'\u2014'
>>> s = u.encode("utf-16")
>>> print(s)
 ■¶
>>> s = u.encode("utf-32")
>>> print(s)
 ■  ¶
>>> s = u.encode("utf-16LE")
>>> print(s)
¶
>>> s = u.encode("utf-16BE")
>>> print(s)
 ¶
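
The glyphs above are just the console's rendering of the raw bytes, so they are hard to read. As a rough sketch (not part of the original session; the `encoded` dict is only an illustrative name), printing the repr() of each result makes the actual bytes and their lengths visible:

```python
# Sketch: inspect the encoded bytes with repr() rather than printing them raw.
u = u'\u2014'  # the same character as in the session above (EM DASH)

encoded = {}  # illustrative name: maps codec name to the encoded bytes
for name in ["utf-8", "utf-16", "utf-32", "utf-16LE", "utf-16BE"]:
    encoded[name] = u.encode(name)
    # repr() shows escapes such as \x14 instead of console glyphs
    print("%-9s %d bytes  %r" % (name, len(encoded[name]), encoded[name]))
```

Note that "utf-16" and "utf-32" without an explicit byte order prepend a byte order mark (BOM), which accounts for the extra leading characters in the first two outputs above. The LE and BE variants write no BOM.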

See https://en.wikipedia.org/wiki/Character_encoding for help understanding
character encodings, code pages, and why they are important.
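
To illustrate why the choice of codec matters, here is a minimal sketch, assuming the data really is UTF-16 (I have not seen your actual input, and the names `data`, `error` and `text` are made up for the example). Decoding UTF-16 bytes as UTF-8 raises UnicodeDecodeError, while decoding with the matching codec works:

```python
# Assumption: the bytes were written as UTF-16 (with a BOM), not UTF-8.
data = u'\u2014'.encode("utf-16")

error = None
try:
    data.decode("utf-8")  # wrong codec: the BOM bytes are not valid UTF-8
except UnicodeDecodeError as exc:
    error = exc
    print("utf-8 failed: %s" % exc)

# The matching codec consumes the BOM and returns the original character.
text = data.decode("utf-16")
print(repr(text))
```

If your real data fails in a similar way, trying the decode with "utf-16" is a cheap way to test the assumption.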





James

