[Tutor] Assistance with UnicodeDecodeError
James Chapman
james at uplinkzero.com
Wed Feb 4 18:14:00 CET 2015
Actually, it's more likely that the character you are grabbing is encoded as
UTF-16, not UTF-8, so it is being stored as two bytes...
* An assumption based on the following output:
>>> u = u'\u2014'
>>> s = u.encode("utf-16")
>>> print(s)
■¶
>>> s = u.encode("utf-32")
>>> print(s)
■ ¶
>>> s = u.encode("utf-16LE")
>>> print(s)
¶
>>> s = u.encode("utf-16BE")
>>> print(s)
¶
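
The glyphs printed above depend on the console's code page, so they are hard to compare. A minimal sketch of the same experiment using repr() makes the raw bytes visible. It also tries decoding the UTF-16 bytes as UTF-8, which is one way (an assumption here, not a confirmed diagnosis of the original problem) to end up with a UnicodeDecodeError. The names `encoded` and `wrong_codec_failed` are only illustrative.

```python
u = u'\u2014'  # EM DASH, the character used in the session above

# Encode with several codecs and keep the raw bytes for inspection.
encoded = {}
for name in ["utf-8", "utf-16-le", "utf-16-be", "utf-16", "utf-32"]:
    encoded[name] = u.encode(name)
    # repr-style output shows escape sequences instead of console glyphs
    print("%-10s %d bytes  %r" % (name, len(encoded[name]), encoded[name]))

# Plain "utf-16" output starts with a byte order mark (0xFF and 0xFE, in
# either order), and neither byte is ever valid in UTF-8, so decoding
# with the wrong codec fails.
try:
    encoded["utf-16"].decode("utf-8")
    wrong_codec_failed = False
except UnicodeDecodeError as exc:
    wrong_codec_failed = True
    print("decode failed: %s" % exc)
```

Decoding each byte string with the same codec that produced it gives back the original character, so the fix is usually to find out which encoding the data really uses and pass that name to decode().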
See https://en.wikipedia.org/wiki/Character_encoding for background on
character encodings and code pages, and why they are important.
James