UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 10442: character maps to <undefined>
Marko Rauhamaa
marko at pacujo.net
Sun Oct 21 14:48:04 EDT 2018
pjmclenon at gmail.com:
> not sure why utf-8 gives an error when that's the widest encoding,
> inclusive of all characters, right?
Not all sequences of bytes are legal in UTF-8. For example,
    >>> b'\x80'.decode("utf-8")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
Not all sequences of bytes are legal in ASCII, either.
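To see this for yourself, here is a minimal sketch. The helper name
can_decode is made up for illustration and is not part of any library.
ASCII only defines bytes 0x00 through 0x7F, so a lone 0x80 fails there
just as it does in UTF-8.

```python
def can_decode(data: bytes, encoding: str) -> bool:
    """Return True if data decodes without error in the given encoding.

    Hypothetical helper, for illustration only.
    """
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# Plain 7-bit text is fine in ASCII (and therefore in UTF-8 too).
print(can_decode(b"abc", "ascii"))

# A byte with the high bit set is outside ASCII's range.
print(can_decode(b"\x80", "ascii"))

# The same byte is a continuation byte in UTF-8, so it cannot start a character.
print(can_decode(b"\x80", "utf-8"))
```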
However, all sequences of bytes are legal in Latin-1 (among others). Of
course, decoding with Latin-1 gives you gibberish unless the data really
is Latin-1. But you'll never get a UnicodeDecodeError.
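A rough sketch of both halves of that claim, using only the standard
bytes.decode and str.encode methods. The sample character is my own
choice, not taken from the original poster's data.

```python
# Latin-1 maps each of the 256 byte values to the code point with the
# same number, so every byte sequence decodes and round-trips losslessly.
every_byte = bytes(range(256))
text = every_byte.decode("latin-1")
assert [ord(ch) for ch in text] == list(range(256))
assert text.encode("latin-1") == every_byte

# The gibberish case: UTF-8 data decoded as Latin-1 raises nothing,
# but each multi-byte character turns into several wrong ones.
utf8_bytes = "\u00e9".encode("utf-8")    # e with acute accent, b'\xc3\xa9'
garbled = utf8_bytes.decode("latin-1")   # two characters instead of one
print(repr(garbled))

# Because nothing was lost, the mistake can be undone.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repr(repaired))
```

So Latin-1 is handy when you need to get arbitrary bytes through a text
API unharmed, but it tells you nothing about what the data really means.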
Marko