UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb6 in position 0: invalid start byte

Chris Angelico rosuav at gmail.com
Thu Jul 4 14:37:40 CEST 2013


On Thu, Jul 4, 2013 at 9:52 PM, MRAB <python at mrabarnett.plus.com> wrote:
> On 04/07/2013 12:29, Νίκος wrote:
>>
>> On 4/7/2013 1:54 PM, Chris Angelico wrote:
>>>
>>> On Thu, Jul 4, 2013 at 8:38 PM, Νίκος <nikos at superhost.gr> wrote:
>>>>
>>>> So you are also suggesting that what gethostbyaddr() returns is not
>>>> utf-8
>>>> encoded too?
>>>>
>>>> What character is 0xb6 anyways?
>>>
>>>
>>> It isn't. It's a byte. Bytes are not characters.
>>>
>>> http://www.joelonsoftware.com/articles/Unicode.html
>>
>>
>> Well in case of utf-8 encoding for the first 127 codepoints we can safely
>> say that a character equals a byte :)
>>
> Equals? No. Bytes are not characters. (Strictly speaking, they're
> codepoints, not characters.)
>
> And anyway, it's the first _128_ codepoints.

As MRAB says, even if there's a 1:1 correspondence between bytes,
codepoints, and characters, they're still not the same thing. Plus,
0xb6 is not in the first 128, so your statement is false and your
question has no answer. Do you understand why I gave you that link? If
not, go read the page linked to.
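A short sketch (added for illustration, not part of the original exchange) of why 0xb6 fails as a UTF-8 start byte and why the byte/codepoint overlap only covers 0x00-0x7F; decoding it as Latin-1 below is just one possible interpretation of that byte, not what gethostbyaddr() necessarily returned:

```python
# 0xb6 has the bit pattern 10110110 -- a UTF-8 continuation byte,
# so it can never legally start a sequence.
data = bytes([0xb6])
try:
    data.decode('utf-8')
except UnicodeDecodeError as e:
    print(e)  # 'utf-8' codec can't decode byte 0xb6 in position 0: invalid start byte

# The same byte is valid in single-byte encodings; under Latin-1 it
# maps to codepoint U+00B6, the pilcrow sign.
print(data.decode('latin-1'))  # ¶

# Only the first 128 codepoints (0x00-0x7F) encode identically in
# ASCII and UTF-8 -- the overlap being discussed in the thread.
assert bytes(range(128)).decode('utf-8') == bytes(range(128)).decode('ascii')
```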

ChrisA
