UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb6 in position 0: invalid start byte

Νίκος nikos at superhost.gr
Thu Jul 4 14:06:47 CEST 2013

On 4/7/2013 2:52 PM, MRAB wrote:
> On 04/07/2013 12:29, Νίκος wrote:
>> On 4/7/2013 1:54 PM, Chris Angelico wrote:
>>> On Thu, Jul 4, 2013 at 8:38 PM, Νίκος <nikos at superhost.gr> wrote:
>>>> So you are also suggesting that what gethostbyaddr() returns is not
>>>> utf-8
>>>> encoded too?
>>>> What character is 0xb6 anyways?
>>> It isn't. It's a byte. Bytes are not characters.
>>> http://www.joelonsoftware.com/articles/Unicode.html
>> Well in case of utf-8 encoding for the first 127 codepoints we can safely
>> say that a character equals a byte :)
> Equals? No. Bytes are not characters. (Strictly speaking, they're
> codepoints, not characters.)
> And anyway, it's the first _128_ codepoints.

Yes, 0-127 makes 128 codepoints, I knew that!

Well, the relationship between characters and bytes is this:

Each Unicode codepoint (character) in the range 0-127 needs exactly 1 byte
to be stored in utf-8 encoding.

I think it's also correct to say that the byte in the above situation is
the representation of our character.
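
Something like the following should show it, as a rough sketch assuming
Python 3. The names single_byte_ok, pilcrow_bytes and error_text are just
ones made up for this illustration. It checks the 0-127 range, then shows
that the codepoint U+00B6 needs two bytes, and that a lone 0xb6 byte cannot
be decoded on its own:

```python
# For every codepoint 0..127, the utf-8 encoding is a single byte whose
# numeric value is the same as the codepoint number.
single_byte_ok = all(
    len(chr(cp).encode('utf-8')) == 1 and chr(cp).encode('utf-8')[0] == cp
    for cp in range(128)
)
print(single_byte_ok)

# Codepoints above 127 need more than one byte. U+00B6 (PILCROW SIGN)
# is stored as two bytes, and 0xb6 is only the second of them.
pilcrow_bytes = '\u00b6'.encode('utf-8')
print(pilcrow_bytes)

# A lone 0xb6 byte is a continuation byte, so utf-8 cannot start
# decoding a character from it.
try:
    b'\xb6'.decode('utf-8')
    error_text = None
except UnicodeDecodeError as exc:
    error_text = str(exc)
print(error_text)
```

So the byte 0xb6 by itself is not a character in utf-8; it only means
something as part of a longer sequence such as 0xc2 0xb6.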

What is now proved was at first only imagined!
