Right solution to unicode error?

Oscar Benjamin oscar.j.benjamin at gmail.com
Thu Nov 8 22:37:48 CET 2012

On 8 November 2012 19:54,  <wxjmfauth at gmail.com> wrote:
> Le jeudi 8 novembre 2012 19:49:24 UTC+1, Ian a écrit :
>> On Thu, Nov 8, 2012 at 11:32 AM, Oscar Benjamin
>>
>> <oscar.j.benjamin at gmail.com> wrote:
>>
>> > If I want the other characters to work I need to change the code page:
>>
>> >
>>
>> > O:\>chcp 65001
>>
>> > Active code page: 65001
>>
>> >
>>
>> > O:\>Q:\tools\Python33\python -c "import sys;
>>
>> I find that I also need to change the font.  With the default font,
>>
>> printing '\u2013' gives me:
>>
>> â€“
>>
>>
>>
>> The only alternative font option I have in Windows XP is Lucida
>>
>> Console, which at least works correctly, although it seems to be
>>
>> lacking a lot of glyphs.
>
> Font has nothing to do here.
> You are "simply" wrongly encoding your "unicode".
>
>>>> '\u2013'
> '–'
>>>> '\u2013'.encode('utf-8')
> b'\xe2\x80\x93'
>>>> '\u2013'.encode('utf-8').decode('cp1252')
> 'â€“'

You have correctly identified that the displayed characters are the
result of accidentally interpreting utf-8 bytes as if they were cp1252
or similar. However, it is not Ian or Python that is confusing the
encoding. It is cmd.exe that is confusing the encoding in a
font-dependent way. I also had to change the font as Ian describes
though I did it some time ago and forgot to mention it here.

jmf, can you please trim the text you quote removing the parts you are
not responding to and then any remaining blank lines that were