Right solution to unicode error?

wxjmfauth at gmail.com wxjmfauth at gmail.com
Fri Nov 9 11:06:05 CET 2012


On Thursday, 8 November 2012 at 21:42:58 UTC+1, Ian wrote:
> On Thu, Nov 8, 2012 at 12:54 PM,  <wxjmfauth at gmail.com> wrote:
> 
> > Font has nothing to do here.
> 
> > You are "simply" wrongly encoding your "unicode".
> 
> >
> 
> >>>> '\u2013'
> 
> > '–'
> 
> >>>> '\u2013'.encode('utf-8')
> 
> > b'\xe2\x80\x93'
> 
> >>>> '\u2013'.encode('utf-8').decode('cp1252')
> 
> > 'â€“'
> 
> 
> 
> No, it seriously is the font.  This is what I get using the default
> 
> ("Raster") font:
> 
> 
> 
> C:\>chcp 65001
> 
> Active code page: 65001
> 
> 
> 
> C:\>c:\python33\python
> 
> Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600
> 
> 32 bit (Intel)] on win32
> 
> Type "help", "copyright", "credits" or "license" for more information.
> 
> >>> '\u2013'
> 
> '–'
> 
> >>> import sys
> 
> >>> sys.stdout.buffer.write('\u2013\n'.encode('utf-8'))
> 
> –
> 
> 4
> 
> 
> 
> I should note here that the characters copied and pasted do not
> 
> correspond to the glyphs actually displayed in my terminal window.  In
> 
> the terminal window I actually see:
> 
> 
> 
> ΓÇô
> 
> 
> 
> If I change the font to Lucida Console and run the *exact same code*,
> 
> I get this:
> 
> 
> 
> C:\>chcp 65001
> 
> Active code page: 65001
> 
> 
> 
> C:\>c:\python33\python
> 
> Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600
> 
> 32 bit (Intel)] on win32
> 
> Type "help", "copyright", "credits" or "license" for more information.
> 
> >>> '\u2013'
> 
> '–'
> 
> 
> 
> >>> import sys
> 
> >>> sys.stdout.buffer.write('\u2013\n'.encode('utf-8'))
> 
> –
> 4
> 
> 
> 
> Why is the font important?  I have no idea.  Blame Microsoft.

---------

If you see something like 'ΓÇô', look at the names of
the characters in Unicode nomenclature:
>>> import unicodedata as ud
>>> for c in 'ΓÇô':
...     ud.name(c)
...     
'GREEK CAPITAL LETTER GAMMA'
'LATIN CAPITAL LETTER C WITH CEDILLA'
'LATIN SMALL LETTER O WITH CIRCUMFLEX'

it is a sign that "cp437" is involved somewhere: the UTF-8
bytes are being decoded as cp437.

>>> '\u2013'.encode('utf-8').decode('cp437')
'ΓÇô'
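
The same check can be turned into a small diagnostic. What follows is only a sketch: the names guess_misdecoding, repair and CANDIDATES are mine, not part of Python, and the candidate list is an assumption about which legacy code pages are worth trying. It only uses str.encode and bytes.decode.

```python
# Hypothetical helpers (not a standard API): find out which legacy
# code page turns the UTF-8 bytes of a string into the garbage seen
# on screen, and undo the damage once the code page is known.

# Assumed shortlist of code pages to try; extend as needed.
CANDIDATES = ('cp437', 'cp850', 'cp1252', 'latin-1')


def guess_misdecoding(original, observed, candidates=CANDIDATES):
    """Return the codecs that decode original's UTF-8 bytes to observed."""
    raw = original.encode('utf-8')
    matches = []
    for name in candidates:
        try:
            if raw.decode(name) == observed:
                matches.append(name)
        except UnicodeDecodeError:
            # Some code pages leave a few byte values undefined.
            pass
    return matches


def repair(observed, codec):
    """Reverse the wrong decoding: back to bytes, then decode as UTF-8."""
    return observed.encode(codec).decode('utf-8')
```

For the case in this thread, guess_misdecoding('\u2013', 'ΓÇô') returns ['cp437'], and repair('ΓÇô', 'cp437') gives back '\u2013'.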

This is on Windows 7. I do not remember ever having a
"character encoding" issue on XP.

jmf


