Trying to understand this moji-bake
Peter Otten
__peter__ at web.de
Sat Jan 25 03:56:09 EST 2014
Steven D'Aprano wrote:
> I have an unexpected display error when dealing with Unicode strings, and
> I cannot understand where the error is occurring. I suspect it's not
> actually a Python issue, but I thought I'd ask here to start.
I suspect it is a Python issue -- where Python fails to guess an encoding, it
usually falls back to ASCII.
> But using Python 2.7, I get a really bad case of moji-bake:
>
> [steve@ando ~]$ python2.7 -c "print u'ñøλπйж'"
> ñøλÏйж
>
>
> However, interactively it works fine:
>
> [steve@ando ~]$ python2.7 -E
> Python 2.7.2 (default, May 18 2012, 18:25:10)
> [GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> print u'ñøλπйж'
> ñøλπйж
You can provoke it with exec:
>>> exec "print u'ñøλπйж'"
ñøλÏйж
>>> exec u"print u'ñøλπйж'"
ñøλπйж
>>> exec "# -*- coding: utf-8 -*-\nprint u'ñøλπйж'"
ñøλπйж
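
A rough Python 3 analogue of the three cases, as a sketch only. Python 3 has
no exec statement and treats str as text, so here the source is passed to
compile() as bytes and the encoding is made explicit with a coding cookie.
That the undeclared Python 2 case behaves like Latin-1 is an assumption based
on the output above, not something traced in the C source. The names text,
literal and run are made up for the illustration.

```python
# Python 3 sketch of the exec cases above. The names `text`, `literal`
# and `run` are illustrative only, not from the original session.
text = 'ñøλπйж'
literal = text.encode('utf-8')  # the raw bytes a UTF-8 terminal would send


def run(cookie):
    """Compile a bytes source with the given coding cookie; return s."""
    source = b"# -*- coding: " + cookie + b" -*-\ns = '" + literal + b"'\n"
    namespace = {}
    exec(compile(source, '<demo>', 'exec'), namespace)
    return namespace['s']


# Assumption: this mimics the byte-string exec without a declaration,
# where each byte ends up as one character.
as_latin1 = run(b'latin-1')

# This mimics the exec with an explicit utf-8 coding line.
as_utf8 = run(b'utf-8')

# ascii() keeps the output independent of the terminal encoding.
print(ascii(as_latin1))
print(ascii(as_utf8))
```

With the latin-1 cookie every byte of the literal becomes its own character,
so the six-character string comes back as twelve characters; with the utf-8
cookie it comes back unchanged.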
> This occurs on at least two different machines, one using Centos and the
> other Debian.
>
> Anyone have any idea what's going on? I can replicate the display error
> using Python 3 like this:
>
> py> s = 'ñøλπйж'
> py> print(s.encode('utf-8').decode('latin-1'))
> ñøλÏйж
>
> but I'm not sure why it's happening at the command line. Anyone have any
> ideas?
It is probably buried in the C code -- after a few indirections I lost
track :(
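
One way to check the diagnosis, as a sketch in Python 3: if the garbage really
is UTF-8 bytes that were read as Latin-1, then encoding it back to Latin-1 and
decoding as UTF-8 should restore the original string. The variable names are
illustrative.

```python
original = 'ñøλπйж'

# Forward direction, as in the quoted Python 3 snippet.
garbled = original.encode('utf-8').decode('latin-1')

# Reverse: Latin-1 maps every code point below 256 back to a single byte,
# so the original UTF-8 byte sequence is recovered and can be decoded.
repaired = garbled.encode('latin-1').decode('utf-8')

# ascii() keeps the output independent of the terminal encoding.
print(ascii(garbled))
print(repaired == original)
```

If the round trip fails with a UnicodeError, the text was probably mangled by
some other pair of encodings.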