[Python-Dev] Unicode literals in Python 2.7

Guido van Rossum guido at python.org
Wed Apr 29 18:40:43 CEST 2015


I suspect the interactive session is *not* always in UTF-8. It probably
depends on the keyboard mapping of your terminal emulator. I imagine on
Windows it's the current code page.
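
One way to see which encoding a given session reports is to ask Python itself. What follows is a minimal sketch, not a definitive check: the helper name `describe_terminal_encoding` is hypothetical (it is not part of any library), and the values it returns vary with platform and terminal configuration.

```python
import locale
import sys


def describe_terminal_encoding():
    """Return the encodings Python reports for the interactive streams.

    Hypothetical helper for illustration only. getattr() is used because
    a replaced or redirected stream may lack an ``encoding`` attribute.
    """
    return {
        "stdin": getattr(sys.stdin, "encoding", None),
        "stdout": getattr(sys.stdout, "encoding", None),
        # False means: do not call setlocale() as a side effect.
        "preferred": locale.getpreferredencoding(False),
    }


if __name__ == "__main__":
    for name, value in sorted(describe_terminal_encoding().items()):
        print("%s: %s" % (name, value))
```

On a Windows console the reported value is typically a code page name rather than UTF-8, which is the situation described above.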

On Wed, Apr 29, 2015 at 9:19 AM, Adam Bartoš <drekin at gmail.com> wrote:

> Yes, that works for eval. But I want it for code entered during an
> interactive session.
>
> >>> u'α'
> u'\xce\xb1'
>
> The tokenizer gets b"u'\xce\xb1'" by calling PyOS_Readline and it knows
> it's UTF-8 encoded. But the result of evaluation is u'\xce\xb1'. Because of
> how eval works, I believe that it would work correctly if the
> PyCF_SOURCE_IS_UTF8 flag were set, but it is not. That is why I'm asking if
> there is a way to set it. Also, my naive thought is that it should always be
> set in the case of an interactive session.
>
>
> On Wed, Apr 29, 2015 at 4:59 PM, Victor Stinner <victor.stinner at gmail.com>
> wrote:
>
>> On 29 Apr 2015 10:36, "Adam Bartoš" <drekin at gmail.com> wrote:
>> > Why am I talking about PyCF_SOURCE_IS_UTF8? eval(u"u'\u03b1'") ->
>> > u'\u03b1' but eval(u"u'\u03b1'".encode('utf-8')) -> u'\xce\xb1'.
>>
>> There is a simple option to get this flag: call eval() with unicode, not
>> with encoded bytes.
>>
>> Victor
>>
>
>
>
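
The value u'\xce\xb1' reported in the quoted session is what results when each of the two UTF-8 bytes of 'α' is mapped to its own code point, as a Latin-1 decode would do. The sketch below reproduces that effect with plain decoding and then applies the suggestion from the thread: turn the bytes into unicode first, then evaluate. It does not touch PyOS_Readline or PyCF_SOURCE_IS_UTF8. The helper names are hypothetical, and `ast.literal_eval` is swapped in for `eval` because it only accepts literals.

```python
import ast


def decode_source(raw_source, encoding):
    """Decode raw source bytes with an explicitly chosen encoding."""
    return raw_source.decode(encoding)


def eval_literal_from_bytes(raw_source, encoding="utf-8"):
    """Decode the bytes first, then evaluate the resulting unicode source.

    Assumption: the caller knows the real encoding of the bytes.
    """
    return ast.literal_eval(decode_source(raw_source, encoding))


# The source text u'<alpha>' encoded as UTF-8, i.e. the bytes b"u'\xce\xb1'".
raw_literal = u"u'\u03b1'".encode("utf-8")

# Treating each byte as one code point (Latin-1) gives two characters.
misread = ast.literal_eval(decode_source(raw_literal, "latin-1"))

# Decoding with the real encoding gives the single intended character.
correct = eval_literal_from_bytes(raw_literal)

if __name__ == "__main__":
    print(repr(misread))
    print(repr(correct))
```

The design choice here is simply to make the encoding explicit at the boundary where bytes become text, rather than relying on how the evaluator interprets undeclared bytes.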


-- 
--Guido van Rossum (python.org/~guido)