Encoding of Python 2 string literals
Laura Creighton
lac at openend.se
Wed Jul 22 10:12:49 EDT 2015
In a message of Wed, 22 Jul 2015 22:39:56 +1000, Chris Angelico writes:
>On Wed, Jul 22, 2015 at 8:17 PM, anatoly techtonik <techtonik at gmail.com> wrote:
>> Is there a way to know encoding of string (bytes) literal
>> defined in source file? For example, given that source:
>>
>> # -*- coding: utf-8 -*-
>> from library import Entry
>> Entry("текст")
>>
>> Is there any way for Entry() constructor to know that
>> string "текст" passed into it is the utf-8 string?
>
>I don't think so. However, if you declare that to be a Unicode string,
>the parser will decode it using the declared encoding, and it'll be a
>five-character string. At that point, it doesn't matter what your
>source encoding was, because the characters entered will match the
>characters seen.
>
>Entry(u"текст")
>
>ChrisA
Since you are porting to 3.x, anatoly this will be of interest to you.
https://www.python.org/dev/peps/pep-0414/
Having stuck all the u" into your codebase you won't immediately
have to rip them all out again as long as you use Python 3.3 or above.
Laura
More information about the Python-list
mailing list