[Python-Dev] Support of UTF-16 and UTF-32 source encodings

Laura Creighton lac at openend.se
Sun Nov 15 09:43:58 EST 2015


In a message of Sun, 15 Nov 2015 12:56:18 +0000, Paul Moore writes:
>On 15 November 2015 at 07:23, Stephen J. Turnbull <stephen at xemacs.org> wrote:
>> I don't see any good reason for allowing non-ASCII-compatible
>> encodings in the reference CPython interpreter.
>
>From PEP 263:
>
>       Any encoding which allows processing the first two lines in the
>       way indicated above is allowed as source code encoding, this
>       includes ASCII compatible encodings as well as certain
>       multi-byte encodings such as Shift_JIS. It does not include
>       encodings which use two or more bytes for all characters like
>       e.g. UTF-16. The reason for this is to keep the encoding
>       detection algorithm in the tokenizer simple.
>
>So this pretty much confirms that double-byte encodings are not valid
>for Python source files.
>
>Paul
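
The detection problem the PEP passage describes can be sketched in a few
lines. This is only an illustration, not CPython's tokenizer: the pattern
is modeled on the one given in PEP 263, and find_cookie and the sample
source string are hypothetical names made up for this example. It scans
the raw bytes of the first two lines for an ASCII coding declaration,
which works for Shift_JIS but not for UTF-16.

```python
import re

# Pattern modeled on the coding-declaration regex in PEP 263,
# applied to raw bytes because the encoding is not yet known.
COOKIE_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def find_cookie(raw: bytes):
    """Return the encoding named in the first two lines, or None.

    Hypothetical helper for illustration only.
    """
    for line in raw.split(b"\n")[:2]:
        match = COOKIE_RE.match(line)
        if match:
            return match.group(1).decode("ascii")
    return None


source = "# -*- coding: {} -*-\nprint('hello')\n"

# Shift_JIS encodes these ASCII characters as the same single bytes,
# so the declaration is visible to a byte-level scan.
sjis_bytes = source.format("shift_jis").encode("shift_jis")

# UTF-16 uses two bytes per character here (plus a BOM), so the
# letters of "coding" are never adjacent in the byte stream.
utf16_bytes = source.format("utf-16").encode("utf-16")

print(find_cookie(sjis_bytes))
print(find_cookie(utf16_bytes))
```

The first call finds the declaration and the second does not, which is
the reason the PEP gives for excluding encodings such as UTF-16: a simple
ASCII scan of the first two lines cannot see the declaration.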

Steve Turnbull, who lives in Japan and speaks and writes Japanese,
is saying that he cannot "see any good reason for allowing
non-ASCII-compatible encodings" in CPython.

This makes me wonder.

Is this along the lines of 'even in Japan we do not want such
things', or along the lines of 'when in Japan we want such things,
we want to go so much further, so keep the reference
implementation simple and don't try to help us with
seems-like-a-good-idea-but-isn't-in-practice ideas like this one',
or ...?

Laura

