[Python-Dev] PEP 263 -- Python Source Code Encoding
Guido van Rossum
guido@python.org
Tue, 26 Feb 2002 16:53:55 -0500
> In phase 2, the encoding will apply to all strings. So it will not be
> possible to put arbitrary byte sequences in a string literal, at least
> if the encoding disallows certain byte sequences (like UTF-8, or
> ASCII). Since this is currently possible, we have a backwards
> compatibility problem.
I would say that any program that currently uses non-ASCII in string
literals (whether Unicode or 8-bit literals) is, strictly speaking,
undefined. For cases where a specific encoding is used, the solution
is easy: add an explicit encoding declaration. Other cases are simply
garbage and should use \xDD escapes instead.
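
To make the two remedies concrete, here is a minimal sketch. It assumes a present-day Python 3 interpreter, where compile() accepts raw source bytes and honours a PEP 263 declaration, and where bytes literals play the role of the 8-bit string literals discussed here. The names declared_source and escaped_source are made up for illustration.

```python
# Remedy 1: the source contains a raw non-ASCII byte (0xE9), so it
# declares the encoding in which that byte is meant to be read.
declared_source = b"# -*- coding: latin-1 -*-\ns = '\xe9'\n"

ns_declared = {}
exec(compile(declared_source, "<declared>", "exec"), ns_declared)

# Remedy 2: the source is pure ASCII; the byte value is spelled with a
# \xDD escape, so no encoding declaration is needed.
escaped_source = b"s = b'\\xe9'\n"

ns_escaped = {}
exec(compile(escaped_source, "<escaped>", "exec"), ns_escaped)

print(repr(ns_declared["s"]))  # text: U+00E9, decoded via latin-1
print(repr(ns_escaped["s"]))   # bytes: the single byte 0xE9
```

In the first case the meaning of the byte depends on the declaration; in the second the source file itself never leaves ASCII.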
Maybe an implementation phase 1a should be introduced that warns about
the occurrence of non-ASCII characters anywhere in the source code
when no encoding is specified.
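
A rough sketch of what such a phase 1a check could look like, written as a standalone function rather than as part of the tokenizer. The function names are hypothetical, the regular expression is the declaration pattern given in PEP 263 (searched for on the first two lines only), and UTF-8 byte order marks are deliberately ignored to keep the example short.

```python
import re
import warnings

# Declaration pattern from PEP 263; only lines 1 and 2 are examined.
CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)")


def has_encoding_declaration(source: bytes) -> bool:
    """Return True if one of the first two lines declares an encoding."""
    return any(CODING_RE.match(line) for line in source.splitlines()[:2])


def first_non_ascii_line(source: bytes):
    """Return the 1-based number of the first line with a byte > 0x7F."""
    for lineno, line in enumerate(source.splitlines(), start=1):
        if any(byte > 0x7F for byte in line):
            return lineno
    return None


def check_source(source: bytes, filename: str = "<source>") -> bool:
    """Warn about non-ASCII bytes in a source with no declared encoding.

    Returns True if a warning was issued, False otherwise.
    """
    if has_encoding_declaration(source):
        return False
    lineno = first_non_ascii_line(source)
    if lineno is None:
        return False
    warnings.warn(
        f"{filename}:{lineno}: non-ASCII character found, "
        "but no encoding is declared",
        stacklevel=2,
    )
    return True
```

For example, check_source(b"s = 'abc'\n") issues no warning, while the same call on a source containing the byte 0xE9 and no declaration does.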
--Guido van Rossum (home page: http://www.python.org/~guido/)