I missed this. Why not default to ASCII like any decent programming language does in the absence of an explicit encoding?
Jack had the same question. The simple answer is: we need it to maintain backward compatibility when we move to phase two of the implementation.
Here's the longer one:
ASCII is the standard encoding for Python keywords and identifiers. There is no standard source code encoding for string literals. Unicode literals are interpreted using 'unicode-escape', which is an enhanced Latin-1 with escape semantics.
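To make the 'unicode-escape' point concrete, here is a small illustration (not from the original thread; it uses modern Python 3, where the codec still behaves as described):

```python
# The 'unicode-escape' codec maps each byte one-to-one to the first 256
# code points -- i.e. Latin-1 -- and additionally processes backslash
# escape sequences.

# Plain Latin-1 bytes pass through unchanged:
assert b"caf\xe9".decode("unicode-escape") == b"caf\xe9".decode("latin-1")

# ...while escape sequences are additionally interpreted:
assert b"a\\nb".decode("unicode-escape") == "a\nb"
assert b"\\u00e9".decode("unicode-escape") == "\xe9"
```

In other words, a Latin-1 string round-trips through 'unicode-escape' as long as it contains no backslashes, which is what makes Latin-1 the backward-compatible default for literals.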
This makes Latin-1 the right choice:
But they shouldn't, IMO.
We should require an explicit encoding when more than ASCII is used, and I'd like to enforce this.
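A sketch of what "require an explicit encoding" means in practice (this uses Python 3 semantics for illustration, which postdate this thread: compile() honors a PEP 263-style coding declaration when given bytes, and rejects undeclared non-UTF-8 bytes):

```python
# A source file that uses non-ASCII string literals carries an explicit
# magic comment declaring its encoding:
source = b"# -*- coding: latin-1 -*-\ns = 'caf\xe9'\n"

namespace = {}
exec(compile(source, "<example>", "exec"), namespace)
assert namespace["s"] == "caf\xe9"  # the 0xE9 byte was decoded as Latin-1

# Without the declaration, Python 3 assumes UTF-8 (not ASCII), so the
# lone 0xE9 byte is rejected at compile time:
try:
    compile(b"s = 'caf\xe9'\n", "<example>", "exec")
    undeclared_rejected = False
except (SyntaxError, ValueError):
    undeclared_rejected = True
assert undeclared_rejected
```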
Sorry, I don't understand what you're trying to say here. Can you explain this with an example? Why can't we require any program encoded in more than pure ASCII to have an encoding magic comment? I guess I don't understand what you mean by "raw binary".
Once you've explained it to me, the PEP should address this issue.
--Guido van Rossum (home page: http://www.python.org/~guido/)