[Python-Dev] PEP 263 -- Python Source Code Encoding
Fredrik Lundh
fredrik@pythonware.com
Sat, 2 Mar 2002 09:40:43 +0100
Jason wrote:
> The problem I have with PEP 263 right now is that the
> "-*- coding: -*-" magic is really sort of being abused.
really?
> I gather that "coding:" is supposed to specify the
> encoding (what MIME calls "charset") of the file.
> But under PEP 263, it only refers to the Unicode string
> literals within the program. Everything else must still
> be treated as 8-bit text.
from the current version (revision 1.9) of the PEP:
"The complete Python source file should use a single
encoding."
> For example, I'm not sure what effect "coding: utf-16"
> would have. (?)
"Only ASCII compatible encodings are allowed."
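A rough illustration of why an encoding such as UTF-16 falls outside that rule: the magic comment has to be found by scanning raw bytes before the encoding is known, which only works if the comment's bytes look like ASCII. The sample line below is an assumption made up for the demonstration; it is not taken from the PEP.

```python
# Hypothetical magic-comment line, encoded two different ways.
line = "# -*- coding: utf-16 -*-\n"

as_utf8 = line.encode("utf-8")    # ASCII-compatible: ASCII chars keep their byte values
as_utf16 = line.encode("utf-16")  # BOM plus two bytes per character

# A byte-level scan for the marker succeeds only in the ASCII-compatible case.
found_in_utf8 = b"coding" in as_utf8
found_in_utf16 = b"coding" in as_utf16

print(found_in_utf8, found_in_utf16)
```

In the UTF-16 bytes every ASCII character is interleaved with a zero byte, so a byte-level search for "coding" has nothing to match.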
> For another example, if you have UTF-8 Unicode string
> literals in your program but you also have 8-bit
> Latin-1 plain str string literals in the same program,
> how should you mark it?
"Embedding of differently encoded data is not
allowed"
> Therefore I argue that it makes no sense to use "coding:" to
> label a Python file, because the file doesn't consist of Unicode
> text.
"the proposed solution should be implemented in two phases:

 1. Implement the magic comment detection and default encoding
    handling, but only apply the detected encoding to Unicode
    literals in the source file.

 2. Change the tokenizer/compiler base string type from char* to
    Py_UNICODE* and apply the encoding to the complete file."
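The "magic comment detection" in phase 1 can be sketched roughly as follows. This is a minimal illustration, not the interpreter's implementation: the helper name `detect_coding` and the constant `CODING_RE` are hypothetical, the pattern is modelled on the regular expression given in the PEP text and may differ from revision 1.9, and the ASCII default and the "first two lines only" rule are assumptions based on the PEP.

```python
import re

# Pattern modelled on the one in the PEP text; treat it as an approximation.
CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def detect_coding(source: bytes, default: str = "ascii") -> str:
    """Return the encoding named by a magic comment on line 1 or 2.

    Only the first two lines are inspected, so a "#!" line may come first.
    Falls back to `default` when no magic comment is found.
    """
    for line in source.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1).decode("ascii")
    return default


sample = b"#!/usr/bin/python\n# -*- coding: latin-1 -*-\nprint 'hello'\n"
print(detect_coding(sample))
```

The scan works on bytes rather than text, since the encoding is not yet known when the comment is read. A comment on the third line or later is ignored and the default is returned.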
</F>