[Python-Dev] PEP 263 - Defining Python Source Code Encodings

M.-A. Lemburg mal@lemburg.com
Sun, 14 Jul 2002 18:21:34 +0200


Martin v. Loewis wrote:
> "Fredrik Lundh" <fredrik@pythonware.com> writes:
> 
> 
>>hmm.  I'm tempted to think that there's a major
>>flaw in the PEP, caused by the fact that
>>
>>    compile(unicode(script, extract_encoding(script)))
>>
>>will, from what I can tell, not compile to the same
>>thing as:
>>
>>    compile(script)
> 
> 
> Can you elaborate what you think the difference is? I believe the PEP
> is silent on this specific aspect,

The PEP does mention this, as part of phase 2.

> but I think what should happen is
> (in the Unicode case):
> 
> - compile will convert the script to UTF-8, which is then tokenized.
> - in the process of parsing, the encoding declaration (that presumably
>   extract_encoding was looking at as well) is recognized, if any.
> - Unicode literals are left as-is; byte string literals are converted
>   back to the original encoding.

Right.
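
To make the comparison concrete, here is a minimal sketch of the two
compile paths Fredrik contrasts. It assumes a current Python 3
interpreter, where str is already Unicode and compile() accepts both
bytes and str; it is not the phase 2 implementation discussed in this
thread. extract_encoding is the hypothetical helper from Fredrik's
example, approximated here with tokenize.detect_encoding. The step that
converts byte string literals back to the original encoding is not
shown, since Python 3 string literals are always Unicode.

```python
import io
import tokenize


def extract_encoding(script: bytes) -> str:
    """Hypothetical helper: return the encoding declared in the source bytes."""
    # detect_encoding reads at most two lines looking for a BOM or coding comment.
    encoding, _ = tokenize.detect_encoding(io.BytesIO(script).readline)
    return encoding


# A Latin-1 encoded script carrying a PEP 263 encoding declaration.
script = "# -*- coding: latin-1 -*-\ns = 'caf\xe9'\n".encode("latin-1")

# Path 1: hand the raw bytes to compile(); the declaration is honoured.
ns_bytes = {}
exec(compile(script, "<bytes>", "exec"), ns_bytes)

# Path 2: decode first using the declared encoding, then compile the text.
ns_text = {}
text = script.decode(extract_encoding(script))
exec(compile(text, "<text>", "exec"), ns_text)

# With a declaration present, both paths yield the same string value.
same = ns_bytes["s"] == ns_text["s"]
```

With the declaration in place, both paths produce the same value for s,
which matches Martin's point that no difference should be visible in
that case.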

> So if there is an encoding declaration in script, then I cannot see a
> difference. If there is none, the PEP does not elaborate what should
> happen. Leaving the byte strings as UTF-8 seems safest, since the only
> way to get "correct" non-ASCII strings without the encoding comment is
> to use the UTF-8 signature.
> 
> In any case, this can't cause backwards compatibility
> problems. compile accepts Unicode strings today only if they can be
> converted to a byte string. In the standard installation, this will
> fail today if there is non-ASCII in script. So allowing Unicode in
> compile is a pure extension. If its precise meaning is underspecified,
> it should be clarified before stage 2 is implemented.

No need for this. The PEP already mentions it.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/