PEP: Defining Unicode Literal Encodings

Paul Prescod paulp at ActiveState.com
Fri Jul 13 21:03:36 CEST 2001


"M.-A. Lemburg" wrote:
> 
> Please comment...

I think that there should be a single directive for:

 * unicode strings
 * 8-bit strings
 * comments

If a user uses UTF-8 for 8-bit strings and Shift-JIS for Unicode strings,
there is basically no text editor in the world that is going to display
or save the file correctly, because the file as a whole has no single
encoding. For the same reason, a web server cannot properly associate an
encoding with it. In general, it isn't a useful configuration.
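
To make the problem concrete, here is a small sketch in present-day
Python. The sample text and all names are illustrative assumptions, not
part of the proposal. It builds the bytes of a hypothetical file in which
one literal was saved as UTF-8 and another as Shift-JIS, then checks
whether either codec alone recovers the intended text:

```python
# Hypothetical file contents: the same text saved once as UTF-8
# and once as Shift-JIS, separated by a newline.
text = "日本語"
mixed = text.encode("utf-8") + b"\n" + text.encode("shift_jis")


def decodes_cleanly(data, encoding):
    """Return True only if data decodes under encoding to the intended text."""
    try:
        decoded = data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return decoded == text + "\n" + text


# Try to read the whole "file" with each single encoding in turn.
results = {enc: decodes_cleanly(mixed, enc) for enc in ("utf-8", "shift_jis")}
print(results)
```

Neither single encoding gives back the intended text for the whole file,
which is the situation an editor or web server would be stuck with.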

Also, no matter what the directive says, I think that \uXXXX should
continue to work. Just as in 8-bit strings, it should be possible to mix
and match directly encoded input and backslash-escaped characters.
Sometimes one is convenient (because of your keyboard setup) and
sometimes the other is. This proposal exists only to improve typing
convenience, so we should go all the way and allow both.
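
As a rough illustration of what "allow both" means, the sketch below
uses present-day Python, where the source file is assumed to be saved as
UTF-8. The variable names are made up for the example:

```python
# The same string written three ways; the source file is assumed to be UTF-8.
direct = "€100"            # character typed directly
escaped = "\u20ac100"      # same character as a backslash escape
mixed = "€100 = \u20ac100"  # both styles in a single literal

print(direct == escaped)
```

Both spellings denote the same character, so a reader or tool should not
need to care which one the author found easier to type.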

I strongly think we should restrict the directive to one per file, and in
fact I would say it should be on one of the first two lines: immediately
following the shebang line if there is one. This is to allow text editors
to detect it the way they detect XML encoding declarations.
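
A minimal sketch of how an editor might do that detection follows. This
message does not define the directive's syntax, so the comment form and
regular expression here are an assumption borrowed from what PEP 263
later specified; the function name and the "ascii" default are
hypothetical:

```python
import re

# Assumed directive syntax (from PEP 263, not defined in this message):
#   # -*- coding: <name> -*-
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def find_encoding_directive(source_bytes, default="ascii"):
    """Look for an encoding directive in the first two lines only."""
    for line in source_bytes.splitlines()[:2]:
        # The directive itself is assumed to be plain ASCII, so any
        # non-ASCII bytes on the line can safely be replaced.
        match = CODING_RE.match(line.decode("ascii", errors="replace"))
        if match:
            return match.group(1)
    return default


sample = b"#!/usr/bin/env python\n# -*- coding: shift_jis -*-\nprint(1)\n"
print(find_encoding_directive(sample))
```

Because only the first two lines are examined, a tool can decide how to
decode the rest of the file after reading a few dozen bytes, much as it
would with an XML declaration.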

My opinions are influenced by the fact that I've helped implement
Unicode support in a Python/XML editor. XML makes it easy to give the
user a good experience. Python could too if we are careful.

-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



