[Python-3000] Support for PEP 3131

Stephen J. Turnbull stephen at xemacs.org
Fri May 25 13:33:38 CEST 2007

James Y Knight writes:

 > >     - The identifier character set won't spontaneously change when
 > >       one upgrades to a new version of Python, even for users of
 > >       non-ASCII identifiers.
 > FUD. Already won't; Unicode explicitly makes that promise. They can  
 > add characters, but not remove them.

Addition is a change; in fact, it's the change Ka-Ping dislikes most.

 > >     - Having to specify the table of acceptable characters
 > >       demonstrates at least some knowledge of the character set
 > >       one is using.
 > This is a negative. Why should I have to show knowledge of the  
 > character set I'm using to type the characters?

You don't.  Jim's proposal doesn't specify it, but there should be at
least two built-in tables, ascii (for the stdlib) and unicode
(everything Pythonic in the Identifier classes defined by Unicode).
If you don't want to know, just specify -U unicode.

And if there isn't one, just grab the list off Martin's
"non-normative" table and there you go.
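
For concreteness, here is a minimal sketch of what the two built-in
tables might amount to. The table names "ascii" and "unicode" are the
ones suggested above; the function names allowed() and
rejected_identifiers() are hypothetical, and none of this is part of
PEP 3131 or of any actual -U implementation. It assumes that
str.isidentifier() is an acceptable stand-in for the "unicode" table.

```python
import io
import tokenize


def allowed(name, table):
    """Return True if `name` is acceptable under the named table.

    `table` is a hypothetical table name: "ascii" or "unicode".
    """
    if not name.isidentifier():
        return False
    if table == "unicode":
        # Anything Python itself accepts as an identifier.
        return True
    if table == "ascii":
        # Same rule, further restricted to ASCII characters.
        return name.isascii()
    raise ValueError("unknown table: %r" % (table,))


def rejected_identifiers(source, table):
    """List the NAME tokens in `source` that the table does not allow."""
    readline = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.NAME and not allowed(tok.string, table)
    ]


if __name__ == "__main__":
    src = "caf\u00e9 = 1\nx = caf\u00e9 + 1\n"
    print(rejected_identifiers(src, "ascii"))
    print(rejected_identifiers(src, "unicode"))
```

Under these assumptions the "unicode" table never rejects anything the
compiler would accept, so specifying it is equivalent to not checking
at all, while "ascii" flags every non-ASCII name.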

 > >     - It provides the flexibility for different communities to
 > >       to adopt identifier conventions that suit their preferred
 > >       tradeoff of risk vs. expressiveness.
 > Also a negative. Now, if I want to run the modules from multiple  
 > communities I need to figure out how to merge the tables they have to  
 > separately distribute with their modules.

No, you just use -U unicode.

 > a) you trust that the author of the file has authored it correctly,  
 > in which case it doesn't matter one bit what character set they used.  

Which is why 9 out of 10 American viruses recommend Internet Explorer
5 or below.  Because most users *do* trust authors and other
purveyors, including porn sites, etc.

This may be *much less* true of Python users, but I think most
domestic offices of most American corporations would be quite happy to
disable Unicode identifier support at compile time.

 > Restricting the charset at import time is just something to get in  
 > your way with no actual value.

So don't do it; use -U unicode.  I bet Jim J and Josiah and Ka-Ping
will all explicitly use -U ascii, just to make sure.  What's wrong
with that, if that's what they want?

 > Adding baroque command line options for users of other languages to  
 > do some useless verification at import time is not an acceptable  
 > answer. It'd be better to just reject the PEP entirely.

Speaking of exaggeration ....
