[Python-3000] Support for PEP 3131
James Y Knight
foom at fuhm.net
Thu May 24 23:47:45 CEST 2007
On May 24, 2007, at 5:04 PM, Ka-Ping Yee wrote:
>> (1) By default, python allows only ASCII.
>> (2) Additional characters are permitted if they appear in a table
>> named on the command line.
> +1! This is a fine solution. It is better than the "python -U"
> option I proposed -- it has all the advantages of that proposal, plus:
> - The identifier character set won't spontaneously change when
> one upgrades to a new version of Python, even for users of
> non-ASCII identifiers.
FUD. It already won't: Unicode explicitly makes that promise. They can
add characters, but never remove them.
> - Having to specify the table of acceptable characters
> demonstrates at least some knowledge of the character set
> one is using.
This is a negative. Why should I have to show knowledge of the
character set I'm using to type the characters?
> - It provides the flexibility for different communities to
> adopt identifier conventions that suit their preferred
> tradeoff of risk vs. expressiveness.
Also a negative. Now, if I want to run the modules from multiple
communities I need to figure out how to merge the tables they have to
separately distribute with their modules.
> Jim's proposal appears to be the best path to making everyone happy.
Nope. It does nobody any good. It may make people who fear non-ASCII
code happy, but only because it totally castrates this feature for
people who do want to use non-ASCII identifiers.
It really seems to me people are spewing a lot of FUD here. Rejecting
certain characters when loading a file is simply not necessary.
a) you trust that the author of the file has authored it correctly,
in which case it doesn't matter one bit what character set they used.
Restricting the charset at import time is just something to get in
your way with no actual value.
b) you don't trust the code, and want to inspect it.
Okay, in this case you actually have to inspect the *code* --
checking the character set is an utterly useless thing to do by
itself. It tells you nothing useful.
While checking the code, you may want to have strange characters
outside your comfort range flagged for you. Either grep or editor
support is a simple enough solution for this. Or, let's say your
editor is unable to highlight suspicious characters, and you want to
find identifiers with strange characters, and not get tripped up on
comments. Fine, make a tool that uses the compiler.parser module to
iterate over identifiers in the source code.
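
A minimal sketch of such a tool, under stated assumptions: it targets
Python 3.7+ and uses the standard tokenize module rather than the
compiler.parser route named above, and the function name
non_ascii_identifiers is hypothetical. It reports identifiers that contain
non-ASCII characters and ignores comments and string literals.

```python
import io
import tokenize


def non_ascii_identifiers(source):
    """Yield (line, column, name) for each identifier with non-ASCII characters.

    `source` is the text of a Python module. This is an illustrative helper,
    not part of any existing tool.
    """
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        # NAME tokens cover identifiers and keywords. Comments and string
        # literals arrive as COMMENT and STRING tokens, so they never match.
        if tok.type == tokenize.NAME and not tok.string.isascii():
            yield tok.start[0], tok.start[1], tok.string


if __name__ == "__main__":
    # The comment and the string literal contain non-ASCII text too,
    # but only the identifier on line 2 should be reported.
    sample = "# caf\u00e9 in a comment\nna\u00efve = 1\nplain = 'se\u00f1or'\n"
    for line, col, name in non_ascii_identifiers(sample):
        print(f"{line}:{col}: {name!r}")
```

Run over a file's contents, this lists each suspicious identifier with its
position, which is the kind of flagging a reviewer might want while reading
untrusted code. It is an inspection aid only and does not reject anything
at import time.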
Adding baroque command line options for users of other languages to
do some useless verification at import time is not an acceptable
answer. It'd be better to just reject the PEP entirely.