[Python-3000] Support for PEP 3131

Guido van Rossum guido at python.org
Fri May 25 05:09:33 CEST 2007


On 5/24/07, Ka-Ping Yee <python at zesty.ca> wrote:
> To pit this as "ascii lovers vs. non-ascii lovers" is a pretty large
> oversimplification.  You could name them "people who want to be able
> to know what the code says" and "people who don't mind not being able
> to know what the code says".  Or you could name them "people who want
> Python's lexical syntax to be something they fully understand" and
> "people who don't mind the extra complexity".  Or "people who don't
> want Python's lexical syntax to be tied to a changing external
> standard" and "people who don't mind the extra variability."
>
> However you characterize them, keep in mind that those in the former
> group are asking for default behaviour that 100% of Python users
> already use and understand.  There's no cost to keeping identifiers
> ASCII-only because that's what Python already does.
>
> I think that's a pretty strong reason for making the new, more complex
> behaviour optional.

If there's a security argument to be made for restricting the alphabet
used by code contributions (even by co-workers at the same company), I
don't see why ASCII-only projects should have it easier than projects
in other cultures.

It doesn't look like any kind of global flag passed to the interpreter
would scale -- once I am using a known trusted contribution that uses
a different character set than mine, I would have to change the global
setting to be more lenient, and the leniency would affect all code I'm
using.

A more useful approach would seem to be a set of auditing tools that
can be applied routinely to all new contributions (e.g. as a
pre-commit hook when using a source control system), or to all code in
a given directory, download, etc. I don't see this as all that
different from using e.g. PyChecker or PyLint.
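
A minimal sketch of what such an auditing tool might look like, assuming
a per-project policy expressed as a set of allowed scripts. The names
char_script, audit_identifiers and allowed_scripts are hypothetical, and
the "script" here is only a heuristic (the first word of the Unicode
character name), since the standard library has no direct script
property. Because the policy is an argument rather than an interpreter
flag, a Cyrillic-only project can be checked as easily as an ASCII-only
one.

```python
import io
import tokenize
import unicodedata


def char_script(ch):
    """Rough script guess: first word of the Unicode name (a heuristic only)."""
    if ch.isascii():
        return "ASCII"
    try:
        return unicodedata.name(ch).split()[0]
    except ValueError:
        # Character has no name in the Unicode database.
        return "UNKNOWN"


def audit_identifiers(source, allowed_scripts=frozenset({"ASCII"})):
    """Return (line, column, identifier, script) for names outside the policy.

    Keywords are NAME tokens too, but they are plain ASCII, so they only
    show up if "ASCII" is left out of allowed_scripts.
    """
    findings = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.NAME:
            continue
        for ch in tok.string:
            script = char_script(ch)
            if script not in allowed_scripts:
                # Report each offending identifier once, at its start position.
                findings.append((tok.start[0], tok.start[1], tok.string, script))
                break
    return findings
```

A pre-commit hook could call audit_identifiers on each changed file and
reject the commit when the returned list is non-empty. Note that
tokenize raises an error on source it cannot tokenize, so a real hook
would have to decide how to treat such files.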

While I routinely perform visual code inspections (code review is the
law at Google, and I wrote the tool used internally to do these), I
certainly don't see this as a security audit -- I use it as a
mentoring activity and to reach agreement over issues as diverse as
coding style, architecture and implementation techniques between
trusting colleagues. Scanning for stray non-ASCII characters is best
left to automated tools.
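
As a rough illustration of how little such a tool needs, here is a
character-level scan, assuming the file has already been decoded to
text. The function name find_non_ascii is hypothetical. Unlike a
token-based check it also reports characters in comments and string
literals, and it works on files that do not parse.

```python
import unicodedata


def find_non_ascii(text):
    """Yield (line, column, character, unicode_name) for each non-ASCII character.

    Lines are 1-based and columns 0-based. Lines are split on "\n" only,
    so unusual separators such as U+2028 are reported, not swallowed.
    """
    for lineno, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line):
            if ord(ch) > 127:
                yield lineno, col, ch, unicodedata.name(ch, "UNNAMED")
```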

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

