[Python-3000] Support for PEP 3131

Sun Jun 10 21:40:08 CEST 2007

On 6/10/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > Many of us value a *predictable* identifier character set.
> > Whether "predictable" means ASCII only, or user-selectable, or
> > restricted by default, I think we all agree in this sentiment:

> Indeed, PEP 3131 gives a predictable identifier character set.
> Adding per-site options to change the set of allowable characters
> makes it less predictable.

Not in practice.

Today, identifiers are drawn from [A-Za-z0-9_], which is a fairly small set.

Under the current PEP 3131 proposal, they will be drawn from a much
larger set.  There won't normally be many more letters actually used
in any given program, but there will be many more that are possible
(with very low probability).  Unfortunately, some of these are
visually identical.  (Even the modified XID rules don't get rid of
confusables; the Unicode Consortium is very unwilling to rule out
anything which might theoretically be needed for valid reasons.)  Many
more are visually indistinguishable in practice, simply because the
reader hasn't seen them before.  While Unicode is still a finite set,
it is much larger than ASCII.
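
To make the "visually identical" point concrete, here is a small
sketch.  It assumes an interpreter that already implements PEP 3131
(i.e. Python 3, where str.isidentifier() exists); the particular pair
of letters is just one example I picked, not something from the PEP.

```python
import unicodedata

latin = "a"          # U+0061 LATIN SMALL LETTER A
cyrillic = "\u0430"  # U+0430 CYRILLIC SMALL LETTER A

# Both are legal one-letter identifiers, and they usually render alike.
both_legal = latin.isidentifier() and cyrillic.isidentifier()

# They are still different code points, so they name different variables.
same_name = latin == cyrillic

# NFKC normalization (which PEP 3131 applies to identifiers) does not
# fold one into the other.
folded = unicodedata.normalize("NFKC", cyrillic) == latin

print(unicodedata.name(latin))
print(unicodedata.name(cyrillic))
print(both_legal, same_name, folded)
```

A reader who sees only the rendered glyphs cannot tell which of the
two a given identifier uses.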

By allowing site modifications, the rule becomes:

It will use ASCII.

Local code can also use local characters.

There are potential exceptions for code that gets shared beyond local
groups without ASCIIfication, but this is a strict subset of the
"unreadable" code allowed under "anything-goes".  Distribution without
ASCIIfication is discouraged (by the extra decision required at
installation time), users have explicit notice (by accepting it at
install time), and the expanded charset is still a tiny fraction of
what PEP 3131 currently proposes (you can accept French characters
without accepting Chinese ideographs).
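
A rough sketch of what such a site rule could look like, purely as an
illustration.  The names allowed_by_site and FRENCH_EXTRAS are
hypothetical (nothing like them is in the PEP or in Python), and the
French letter list is deliberately incomplete.

```python
import string

# The baseline everyone gets: today's ASCII identifier characters.
ASCII_IDENT = frozenset(string.ascii_letters + string.digits + "_")

# Hypothetical per-site addition: a few accented letters used in French.
FRENCH_EXTRAS = frozenset("àâçéèêëîïôùûü")


def allowed_by_site(identifier, extra_chars=frozenset()):
    """Return True if identifier is legal and uses only ASCII identifier
    characters plus whatever the site has explicitly accepted."""
    if not identifier.isidentifier():
        return False
    return all(ch in ASCII_IDENT or ch in extra_chars for ch in identifier)


print(allowed_by_site("spam_1"))                # plain ASCII
print(allowed_by_site("café"))                  # no site extras accepted
print(allowed_by_site("café", FRENCH_EXTRAS))   # site accepted French
print(allowed_by_site("中文", FRENCH_EXTRAS))    # ideographs not accepted
```

The default stays ASCII; anything beyond that is a short, explicit
list that somebody at the site chose to accept.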

-jJ