[Python-ideas] Allow additional separator character in variables

Stephan Houben stephanh42 at gmail.com
Tue Nov 21 07:50:03 EST 2017


2017-11-21 12:55 GMT+01:00 Stephen J. Turnbull <turnbull.stephen.fw at u.tsukuba.ac.jp>:

> Personally, I think that Python probably should ban non-ASCII
> non-letter characters in identifiers and whitespace, and maybe add
> them later in response to requests from native speakers of the
> relevant languages.


That would be quite a backward-incompatible change since
such identifiers have been legal since Python 3.0.


> I don't know how easy that would be to do,
> though, since I think the rule is already that identifiers must be
> composed only of letters, numbers, and ASCII "_".


See:
https://www.python.org/dev/peps/pep-3131/

The identifier syntax is <XID_Start> <XID_Continue>*.

ID_Start is defined as all characters having one of the general categories
uppercase letters (Lu), lowercase letters (Ll), titlecase letters (Lt),
modifier letters (Lm), other letters (Lo), letter numbers (Nl), the
underscore, and characters carrying the Other_ID_Start property. XID_Start
then closes this set under normalization, by removing all characters whose
NFKC normalization is not of the form ID_Start ID_Continue* anymore.

ID_Continue is defined as all characters in ID_Start, plus nonspacing marks
(Mn), spacing combining marks (Mc), decimal number (Nd), connector
punctuations (Pc), and characters carrying the Other_ID_Continue property.
Again, XID_Continue closes this set under NFKC-normalization; it also adds
U+00B7 to support Catalan.
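
As a rough illustration of the rules quoted above, here is a small sketch
that reports, for a few sample characters, the Unicode general category and
whether Python accepts the character at the start or in the rest of an
identifier. The helper name describe() and the choice of samples are mine,
not from the PEP; it relies only on str.isidentifier() and
unicodedata.category().

```python
import unicodedata


def describe(ch):
    """Return (general category, ok as first char, ok as later char)."""
    return (
        unicodedata.category(ch),
        ch.isidentifier(),          # valid as a one-character identifier?
        ("a" + ch).isidentifier(),  # valid after an ordinary start character?
    )


# Hypothetical sample set: a letter, underscore, digit, accented letter,
# middle dot (U+00B7), undertie (U+203F), hyphen, combining acute (U+0301).
samples = ["a", "_", "9", "\u00e9", "\u00b7", "\u203f", "-", "\u0301"]

for ch in samples:
    category, can_start, can_continue = describe(ch)
    print(f"U+{ord(ch):04X} {category} start={can_start} continue={can_continue}")
```

Note that the undertie (category Pc) and the middle dot are accepted only
after the first character, while the ASCII hyphen is rejected in both
positions.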

> Since Serhiy's
> examples are valid, we'd have to rule them out explicitly, rather than
> by reference to the Unicode database.  Yuck.

If we take this thinking to its logical extreme we should ban ASCII 1 and l
since they can be confused. Also 0 and O.

Realistically, this is extremely unlikely to be an issue in practice.
If you have people making such malignant code changes
with checkin permission, you have bigger problems...

Anyway, you can have your linter enforce ASCII or whatever
character subset you deem safe.
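
For what it's worth, such a check does not need much code. The following is
only a minimal sketch of the idea, not an existing linter: the function name
non_ascii_names() is made up for this example, and it assumes the source
tokenizes cleanly with the standard tokenize module. It flags identifiers
only, so non-ASCII text in strings and comments is left alone.

```python
import io
import tokenize


def non_ascii_names(source):
    """Return (line, column, name) for each identifier with non-ASCII chars."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # NAME tokens cover identifiers and keywords; keywords are ASCII.
        if tok.type == tokenize.NAME and any(ord(c) > 127 for c in tok.string):
            found.append((tok.start[0], tok.start[1], tok.string))
    return found


example = "caf\u00e9 = 1\nprint(caf\u00e9)\nx = 'na\u00efve'  # string, not a name\n"
for line, col, name in non_ascii_names(example):
    print(f"{line}:{col}: non-ASCII identifier {name!r}")
```

Swapping the ord() test for membership in some other allowed set would
enforce whatever subset your project prefers.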

Stephan