[Python-ideas] Verbatim names (allowing keywords as names)

Fri May 18 10:31:49 EDT 2018

2018-05-18 15:37 GMT+02:00 Steven D'Aprano <steve at pearwood.info>:

>
> Earlier you described this suggestion as "a silly joke".
>
> https://mail.python.org/pipermail/python-ideas/2018-May/050861.html

The joke proposal was to write all keywords in Python in a bold font
variant, as a reaction to a similar joke proposal to precede all keywords
in Python with \.

In contrast this isn't even a proposal, it is merely a description of
an existing feature.

Practically speaking, suppose "spam" becomes a keyword in 3.8, and I
have a module which I want to make compatible with 3.8 while preserving
its API for pre-3.8 versions. I would first update my module to use some
alternative spelling spam_ throughout, and then, in a single place, write:

# exploit NFKC normalization to set identifier "spam" for backward compatibility
𝐬𝐩𝐚𝐦 = spam_
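A minimal sketch of that shim follows. Since "spam" is not actually a keyword, it uses the real keyword "if" as a stand-in; the helper bold() and the name if_ are made up for illustration, and the sketch assumes the CPython behavior described in this message (the keyword check happens before NFKC normalization of the identifier).

```python
import unicodedata


def bold(word):
    """Map ASCII lowercase letters to MATHEMATICAL BOLD SMALL letters.

    Hypothetical helper: U+1D41A is MATHEMATICAL BOLD SMALL A, and the
    rest of the lowercase alphabet follows contiguously.
    """
    return "".join(chr(0x1D41A + ord(c) - ord("a")) for c in word)


# Module source as it might look after the rename: the API object now lives
# under if_, and a single line re-exports it under the old name.
source = "if_ = 'legacy API object'\n" + bold("if") + " = if_\n"

namespace = {}
exec(source, namespace)

# The bold spelling is not the keyword token, but NFKC folds it to "if".
folded = unicodedata.normalize("NFKC", bold("if"))
print(folded, "->", namespace[folded])
```

Callers on older versions can keep looking the object up under its old name, while the module body itself never spells the keyword in ASCII.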

Even if this single line shows up as mojibake in somebody's editor, it
shouldn't inconvenience them too much.

> I think you were right then.
>
>
> > I am merely defending the status quo.
> > I demonstrate how the intended behavior can be achieved using features
> > available in current Python versions.
>
> Aside from the fact that font, editor and keyboard support for such
> non-BMP Unicode characters is very spotty, it isn't the intended
> behaviour.
>

I am not sure what leads you to conclude that.

There seem to be three design possibilities here:
1.  𝐢𝐟 is an alternative spelling for the keyword if
2.  𝐢𝐟 is an identifier
3.  𝐢𝐟 is an error

I am pretty sure option 1 (non-ASCII spelling of keywords) was not intended
(the docs say about keywords: "They must be spelled exactly as written here:").

So it is either 2 or 3.  Option 3 would only make sense if we concluded
that it is a bad idea to have an identifier with the same name as a keyword,
whereas this whole thread so far has been about introducing such a feature.

So that leaves 2, which happens to be the implemented behavior.
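One way to check which of the three options an interpreter implements is to parse both spellings and look at the result. This is only a sketch: classify() is a name invented here, and the outcome reflects the CPython behavior I describe, not a guarantee for every implementation.

```python
import ast

# MATHEMATICAL BOLD SMALL I followed by MATHEMATICAL BOLD SMALL F
BOLD_IF = "\U0001D422\U0001D41F"


def classify(source):
    """Return 'error' if the assignment does not parse, else the target name."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return "error"
    target = tree.body[0].targets[0]
    return "identifier " + repr(target.id)


plain = classify("if = 1")           # ASCII spelling: the keyword
fancy = classify(BOLD_IF + " = 1")   # bold spelling: options 1, 2 or 3?

print("if      :", plain)
print("bold if :", fancy)
```

Under option 1 or option 3 the bold assignment would be rejected as well; under option 2 it parses and the assignment target is the normalized name.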

As an aside, a general observation about PEP 3131 and Unicode identifiers
in Python: from the PEP it becomes clear that there were several proposals
to make them more restricted (e.g. requiring source code to already be in
NFKC normal form, which would make 𝐢𝐟 illegal, disallowing confusables,
etc.)

Ultimately these were rejected, and the result is that we have a rather
liberal definition of Unicode identifiers in Python. I feel that 𝐢𝐟 being
a valid identifier fits into that pattern, just as various confusable
spellings of if would be legal identifiers. In theory this could lead to
all kinds of sneaky attacks where code appears to do one thing but does
another, but it just doesn't seem to be so big an issue in practice.
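To make the distinction between the two cases concrete, here is a small sketch comparing the bold spelling with one confusable spelling. The choice of the Cyrillic letter U+0456 as the lookalike is mine, purely for illustration; unlike the bold letters it has no compatibility decomposition, so NFKC leaves it alone.

```python
import unicodedata

BOLD_IF = "\U0001D422\U0001D41F"   # mathematical bold "if"
CYRILLIC_IF = "\u0456f"            # Cyrillic BYELORUSSIAN-UKRAINIAN I + Latin f


def nfkc(text):
    return unicodedata.normalize("NFKC", text)


namespace = {}
exec(
    CYRILLIC_IF + " = 'lookalike'\n"
    + BOLD_IF + " = 'folded'\n",
    namespace,
)

# The bold spelling ends up under the ASCII name "if"; the Cyrillic
# lookalike stays a separate identifier that merely resembles it.
names = sorted(k for k in namespace if k != "__builtins__")
print([ascii(n) for n in names])
```

Both assignments are accepted, but they bind two different names, which is exactly the kind of situation the confusables proposals were worried about.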

> As you point out, the intended behaviour is that obj.𝐢𝐟 and
> obj.if ought to be identical. Since the latter is a syntax error, so
> should be the former.
>

NFKC normalization is restricted to identifiers.
Keywords "must be spelled exactly as written here."
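The attribute case can be sketched the same way. The compiles() helper and the use of types.SimpleNamespace are illustrative choices of mine, and again this assumes the current CPython behavior rather than anything the language reference promises in so many words.

```python
import keyword
import types
import unicodedata

BOLD_IF = "\U0001D422\U0001D41F"   # mathematical bold "if"


def compiles(source):
    """Return True if the source compiles, False on SyntaxError."""
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError:
        return False
    return True


obj = types.SimpleNamespace()
exec("obj." + BOLD_IF + " = 'set via bold spelling'", {"obj": obj})

# The attribute is stored under the normalized name, so it is reachable
# with getattr() even though "obj.if" itself is not valid syntax.
print(getattr(obj, "if"))
print(compiles("obj.if"), compiles("obj." + BOLD_IF))

# The keyword list only knows the exact ASCII spelling.
print(keyword.iskeyword(BOLD_IF),
      keyword.iskeyword(unicodedata.normalize("NFKC", BOLD_IF)))
```

So obj.𝐢𝐟 and getattr(obj, "if") refer to the same attribute, while obj.if is rejected because "if" there is the keyword, spelled exactly as written.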

>
>
> > It is guaranteed to work by PEP-3131:
> > https://www.python.org/dev/peps/pep-3131
> >
> > "All identifiers are converted into the normal form NFKC while parsing;
> > comparison of identifiers is based on NFKC."
> >
> > NFKC normalization means spam must be considered the same identifier as
> > 𝐬𝐩𝐚𝐦 .
>
>
> It's not the NFKC normalization that I'm questioning. It's the fact that
> it is done too late to catch the use of a keyword.
>
>
See above.

Stephan

>
> --
> Steve
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>