<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">2018-05-18 15:37 GMT+02:00 Steven D'Aprano <span dir="ltr">&lt;<a href="mailto:steve@pearwood.info" target="_blank">steve@pearwood.info</a>&gt;</span>:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class="gmail-"><br>

</span>Earlier you described this suggestion as "a silly joke".<br>

<br>

<a href="https://mail.python.org/pipermail/python-ideas/2018-May/050861.html" rel="noreferrer" target="_blank">https://mail.python.org/<wbr>pipermail/python-ideas/2018-<wbr>May/050861.html</a></blockquote><div><br><br></div><div>The joke proposal was to write all keywords in Python in a bold font variant,<br></div><div>as a reaction to a similar joke proposal to precede all keywords in Python with  \.<br><br></div><div>In contrast, this isn't even a proposal; it is merely a description of<br></div><div>an existing feature.<br><br></div><div>Practically speaking, suppose "spam" becomes a keyword in 3.8, and I<br></div><div>have a module which I want to make compatible with 3.8 while also<br></div><div>preserving the API for pre-3.8 versions. I would first update my module<br></div><div>to use some alternative spelling spam_ throughout, and then, in a single place,<br></div><div>write:<br><br><span class="gmail-">𝐬𝐩𝐚𝐦</span> = spam_  # exploit NFKC normalization to set identifier "spam" for backward compatibility<br><br></div><div>Even if this single line shows up as mojibake in somebody's editor, it shouldn't inconvenience them too much.<br></div><div><br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

I think you were right then.<br>

<span class="gmail-"><br>

<br>

> I am merely defending the status quo.<br>

> I demonstrate how the intended behavior can be achieved using features<br>

> available in current Python versions.<br>

<br>

</span>Aside from the fact that font, editor and keyboard support for such <br>

non-BMP Unicode characters is very spotty, it isn't the intended <br>

behaviour.<br></blockquote><div><br></div><div>I am not sure what you base that conclusion on.<br><br></div><div>There seem to be three design possibilities here:<br>1.  𝐢𝐟 is an alternative spelling for the keyword if<br>2.  𝐢𝐟 is an identifier<br></div><div>3.  𝐢𝐟 is an error<br><br></div><div>I am pretty sure option 1 (non-ASCII spelling of keywords) was not intended<br></div><div>(the documentation says about keywords: "They must be spelled

exactly as written here:")<br><br></div><div>So it is either 2 or 3.  Option 3 would only make sense if we conclude that it is<br></div><div>a bad idea to have an identifier with the same name as a keyword.<br></div><div>Yet this whole thread so far has been about introducing exactly such a feature.<br><br></div><div>So that leaves 2, which happens to be the implemented behavior.<br></div><div><br>As an aside:<br>A general observation about PEP 3131 and Unicode identifiers in Python:<br></div><div>from the PEP it becomes clear that there have been several proposals<br></div><div>to make it more restrictive (e.g. requiring source code to be already in<br></div><div>NFKC normal form, which would make 𝐢𝐟 illegal, disallowing confusables,<br></div><div>etc.)<br></div><div><br></div><div>Ultimately these were rejected, and the result is that we have a rather liberal<br></div><div>definition of Unicode identifiers in Python. I feel that 𝐢𝐟  being a valid<br></div><div>identifier fits into that pattern, just as various confusable spellings of if<br></div><div>would be legal identifiers. In theory this could lead to all kinds of<br></div><div>sneaky attacks where code appears to do one thing but does another,<br></div><div>but it just does not seem to be a big issue in practice.<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

As you point out, the intended behaviour is that obj.𝐢𝐟 and <br>

obj.if ought to be identical. Since the latter is a syntax error, so <br>

should be the former.<br></blockquote><div><br></div><div>NFKC normalization is restricted to identifiers. <br>Keywords "must be spelled

exactly as written here."<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<span class="gmail-"><br>

<br>

> It is guaranteed to work by PEP-3131:<br>

> <a href="https://www.python.org/dev/peps/pep-3131" rel="noreferrer" target="_blank">https://www.python.org/dev/<wbr>peps/pep-3131</a><br>

> <br>

> "All identifiers are converted into the normal form NFKC while parsing;<br>

> comparison of identifiers is based on NFKC."<br>

> <br>

> NFKC normalization means spam must be considered the same identifier as<br>

> 𝐬𝐩𝐚𝐦 .<br>

<br>

<br>

</span>It's not the NFKC normalization that I'm questioning. It's the fact that <br>

it is done too late to catch the use of a keyword.<br>

<div class="gmail-HOEnZb"><div class="gmail-h5"><br></div></div></blockquote><div><br></div><div>See above.<br><br></div><div>Stephan<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="gmail-HOEnZb"><div class="gmail-h5">

<br>

-- <br>

Steve<br>


</div></div></blockquote></div><br></div></div>
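<div>A minimal sketch of the two behaviours discussed in this message, assuming a CPython 3 interpreter. The helper names bold and classify are hypothetical and exist only for this illustration. The first part relies on the PEP 3131 rule that identifiers are converted to NFKC while parsing. The second part only reports how the running interpreter treats a bold spelling of a keyword; it does not assume which of the design options applies.<br></div>

```python
import keyword
import unicodedata


def bold(word):
    """Hypothetical helper: map ASCII a-z to MATHEMATICAL BOLD SMALL A-Z."""
    return "".join(chr(0x1D41A + ord(c) - ord("a")) for c in word)


# Part 1: the backward-compatibility alias from the message.
# The source text binds the bold spelling; the parser normalizes the
# identifier to NFKC, so the name stored in the namespace is "spam".
source = "spam_ = 'eggs'\n" + bold("spam") + " = spam_\n"
namespace = {}
exec(source, namespace)
alias_names = sorted(name for name in namespace if not name.startswith("__"))
print(alias_names)  # ['spam', 'spam_']

# Part 2: how the bold spelling of a keyword is classified.
fancy_if = bold("if")
normalized_if = unicodedata.normalize("NFKC", fancy_if)
print(keyword.iskeyword(fancy_if))       # False: not spelled exactly as "if"
print(keyword.iskeyword(normalized_if))  # True: NFKC yields the ASCII "if"
print(fancy_if.isidentifier())           # True: valid identifier characters


def classify(snippet):
    """Report whether this interpreter compiles the snippet or rejects it."""
    try:
        compile(snippet, "sketch", "exec")
    except SyntaxError:
        return "syntax error"
    return "accepted"


# Option 2 versus option 3: the result depends on the interpreter in use.
print(classify(fancy_if + " = 1"))
```

<div>The last line is deliberately left without an expected result, since whether the bold spelling is accepted as an identifier or rejected is exactly the point under debate and may differ between Python versions.<br></div>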