<div class="gmail_quote">On 24 September 2012 03:42, Terry Reedy <span dir="ltr"><<a href="mailto:tjreedy@udel.edu" target="_blank">tjreedy@udel.edu</a>></span> wrote:<br>
<blockquote style="BORDER-LEFT:#ccc 1px solid;MARGIN:0px 0px 0px 0.8ex;PADDING-LEFT:1ex" class="gmail_quote">
<div class="im">On 9/23/2012 6:57 PM, Ian Kelly wrote:<br>
<blockquote style="BORDER-LEFT:#ccc 1px solid;MARGIN:0px 0px 0px 0.8ex;PADDING-LEFT:1ex" class="gmail_quote">On Sun, Sep 23, 2012 at 4:24 PM, Joshua Landau<br><<a href="mailto:joshua.landau.ws@gmail.com" target="_blank">joshua.landau.ws@gmail.com</a>> wrote:<br>
<blockquote style="BORDER-LEFT:#ccc 1px solid;MARGIN:0px 0px 0px 0.8ex;PADDING-LEFT:1ex" class="gmail_quote">The docs describe identifiers as having this grammar:<br><br>identifier ::= xid_start xid_continue*<br>id_start ::= <all characters in general categories Lu, Ll, Lt, Lm, Lo,<br>
Nl, the underscore, and characters with the Other_ID_Start property><br>id_continue ::= <all characters in id_start, plus characters in the<br>categories Mn, Mc, Nd, Pc and others with the Other_ID_Continue property><br>
xid_start ::= <all characters in id_start whose NFKC normalization is in<br>"id_start xid_continue*"><br></blockquote></blockquote><br></div>xid_start is a subset of id_start.
<div class="im"><br><br>
<blockquote style="BORDER-LEFT:#ccc 1px solid;MARGIN:0px 0px 0px 0.8ex;PADDING-LEFT:1ex" class="gmail_quote">
<blockquote style="BORDER-LEFT:#ccc 1px solid;MARGIN:0px 0px 0px 0.8ex;PADDING-LEFT:1ex" class="gmail_quote">xid_continue ::= <all characters in id_continue whose NFKC normalization is<br>in "id_continue*"><br>
</blockquote></blockquote><br></div>xid_continue is a subset of id_continue.
<div class="im"><br><br>
<blockquote style="BORDER-LEFT:#ccc 1px solid;MARGIN:0px 0px 0px 0.8ex;PADDING-LEFT:1ex" class="gmail_quote">
<blockquote style="BORDER-LEFT:#ccc 1px solid;MARGIN:0px 0px 0px 0.8ex;PADDING-LEFT:1ex" class="gmail_quote">So I would assume that<br> exec("a{} = None".format(char))<br>would be valid if<br> unicodedata.normalize("NFKC", char) == "1"<br>
</blockquote></blockquote><br></div>Read more carefully the definition of xid_continue. The un-normalized character must also be in id_continue.<br></blockquote>
<div>Correct. Thank you for your time.</div>
<div> </div>
<blockquote style="BORDER-LEFT:#ccc 1px solid;MARGIN:0px 0px 0px 0.8ex;PADDING-LEFT:1ex" class="gmail_quote">
<div class="im">
<blockquote style="BORDER-LEFT:#ccc 1px solid;MARGIN:0px 0px 0px 0.8ex;PADDING-LEFT:1ex" class="gmail_quote">
<blockquote style="BORDER-LEFT:#ccc 1px solid;MARGIN:0px 0px 0px 0.8ex;PADDING-LEFT:1ex" class="gmail_quote">as<br> exec("a1 = None")<br>is valid.<br><br>BUT "a¹ = None" is not valid*.<br></blockquote>
</blockquote><br></div>>>> import unicodedata as ud<br>>>> ud.category("\u00b9")<br>'No'<br><br>Category No is *not* in id_continue, and therefore not in xid_continue.
<div class="im"><br><br>
<blockquote style="BORDER-LEFT:#ccc 1px solid;MARGIN:0px 0px 0px 0.8ex;PADDING-LEFT:1ex" class="gmail_quote">exec("x\u00b9 = None") # U+00B9 is superscript 1<br><br>On the other hand, this does work:<br><br>exec("x\u2071 = None") # U+2071 is superscript i<br>
<br>So it seems to be only an issue with superscript and subscript digits.<br> Looks like a compiler bug to me.<br></blockquote><br></div>The problem, if there were one, would be in the tokenizer that finds identifiers. However,
<div class="im"><br><br>>>> exec("x\u00b9 = None")<br></div>...<br> x¹ = None<br> ^<br>SyntaxError: invalid character in identifier<br><br>this is correct.</blockquote>
<div> </div>
<div>Thank you both for helping. The bug is officially closed.</div></div>
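<div>For anyone reading this thread later, the point above can be checked with a minimal sketch. It assumes Python 3 and uses only <code>unicodedata</code> and <code>str.isidentifier()</code>; the helper name <code>describe</code> is made up for illustration and does not come from the thread. It shows that a character whose NFKC normalization is a digit can still be rejected, because the un-normalized character must itself be in id_continue.</div>

```python
import unicodedata


def describe(char):
    """Summarize the facts that decide whether `char` may continue an identifier.

    `describe` is a hypothetical helper for illustration only.
    """
    return {
        # General category of the un-normalized character.
        "category": unicodedata.category(char),
        # What the character becomes under NFKC normalization.
        "nfkc": unicodedata.normalize("NFKC", char),
        # Whether "x" followed by the character is accepted as an identifier.
        "valid_continue": ("x" + char).isidentifier(),
    }


sup_one = describe("\u00b9")  # SUPERSCRIPT ONE
sup_i = describe("\u2071")    # SUPERSCRIPT LATIN SMALL LETTER I

print(sup_one)
print(sup_i)
```

<div>Superscript one normalizes to "1" but has category No, so it is rejected; superscript i has category Lm, which is in id_start, so it is accepted.</div>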