On 5/3/2014 12:50 AM, Steven D'Aprano wrote:
On Fri, May 02, 2014 at 10:27:56PM -0400, Terry Reedy wrote:
If the rules for identifiers are expanded, any code that uses newly allowed names cannot be backported or run on previous versions. If contracted, the opposite problem occurs. I do not think they should be changed either way without a strong cause.
That applies to any new feature -- code using that feature cannot be easily backported. In this case, it's actually quite simple to backport code using the new rules for identifiers: just change the identifiers. The algorithm used by the code remains the same.
It appears that I consider lexicography more 'fundamental' in some sense than you do. But let's skip over this.
From 2.3. Identifiers and keywords: "The syntax of identifiers in Python is based on the Unicode standard annex UAX-31, with elaboration and changes as defined below; see also PEP 3131 for further details."
Without reading the annex, I cannot tell which part of the 'below' actually defines a 'change', as opposed to an 'elaboration' (explanation). I have no idea whether the unknown changes are additions, deletions, or merely selections of options.
In other words, we use the standard with a few intentional modifications.
Playing Devil's Advocate, perhaps we could add a few more intentional modifications.
Or perhaps not, depending on what the modifications actually are and what the reasons were.
While there are advantages to following a standard just for the sake of following a standard, once you allow any changes, you're no longer following the standard. So the argument becomes, why should we allow that change but not this change?
Nick recently argued, very similarly, that having restored string 'u' prefixes was a reason to restore dict.iterxyz methods. You agreed with me that there were good reasons why B did not follow from A.
To properly compare current and proposed changes, we must know the current 'modifications and changes', their reasons and effects, and the proposed changes and their reasons (any real parallels) and likely effects. If you were to do the research, I would be willing to discuss.
Particularly for mathematically-focused code, I think it would be useful to be able to use identifiers like (say) σ² for variance, g₁ for sample skewness, or β₂ for Pearson's skewness, to give a few real-world examples. Regular digits may be ambiguous: compare s₁² for the sample variance with Bessel's correction, versus s12. (s twelve?)
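For reference, here is a small sketch of how current Python 3 treats the names above, using only str.isidentifier() and unicodedata.normalize(). The helper name check() is made up for illustration. One caveat worth keeping in mind: PEP 3131 has the parser convert identifiers to NFKC, and NFKC folds superscript and subscript digits to plain digits, so s₁² would collapse to s12 unless normalization were changed as well.

```python
import unicodedata

def check(name):
    """Return (is a valid identifier today, NFKC-normalized form)."""
    return name.isidentifier(), unicodedata.normalize("NFKC", name)

# Names taken from the examples in this thread.
for name in ["σ", "σ²", "g₁", "β₂", "s₁²", "s12"]:
    valid, folded = check(name)
    print(f"{name!r}: valid={valid}, NFKC={folded!r}")
```

Under these rules only σ and s12 are accepted today; the others are rejected because superscript and subscript digits are not identifier characters.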
I agree that there are good uses for this restricted set of additions. Would you allow super/subscripts as prefixes rather than suffixes? I presume not since we already disallow initial numbers.
I'm going to give a tentative +1 vote to allowing superscript and subscript letters and digits in identifiers, if it can be done without excessive cost in complexity or performance.
Would you consider doubling the cost of checking each character (a reasonable estimate, I think) excessive or not?
Anything else, like (say) ⑤ (CIRCLED DIGIT FIVE), I will give a firm -1.
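As a rough illustration of why such a rule needs more than the general category: superscript digits, subscript digits, and circled digits all share category No, so they cannot be told apart that way. The sketch below separates them by the compatibility decomposition tag instead. The function script_kind() is hypothetical and is not how CPython checks identifiers (the tokenizer works from the XID_Start/XID_Continue properties); it only shows one possible test.

```python
import unicodedata

def script_kind(ch):
    """Return 'super', 'sub', or None, based on the decomposition tag.

    Hypothetical helper for illustration only.
    """
    decomp = unicodedata.decomposition(ch)
    if decomp.startswith("<super>"):
        return "super"
    if decomp.startswith("<sub>"):
        return "sub"
    return None

# SUPERSCRIPT TWO, SUBSCRIPT ONE, CIRCLED DIGIT FIVE, and a plain digit.
for ch in "²₁⑤5":
    print(unicodedata.name(ch), unicodedata.category(ch), script_kind(ch))
```

A test like this would admit ² and ₁ while still rejecting ⑤, at the price of an extra lookup for each character that fails the existing check.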