On Sat, May 03, 2014 at 11:39:23AM -0400, Ron Adam wrote:
On 05/03/2014 05:05 AM, Steven D'Aprano wrote:
On Sat, May 03, 2014 at 06:38:21PM +1200, Greg Ewing wrote:
Steven D'Aprano wrote:
Particularly for mathematically-focused code, I think it would be useful
to be able to use identifiers like (say) σ² for variance,
Having σ² be a variable name could be confusing. To a mathematician, it's not a distinct variable, it's just σ ** 2.
Actually, not really. A better way of putting it is that the standard deviation is "just" the square root of σ². Variance comes first (it's defined from first principles), and then the standard deviation is defined by taking the square root.
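To make that ordering concrete, here is a minimal sketch (not taken from the thread) that computes the population variance from first principles and then derives the standard deviation from it. The names `variance`, `std_dev`, `sigma2` and `sigma` are illustrative only; `sigma2` is the ASCII spelling one is currently stuck with in place of σ².

```python
import math


def variance(data):
    """Population variance: the mean squared deviation from the mean."""
    n = len(data)
    if n == 0:
        raise ValueError("variance requires at least one data point")
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / n


def std_dev(data):
    """Standard deviation, defined as the square root of the variance."""
    return math.sqrt(variance(data))


sample = [2, 4, 4, 4, 5, 5, 7, 9]
sigma2 = variance(sample)  # the quantity a mathematician would write as σ²
sigma = std_dev(sample)    # derived from sigma2, not the other way around
print(sigma2, sigma)
```

The point of the sketch is only the direction of the dependency: `std_dev` calls `variance`, and nothing squares a previously computed σ.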
The main problem I see is that many possible questions come to mind rather than one simple or obvious interpretation.
If I name a variable "x2", what is the "one simple or obvious interpretation" that such an identifier presumably has? If standard, ASCII-only identifiers don't have a single interpretation, why should identifiers like σ² be held to that requirement?
Like any other identifier, one needs to interpret the name in context. Identifiers can be idiomatic ("i" for a loop variable, "c" for a character), more or less descriptive ("number_of_pages", "npages"), or obfuscated ("e382702"). They can be written in English, or in some other language. They can be ordinary words, or jargon that only means something to those who understand the problem domain. None of this will be different if sub/superscript digits and letters are allowed.
One of the frustrations on this list is how often people hold new proposals to a higher standard than existing features. Particularly *impossible* standards. It simply isn't possible for characters like superscript-two to be given a *single* interpretation (although there is an obvious one, namely "squared") any more than it is possible for the letter "a" to be given a *single* interpretation.
There are valid objections to this proposal. It may be that the effort needed to allow code points like ² in identifiers without also allowing ½ or ② is too great. Or that the performance cost is too high. Or that the benefit for mathematical-style code doesn't justify the additional language complexity.
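As a rough illustration of why separating ² from ½ and ② might take some effort, the sketch below (an addition, not part of the original discussion) inspects the three characters with the standard `unicodedata` module. All three share the general category "No" (Number, other), so a rule based on category alone cannot tell them apart; the decomposition tag is one property that does differ. The helper name `describe` is hypothetical.

```python
import unicodedata


def describe(ch):
    """Return (general category, compatibility decomposition tag) for ch."""
    category = unicodedata.category(ch)
    decomposition = unicodedata.decomposition(ch)
    # The decomposition string looks like "<super> 0032"; keep only the tag.
    tag = decomposition.split()[0] if decomposition else ""
    return category, tag


for ch in "²½②":
    print(repr(ch), unicodedata.name(ch), describe(ch))

# Current behaviour: the superscript digit is rejected in identifiers,
# while the plain ASCII digit is accepted.
print("σ²".isidentifier())
print("σ2".isidentifier())
```

Whether the decomposition tag would be a sensible basis for an actual rule is a separate question; the sketch only shows that the distinction is not available from the general category.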
Or even a purely aesthetic judgement: "I just don't like it". (I don't like identifiers written in Cyrillic, because I can't read them, but I'm not the target audience for such identifiers and I will never need to read them. Consequently I don't object if other people use Cyrillic identifiers in their personal code.)
Holding this proposal up to an impossible standard which plain ASCII identifiers don't even meet is simply not cricket.
Thank you all for letting me get that off my chest, and apologies to Ron for singling him out.