On 05/03/2014 01:57 PM, Steven D'Aprano wrote:
On Sat, May 03, 2014 at 11:39:23AM -0400, Ron Adam wrote:
On 05/03/2014 05:05 AM, Steven D'Aprano wrote:
On Sat, May 03, 2014 at 06:38:21PM +1200, Greg Ewing wrote:
>>> Steven D'Aprano wrote:
>>>> Particularly for mathematically-focused code, I think it would be useful
>>>> to be able to use identifiers like (say) σ² for variance,
>>> Having σ² be a variable name could be confusing. To a
>>> mathematician, it's not a distinct variable, it's
>>> just σ ** 2.
Actually, not really. A better way of putting it is that the standard deviation is "just" the square root of σ². Variance comes first (it's defined from first principles), and then the standard deviation is defined by taking the square root.
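To make that ordering concrete, here is a small sketch in plain Python (the data set is just an illustrative population sample, not from anywhere in particular): the variance is computed from first principles as the mean squared deviation, and the standard deviation is then defined as its square root. The stdlib statistics module agrees.

```python
import math
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # illustrative population

# Variance comes first: the mean of the squared deviations from the mean.
mu = sum(data) / len(data)
variance = sum((x - mu) ** 2 for x in data) / len(data)

# The standard deviation is then *defined* as the square root of the variance.
std_dev = math.sqrt(variance)

print(variance, std_dev)           # 4.0 2.0
print(statistics.pvariance(data))  # 4.0, same result via the stdlib
print(statistics.pstdev(data))     # 2.0
```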
The main problem I see is that many possible questions come to mind rather than one simple or obvious interpretation.
If I name a variable "x2", what is the "one simple or obvious interpretation" that such an identifier presumably has? If standard, ASCII-only identifiers don't have a single interpretation, why should identifiers like σ² be held to that requirement?
Steven Turnbull pointed out some of the different interpretations I was thinking about in his reply to this message: mainly whether ² reads as a syntactic form (an exponent), though, as you said, it might also be interpreted as part of an identifier's spelling.
Like any other identifier, one needs to interpret the name in context. Identifiers can be idiomatic ("i" for a loop variable, "c" for a character), more or less descriptive ("number_of_pages", "npages"), or obfuscated ("e382702"). They can be written in English, or in some other language. They can be ordinary words, or jargon that only means something to those who understand the problem domain. None of this will be different if sub/superscript digits and letters are allowed.
One of the frustrations on this list is how often people hold new proposals to a higher standard than existing features. Particularly *impossible* standards. It simply isn't possible for characters like superscript-two to be given a *single* interpretation (although there is an obvious one, namely "squared") any more than it is possible for the letter "a" to be given a *single* interpretation.
There are valid objections to this proposal. It may be that the effort needed to allow code points like ² in identifiers without also allowing ½ or ② may be too great. Or the performance cost is too high. Or the benefit for mathematical-style code doesn't justify adding additional language complexity.
Or even a purely aesthetic judgement: "I just don't like it". (I don't like identifiers written in Cyrillic, because I can't read them, but I'm not the target audience for such identifiers and I will never need to read them. Consequently, I don't object if other people use Cyrillic identifiers in their personal code.)
Holding this proposal up to an impossible standard which plain ASCII identifiers don't even meet is simply not cricket.
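On the effort question above, it may help to see why ², ½, and ② hang together: in current CPython all three carry the same Unicode category, "No" (other number), which Python's identifier grammar excludes, so admitting one without the others would mean special-casing code points rather than categories. A quick check of today's rules (nothing here is proposed behaviour, just what the interpreter does now):

```python
import unicodedata

for ch in "²½②":
    # All three are category "No" (other number), which Python's
    # identifier grammar does not include, so none is legal today.
    print(ch, unicodedata.category(ch), ("x" + ch).isidentifier())

print("σ".isidentifier())   # True: Greek letters are already fine
print("σ2".isidentifier())  # True: ordinary digits may follow
print("σ²".isidentifier())  # False: superscript two is excluded

# A further wrinkle for the proposal: Python normalises identifiers
# with NFKC, and NFKC folds ² to the ordinary digit 2.
print(unicodedata.normalize("NFKC", "σ²"))  # σ2
```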
Thank you all for letting me get that off my chest, and apologies to Ron for singling him out.
No problem; you weren't commenting on me, just expressing your own thoughts, and that's fine. Thanks for clarifying the context of your message, though. It helps avoid unintended misunderstandings in message-based conversations like these, where we don't get to hear the tone of a message.
I feel the same as you describe here in many of these discussions. Enough so that I'm attempting to write a minimal language that uses some of the features I've thought about. The exercise was/is helping me understand many of the lower-level language-design patterns in Python and some other languages.
Some of the ideas I've wanted just don't fit with Python's design, and some would work, but not without many changes to other parts. And some ideas we can't do because they directly conflict with something we already have. Sigh. The ones that most interest me are the ones that simplify or unify existing features, but those are also the ones that are the hardest to do right. ;-)