Allow attribute references for decimalinteger

There's an unnecessary corner case regarding integer literals and attributes: attribute access on an integer literal, such as 1.__class__, is a SyntaxError.

Why does it fail? To my human eyes it seems (almost) completely unambiguous. Is it a lexing thing? I couldn't find an explanation (or any reference at all, though there should be one) in the docs. The only ambiguity I can see is 1.j, but the desired meaning is clear: 1j (just like 1.0j). It may confuse beginners; it made me believe, a couple of years ago, that there's no such thing as 1.__class__. I guess it's bad for code generators too. I suggest making "1.identifier" legal, and adding `j` and `J` properties to numbers.Number that mean the sensible thing (so 0.j is no longer special syntax, as it is today). Elazar

On Wed, Oct 30, 2013 at 9:01 PM, אלעזר <elazarg@gmail.com> wrote:
Yes. The first . following a digit is absorbed into the same float token. To make . its own token, the number must be complete by the time the tokenizer sees it. I don't think your proposal is implementable without making the parser significantly more complicated.
> The only ambiguity I can see is 1.j ..

What would 1.e50 mean under your proposal? Currently we have 1.e50 == 1e+50.
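The current behaviour, for reference:

```python
# "1.e50" is today a single float literal with an exponent, not
# attribute access on the int 1 -- which is why the proposal would
# make its meaning ambiguous.
assert 1.e50 == 1e+50 == 1e50

# Bare dots are accepted on either side of a digit today:
assert 1. == 1.0
assert .5 == 0.5
```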

On 31/10/2013 01:01, אלעזר wrote:
It'll make the lexer more complicated if it can't tell whether the "." is part of a float literal or not, and, anyway, it's already the case that 1.j == 1.0j, not (1).j, so saying that 1.real == (1).real would be inconsistent.
I'd prefer it if Python insisted that a decimal point be preceded and followed by a digit, but changing it might break existing code. It's one of those changes that could've been made in Python 3, I suppose, but it's not something I'm losing any sleep over! :-)
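The lexer's view can be inspected with the tokenize module. A small sketch (the helper `toks` is just for illustration):

```python
import io
import tokenize

def toks(src):
    """Return (token-name, string) pairs for a one-line source string."""
    return [(tokenize.tok_name[t.type], t.string)
            for t in tokenize.generate_tokens(io.StringIO(src).readline)
            if tokenize.tok_name[t.type] in ("NUMBER", "NAME", "OP")]

# "1.j" is lexed as a single imaginary literal...
print(toks("1.j"))
# ...while "(1).j" is an int literal followed by an attribute lookup.
print(toks("(1).j"))
```

This is exactly why 1.j == 1.0j rather than (1).j: the dot is consumed by the number token before attribute access is ever considered.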

[MRAB]
> it's already the case that 1.j == 1.0j, not (1).j, so saying that 1.real == (1).real would be inconsistent.
I don't understand. In what sense is 1.j != (1).j? The latter is an AttributeError, which I suggest changing. [Alexander Belopolsky]

> What would 1.e50 mean under your proposal?

Well, that seems to kill it :-(. I should have dug deeper before proposing this idea. I assume it's fixable by making it special syntax in its own right, or by raising an AttributeError (I really doubt the pattern appears anywhere), or by making int's __getattribute__ handle it in a special way, but I wouldn't suggest those ideas seriously. The docs should mention this issue, however, perhaps in an endnote.

On Fri, Nov 1, 2013 at 1:35 AM, Philipp A. <flying-sheep@web.de> wrote:
".1" is not under challenge (the suggested requirement is only a digit after the dot, nothing about a digit before it). I agree that "1." for 1.0 is a useful shorthand, but if the parser had originally been written to disallow it, I doubt there'd be a very strong call to make it more permissive. ChrisA

On Thu, Oct 31, 2013 at 04:10:46PM +0100, Ronald Oussoren wrote:
Then you might prefer writing 1.000000000000000000000000000000000 for even more readability ;-)

But seriously... accepting "1." for float 1.0 is, as far as I can tell, a common but not universal practice in programming languages. Being tolerant of missing zeroes before or after the decimal point matches how most people write, in my experience: dropping the leading zero is very common, the trailing zero less so, unless they drop the decimal point as well. In a language like Python with distinct int and float types, but no declarations, dropping the decimal point gives you an int. Not much can be done about that.

As far as other languages go, ISO Pascal requires a digit after the dot, but in my experience most Pascal compilers accept a bare dot: gpc compiles "x := 23.;" but issues a warning. C appears to allow it: gcc compiles "double foo = 23.;" without a warning. Ruby no longer accepts floats with a leading or trailing dot.

Personally, I would never use "1." in source code, but I might use it in the interactive interpreter. I think it's an issue for a linter, not the compiler, so I'm happy with the status quo.

-- Steven
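Python sits on the tolerant end of that survey: missing zeroes are accepted on both sides of the dot, in source literals and in the float constructor alike:

```python
# Literal forms with a missing zero on one side of the dot:
assert 1. == 1.0
assert .1 == 0.1

# float() is tolerant of the same spellings in strings:
assert float("1.") == 1.0
assert float(".1") == 0.1
```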

On Fri, Nov 1, 2013 at 8:44 PM, Steven D'Aprano <steve@pearwood.info> wrote:
And that's exactly why the clarity has value - because of the extreme visual similarity between "1" (int) and "1." (float). However...
... I agree. Having the flexibility is handy, same as we have the flexibility to write less-clear code in other ways. And as to the exponent, I don't recall ever writing "1.e10" intentionally, but I definitely write "1e10", and would be extremely surprised if a loose dot broke that. ChrisA

Philipp A. wrote:
I meant that it would be better for the lexer, as it would remove the ambiguity. Personally I prefer to write '1.0' rather than '1.' in the interests of readability, so it wouldn't bother me, but I understand that others may feel differently. Having said that, nowadays I'm not sure that there is much reason to ever write '1.' rather than just '1', since ints get promoted to floats in most contexts where it's necessary. This is especially true now that the division operator works sanely. -- Greg
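Greg's point in code: int-to-float promotion and true division make the trailing-dot spelling largely unnecessary in Python 3:

```python
import math

# "/" is true division in Python 3, so there's no need to write "1."
# just to force a float result:
assert 1 / 2 == 0.5

# Ints are promoted in mixed arithmetic and in functions expecting floats:
assert 1 + 0.5 == 1.5
assert math.sqrt(2) == math.sqrt(2.0)
```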

participants (8)
- Alexander Belopolsky
- Chris Angelico
- Greg Ewing
- MRAB
- Philipp A.
- Ronald Oussoren
- Steven D'Aprano
- אלעזר