Cannot declare the largest integer literal.
>>> i = -2147483648
OverflowError: integer literal too large
>>> i = -2147483648L
>>> int(i)   # it *is* a valid integer literal
-2147483648
As far as I traced back:

    Python/compile.c::com_atom() calls
    Python/compile.c::parsenumber(s = "2147483648") calls
    Python/mystrtoul.c::PyOS_strtol()

which returns the ERANGE errno because it is given 2147483648 (which *is* out of range) rather than -2147483648.

My question: Why is the minus sign not considered part of the "atom", i.e. the integer literal? Should it be?

PyOS_strtol() can properly parse this integer literal if it is given the whole number with the minus sign. Otherwise the special case largest negative number will always erroneously be considered out of range.

I don't know how the tokenizer works in Python. Was there a design decision to separate the integer literal and the leading sign? And was the effect on functions like PyOS_strtol() down the pipe missed?

Trent

--
Trent Mick
trentm@activestate.com
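The range problem Trent describes can be modelled in a few lines. This is only a sketch: `parse_long32` and `parse_sign_then_magnitude` are hypothetical helpers that imitate a signed 32-bit range check, not the actual code in Python/compile.c or Python/mystrtoul.c.

```python
# Hypothetical model of a signed 32-bit range check; not CPython's real code.
LONG_MIN = -2**31       # -2147483648
LONG_MAX = 2**31 - 1    #  2147483647


def parse_long32(text):
    """Parse a decimal string, raising OverflowError outside the 32-bit range."""
    value = int(text, 10)
    if not LONG_MIN <= value <= LONG_MAX:
        raise OverflowError("integer literal too large")
    return value


def parse_sign_then_magnitude(text):
    """Imitate the path described above: range-check the digits, then negate."""
    negative = text.startswith("-")
    digits = text[1:] if negative else text
    magnitude = parse_long32(digits)   # fails when digits == "2147483648"
    return -magnitude if negative else magnitude


# Given the whole literal, sign included, the value is in range.
print(parse_long32("-2147483648"))

# Given only the digits first, the same literal is rejected.
try:
    parse_sign_then_magnitude("-2147483648")
except OverflowError as exc:
    print("OverflowError:", exc)
```

The only value affected is the most negative one: every other negative number has a magnitude that fits in the positive range.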
>>> i = -2147483648
OverflowError: integer literal too large
>>> i = -2147483648L
>>> int(i)   # it *is* a valid integer literal
-2147483648
I struck this years ago! At the time, the answer was "yes, it's an implementation flaw that's not worth fixing". Interestingly, it _does_ work as a hex literal:
>>> 0x80000000
-2147483648
>>> -2147483648
Traceback (...
OverflowError: integer literal too large
Mark.
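The hex form works in the interpreter shown above because the bit pattern 0x80000000, read as a signed 32-bit value, is the most negative integer. In current Python versions the literal 0x80000000 evaluates to the positive number 2147483648, so the following sketch uses `ctypes.c_int32` to reproduce the 32-bit reinterpretation explicitly. The fixed 32-bit width is an assumption matching the machines discussed in this thread.

```python
import ctypes

# 0x80000000 has only the top bit of a 32-bit word set.
bits = 0x80000000

# Reinterpret those 32 bits as a signed two's-complement integer.
as_signed = ctypes.c_int32(bits).value

print(bits)        # the value as an unbounded Python int
print(as_signed)   # the same bits read as a signed 32-bit int
```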
[Trent Mick]
>>> i = -2147483648
OverflowError: integer literal too large
>>> i = -2147483648L
>>> int(i)   # it *is* a valid integer literal
-2147483648
Python's grammar is such that negative integer literals don't exist; what you actually have there is the unary minus operator applied to positive integer literals; indeed,
>>> def f():
...     return -42
...
>>> import dis
>>> dis.dis(f)
          0 SET_LINENO          1

          3 SET_LINENO          2
          6 LOAD_CONST          1 (42)
          9 UNARY_NEGATIVE
         10 RETURN_VALUE
         11 LOAD_CONST          0 (None)
         14 RETURN_VALUE
Note that, at runtime, the example loads +42, then negates it: this wart has deep roots!
... And was the effect on functions like PyOS_strtol() down the pipe missed?
More that it was considered an inconsequential endcase. It's sure not worth changing the grammar for <wink>. I'd rather see Python erase the visible distinction between ints and longs.
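Tim's point that "-42" is a unary minus applied to a positive literal can still be checked at the syntax-tree level. The sketch below assumes Python 3.8 or later, where literals appear as `ast.Constant` nodes. It inspects only the parse tree; the bytecode a current compiler emits may differ from the disassembly above, since the compiler is free to fold the negation into a single constant.

```python
import ast

# Parse the expression "-42" without compiling or evaluating it.
tree = ast.parse("-42", mode="eval")
node = tree.body

print(type(node).__name__)       # kind of expression node
print(type(node.op).__name__)    # which unary operator was applied
print(node.operand.value)        # the literal the operator is applied to
```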
Tim Peters wrote:
[Trent Mick]
>>> i = -2147483648
OverflowError: integer literal too large
>>> i = -2147483648L
>>> int(i)   # it *is* a valid integer literal
-2147483648
Python's grammar is such that negative integer literals don't exist; what you actually have there is the unary minus operator applied to positive integer literals; indeed,
<disassembly snipped>

Well, knowing that there are more negatives than positives and then coding it this way appears in fact as a design flaw to me.

A simple solution could be to do the opposite: Always store a negative number and negate it for positive numbers. A real negative number would then end up with two UNARY_NEGATIVE opcodes in sequence. If we had a simple postprocessor to remove such sequences at the end, we're done. As another step, it could also adjust all such consts and remove those opcodes.

This could be a task for Skip's peephole optimizer. Why did it never go into the core?

ciao - chris

--
Christian Tismer             :^)   mailto:tismer@appliedbiometrics.com
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaunstr. 26                  :     *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com
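Christian's two post-processing steps can be sketched on a toy instruction list. Everything here is hypothetical: the `(opname, argument)` tuple format and both function names are invented for illustration, and the sketch ignores jump targets and operator overloading, which a real peephole optimizer would have to respect.

```python
# Toy instruction format: (opname, argument) tuples. Purely illustrative.

def remove_double_negation(code):
    """Drop adjacent pairs of UNARY_NEGATIVE instructions."""
    out = []
    for instr in code:
        if instr[0] == "UNARY_NEGATIVE" and out and out[-1][0] == "UNARY_NEGATIVE":
            out.pop()              # the two negations cancel each other
        else:
            out.append(instr)
    return out


def fold_negated_constants(code):
    """Replace LOAD_CONST c followed by UNARY_NEGATIVE with LOAD_CONST -c."""
    out = []
    for instr in code:
        if instr[0] == "UNARY_NEGATIVE" and out and out[-1][0] == "LOAD_CONST":
            out[-1] = ("LOAD_CONST", -out[-1][1])
        else:
            out.append(instr)
    return out


# Under the proposal, source "-2147483648" stores the negative constant,
# negates it once to get the literal's value, then applies the user's minus.
negative_literal = [
    ("LOAD_CONST", -2147483648),
    ("UNARY_NEGATIVE", None),
    ("UNARY_NEGATIVE", None),
    ("RETURN_VALUE", None),
]
print(remove_double_negation(negative_literal))

# Source "42" stores -42 plus one negation, which the second step folds away.
positive_literal = [("LOAD_CONST", -42), ("UNARY_NEGATIVE", None)]
print(fold_negated_constants(positive_literal))
```

Python integers are unbounded here, so the folding step cannot overflow in this sketch; with 32-bit C longs, folding the negation of -2147483648 would hit the same range limit the thread is about.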
[Tim]
Python's grammar is such that negative integer literals don't exist; what you actually have there is the unary minus operator applied to positive integer literals; ...
[Christian Tismer]
Well, knowing that there are more negatives than positives and then coding it this way appears in fact as a design flaw to me.
Don't know what you're saying here. Python's grammar has nothing to do with the relative number of positive vs negative entities; indeed, in a 2's-complement machine it's not even true that there are more negatives than positives. Python generates the unary minus for "negative literals" because, again, negative literals *don't exist* in the grammar.
A simple solution could be to do the opposite: Always store a negative number and negate it for positive numbers. ...
So long as negative literals don't exist in the grammar, "-2147483648" makes no sense on a 2's-complement machine with 32-bit C longs. There isn't "a problem" here worth fixing, although if there is <wink>, it will get fixed by magic as soon as Python ints and longs are unified.
Tim Peters wrote:
[Tim]
Python's grammar is such that negative integer literals don't exist; what you actually have there is the unary minus operator applied to positive integer literals; ...
[Christian Tismer]
Well, knowing that there are more negatives than positives and then coding it this way appears in fact as a design flaw to me.
Don't know what you're saying here.
On a 2's-complement machine, there are 2**(n-1) negatives, zero, and 2**(n-1)-1 positives. The most negative number cannot be inverted. Most machines today use the 2's complement.
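The counts given here can be checked directly. The sketch below assumes n = 32 and uses `ctypes.c_int32` to wrap a result back into 32 bits, showing that negating the most negative value yields the same value again.

```python
import ctypes

n = 32
negatives = 2 ** (n - 1)        # -2**31 .. -1
positives = 2 ** (n - 1) - 1    # 1 .. 2**31 - 1

# Negatives, positives, and zero together use every n-bit pattern exactly once.
assert negatives + positives + 1 == 2 ** n

most_negative = -(2 ** (n - 1))

# Negate, then wrap the result back into a signed 32-bit integer.
wrapped = ctypes.c_int32(-most_negative).value

print(negatives - positives)     # how many more negatives than positives
print(most_negative, wrapped)    # the negation wraps to the original value
```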
Python's grammar has nothing to do with the relative number of positive vs negative entities; indeed, in a 2's-complement machine it's not even true that there are more negatives than positives.
If I read this as "1's-complement machine", then I believe it. But we don't need to split hairs over known stuff :-)
Python generates the unary minus for "negative literals" because, again, negative literals *don't exist* in the grammar.
Yes. If I know the facts and don't build negative literals into the grammar, then I call it an oversight. Not too bad but not nice.
A simple solution could be to do the opposite: Always store a negative number and negate it for positive numbers. ...
So long as negative literals don't exist in the grammar, "-2147483648" makes no sense on a 2's-complement machine with 32-bit C longs. There isn't "a problem" here worth fixing, although if there is <wink>, it will get fixed by magic as soon as Python ints and longs are unified.
I'd change the grammar.

ciao - chris
On Mon, 8 May 2000, Christian Tismer wrote:
...
So long as negative literals don't exist in the grammar, "-2147483648" makes no sense on a 2's-complement machine with 32-bit C longs. There isn't "a problem" here worth fixing, although if there is <wink>, it will get fixed by magic as soon as Python ints and longs are unified.
I'd change the grammar.
That would be very difficult, with very little positive benefit. As Mark said, use 0x80000000 if you want that number.

Consider that the grammar would probably want to deal with things like

    - 1234

or

    -0xA

Instead, the grammar sees two parts: "-" and "NUMBER" without needing to complicate the syntax for NUMBER.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
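The split into "-" and NUMBER that Greg describes can be observed with the standard `tokenize` module in current Python. `token_pairs` is a hypothetical helper written for this illustration; it drops the trailing NEWLINE and ENDMARKER tokens so that only the sign and the number remain.

```python
import io
import tokenize


def token_pairs(source):
    """Return (token type name, text) for the meaningful tokens in source."""
    readline = io.StringIO(source + "\n").readline
    skip = {tokenize.NEWLINE, tokenize.ENDMARKER}
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
        if tok.type not in skip
    ]


# With or without a space, in decimal or hex, the sign is its own token.
for text in ("-1234", "- 1234", "-0xA"):
    print(repr(text), token_pairs(text))
```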
Greg Stein wrote:
On Mon, 8 May 2000, Christian Tismer wrote:
...
So long as negative literals don't exist in the grammar, "-2147483648" makes no sense on a 2's-complement machine with 32-bit C longs. There isn't "a problem" here worth fixing, although if there is <wink>, it will get fixed by magic as soon as Python ints and longs are unified.
I'd change the grammar.
That would be very difficult, with very little positive benefit. As Mark said, use 0x80000000 if you want that number.
Consider that the grammar would probably want to deal with things like - 1234 or -0xA
Instead, the grammar sees two parts: "-" and "NUMBER" without needing to complicate the syntax for NUMBER.
Right. That was the reason for my first, dumb, proposal: Always interpret a number as negative and negate it once more. That makes it positive. In a post process, remove double-negates. This leaves negations always where they are allowed: On negatives.

ciao - chris
On Tue, 9 May 2000, Christian Tismer wrote:
... Right. That was the reason for my first, dumb, proposal: Always interpret a number as negative and negate it once more. That makes it positive. In a post process, remove double-negates. This leaves negations always where they are allowed: On negatives.
IMO, that is a non-intuitive hack. It would increase the complexity of Python's parsing internals. Again, with little measurable benefit.

I do not believe that I've run into a case of needing -2147483648 in the source of one of my programs. If I had, then I'd simply switch to 0x80000000 and/or assign it to INT_MIN.

-1 on making Python more complex to support this single integer value. Users should be pointed to 0x80000000 to represent it. (a FAQ entry and/or comment in the language reference would be a Good Thing)

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
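Greg does not spell out how he would assign INT_MIN. One possible form, sketched below, borrows an idiom common in C headers: subtract one from the negated maximum, so the out-of-range magnitude 2147483648 never appears as a literal. The names and the 32-bit width are assumptions for illustration.

```python
# Names follow C's <limits.h>; the 32-bit width is an assumption here.
INT_MAX = 2147483647
INT_MIN = -INT_MAX - 1   # avoids writing the magnitude 2147483648 directly

print(INT_MIN)
```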
participants (5)
- Christian Tismer
- Greg Stein
- Mark Hammond
- Tim Peters
- Trent Mick