[Python-Dev] Re: Memory size overflows
Gerald S. Williams
gsw@agere.com
Fri, 25 Oct 2002 13:07:14 -0400
I wrote:
> Shall I submit a patch?
Tim Peters wrote:
> Sure, but also submit your timing harness so that people can measure the
> effects cross-platform and cross-compiler.
Of course. I first wanted to see if there was any interest.
Integrated into Python, the speedup was much less: only 0.1%
overall for this million-multiply script:
import time
start = time.time()
a = -1000
while a <= 1000:
b = 1000
while b <= 1000:
b += 1
a += 1
end = time.time()
base = end - start
start = time.time()
a = -1000
while a <= 1000:
b = -1000
while b <= 1000:
b += 1
c = a * b
a += 1
end = time.time()
print end - start - base
The difference in speedup is an interesting data point on
Python overhead in general (it took just over 6 seconds to
do 1,000,000 multiplies on my system).
> Which platform? Which compiler? What was your test driver? Was this
> timing the mult code in isolation, or timing Python-level multiplies?
> Claims of small speedups are notoriously platform- and test-dependent. If
> it's a mixed bag across platforms, the risk of introducing a new bug would
> favor leaving things alone. In the absence of a clear correctness proof, a
> Python simulation program demonstrating correctness exhaustively in small
> bases would also be helpful.
I doubt you're still interested, but I used Cygwin on a
P3/833 Thinkpad running Win2K (GCC compiler). My original
results were on just the multiply code in isolation--I
wanted to see if there was interest before continuing.
Since I have them, I attached my initial testbed (new.c
is the final version), and here is an exhaustive test
for the "cannot_overflow" check used in that version:
from __future__ import generators
def cannot_overflow(a,b,m):
M_MASK = (1 << m) - 1
HI_LO_BIT = 1 << ((m/2) - 1)
OVL_IMPOSSIBLE_MASK = HI_LO_BIT - 1
OVL_POSSIBLE_MASK = M_MASK ^ OVL_IMPOSSIBLE_MASK
if a < 0:
check = OVL_POSSIBLE_MASK
else:
check = 0
if (a & OVL_POSSIBLE_MASK) == check:
if b < 0:
check = OVL_POSSIBLE_MASK
else:
check = 0
if (b & OVL_POSSIBLE_MASK) == check:
return True
return False
def overflowed(a,b,m):
VALID_MASK = (2**(m - 1)) - 1
INVALID_MASK = (2**(2*m) - 1) ^ VALID_MASK
prod = a * b
highbits = prod & INVALID_MASK
return (highbits != 0) and (highbits != INVALID_MASK)
def checkall(m):
for i in range(-(2**(m-1)),2**(m-1)):
for j in range(-(2**(m-1)),2**(m-1)):
if cannot_overflow(i,j,m) and overflowed(i,j,m):
yield (i,j)
if __name__ == "__main__":
for m in range(2,11,2):
print "M ==", m
for i in checkall(m):
print i
print "END OF LIST"
You can compare the "cannot_overflow" code to the code that
would be added to int_mul. For the sake of minimizing impact,
I limited the changes to defining UL_OVERFLOW_POSSIBLE_MASK
as the equivalent of ~(2**(m/2-1) - 1) and this code, added
immediately after converting the arguments into longs:
{
unsigned long ma = a & UL_OVERFLOW_POSSIBLE_MASK;
if (ma == ((a < 0) ? UL_OVERFLOW_POSSIBLE_MASK : 0))
{
unsigned long mb = b & UL_OVERFLOW_POSSIBLE_MASK;
if (mb == ((b < 0) ? UL_OVERFLOW_POSSIBLE_MASK : 0))
{
return PyInt_FromLong(a*b);
}
}
}
This seems pretty safe, but the payout seems too small
to warrant taking this any further. Unless someone asks
me to do so, I'm planning to drop it.
-Jerry