find all multiplicands and multipliers for a number

Chris Angelico rosuav at gmail.com
Sat Apr 11 23:10:47 CEST 2015


On Sun, Apr 12, 2015 at 3:52 AM, Paul Rubin <no.email at nospam.invalid> wrote:
>> PS Note that you're being "wasteful" by multiplying c*c over and over
>
> Yeah this is a reasonable point, though most of the c's should fit in a
> machine word, at least in my 64-bit system.  I think Python still
> separates ints and longs in the implementation.

I don't think it does. Performance doesn't seem to change in Py3 as
the numbers get bigger:

rosuav at sikorsky:~$ cat perftest.py
def naive_sum(top,base=0):
    i=0
    while i<top:
        i+=1
        base+=i
    return base

# Correctness test
print("Sum of numbers from 1 to 10 is: %d"%naive_sum(10))
print("Sum of numbers from 1 to 20 is: %d"%(naive_sum(20,1000)-1000))

import timeit
for base in (0, 10, 60, 100, 200):
    print("Base: 2**%d"%base)
    # Simpler than doing the whole iteration-judging thing ourselves
    timeit.main([
        "-s","from __main__ import naive_sum",
        "naive_sum(1000,%d)"%(2**base)
    ])

rosuav at sikorsky:~$ python2 perftest.py
Sum of numbers from 1 to 10 is: 55
Sum of numbers from 1 to 20 is: 210
Base: 2**0
10000 loops, best of 3: 89.6 usec per loop
Base: 2**10
10000 loops, best of 3: 89.9 usec per loop
Base: 2**60
10000 loops, best of 3: 92.3 usec per loop
Base: 2**100
10000 loops, best of 3: 145 usec per loop
Base: 2**200
10000 loops, best of 3: 153 usec per loop

Python 2.7: Clear difference in timing once all the numbers we're
using are above 2**64.

rosuav at sikorsky:~$ python3 perftest.py
Sum of numbers from 1 to 10 is: 55
Sum of numbers from 1 to 20 is: 210
Base: 2**0
10000 loops, best of 3: 145 usec per loop
Base: 2**10
10000 loops, best of 3: 144 usec per loop
Base: 2**60
10000 loops, best of 3: 155 usec per loop
Base: 2**100
10000 loops, best of 3: 151 usec per loop
Base: 2**200
10000 loops, best of 3: 156 usec per loop

Python 3.5: Much more consistent timing. Similarly, sticking an
explicit "L" suffix onto the number gives Py2 consistent timings:

rosuav at sikorsky:~$ python perftest.py
Sum of numbers from 1 to 10 is: 55
Sum of numbers from 1 to 20 is: 210
Base: 2**0
10000 loops, best of 3: 130 usec per loop
Base: 2**10
10000 loops, best of 3: 129 usec per loop
Base: 2**60
10000 loops, best of 3: 138 usec per loop
Base: 2**100
10000 loops, best of 3: 139 usec per loop
Base: 2**200
10000 loops, best of 3: 140 usec per loop

So it's the data type, not the size of the numbers, that makes the difference.

rosuav at sikorsky:~$ pike perftest.pike
Sum of numbers from 1 to 10 is: 55
Sum of numbers from 1 to 20 is: 210
Base: 2**0
100000 loops, best of 3: 18.3 usec per loop
Base: 2**10
100000 loops, best of 3: 18.5 usec per loop
Base: 2**60
100000 loops, best of 3: 18.5 usec per loop
Base: 2**100
10000 loops, best of 3: 398 usec per loop
Base: 2**200
10000 loops, best of 3: 406 usec per loop

Like Python 3, Pike has a single "int" type which stores
arbitrary-precision integers. Like Python 2, it has an optimization
for small ones. (One you can't defeat by using "2L".) Interestingly,
once you defeat Pike's optimizer, it's actually quite a lot slower
than Py3 at repeated bignum arithmetic.

Conclusion: Python 3 has one single integer type with consistent
performance across the board. "Machine word" is a meaningless concept
to Py3.

ChrisA



More information about the Python-list mailing list