[Python-ideas] Yet another sum function (fractions.sum)
David Mertz
mertz at gnosis.cx
Fri Aug 16 22:06:35 CEST 2013
My intuition was that one might do better than mergesum() by binning
numbers. I.e. this implementation:
def binsum(seq):
    bins = dict()
    for f in seq:
        bins[f.denominator] = bins.get(f.denominator, 0) + f
    return mergesum(list(bins.values()))
Indeed I am right, but the effect doesn't show up until one looks at a
fairly large collection of fractions:
538-code % python3 -m timeit -s 'from tmpsum import mergesum, binsum, nums; nums=nums[:50000]' 'mergesum(nums)'
10 loops, best of 3: 806 msec per loop
539-code % python3 -m timeit -s 'from tmpsum import mergesum, binsum, nums; nums=nums[:50000]' 'binsum(nums)'
10 loops, best of 3: 627 msec per loop
binsum() beats sum() at much smaller sizes as well, but it doesn't beat
the simple mergesum() at small sizes. This is true, BTW, even if binsum()
uses plain sum() on its last line; using mergesum() there gives an extra
boost in speed.
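For anyone who wants to locate those crossover points themselves, here is a
minimal self-contained harness. The sizes, seed, and timing parameters are
arbitrary choices of mine, and absolute timings will of course differ by
machine and Python version:

```python
import timeit
from random import randint, seed
from fractions import Fraction as F

def mergesum(seq):
    # Oscar's pairwise merge-sum: halve the list repeatedly.
    while len(seq) > 1:
        new = [a + b for a, b in zip(seq[:-1:2], seq[1::2])]
        if len(seq) % 2:
            new.append(seq[-1])
        seq = new
    return seq[0]

def binsum(seq):
    # Group by denominator, then merge-sum the per-denominator subtotals.
    bins = dict()
    for f in seq:
        bins[f.denominator] = bins.get(f.denominator, 0) + f
    return mergesum(list(bins.values()))

seed(123456789)  # Same numbers each run
nums = [F(randint(-1000, 1000), randint(1, 1000)) for _ in range(2000)]

for n in (10, 100, 1000):
    data = nums[:n]
    for fn in (sum, mergesum, binsum):
        t = min(timeit.repeat(lambda: fn(data), number=5, repeat=3))
        print(f'{fn.__name__:>8} n={n:<5} {t:.4f}s')
```

All three functions compute the exact same rational result, so the harness
only varies which summation order pays for the big gcd() calls.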
I'm not sure whether one might get a better binsum() by binning not on the
denominator itself, but binning together everything whose denominator is a
multiple of an already-stored denominator. In principle, that could mean
many fewer bins with negligible gcd() calculation needed; however, the
extra testing needed to check for multiples might outweigh that gain. There
are several variations that come to mind, but I haven't tested them.
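For concreteness, one way that multiples idea might look (a rough,
untested sketch; binmulsum is just my name for it, and the linear scan
over stored keys is exactly the extra cost in question):

```python
from fractions import Fraction
from random import randint, seed

def mergesum(seq):
    # Pairwise merge-sum, as in the quoted post below.
    while len(seq) > 1:
        new = [a + b for a, b in zip(seq[:-1:2], seq[1::2])]
        if len(seq) % 2:
            new.append(seq[-1])
        seq = new
    return seq[0]

def binmulsum(seq):
    # Bin each fraction under the first stored denominator that divides
    # its own denominator (i.e. its denominator is a multiple of the
    # stored one); denominators with no stored divisor start a new bin.
    # The cheap % test replaces most per-addition gcd() work, but it is
    # a linear scan over the stored keys.
    bins = {}
    for f in seq:
        d = f.denominator
        for key in bins:
            if d % key == 0:
                bins[key] += f
                break
        else:
            bins[d] = f
    return mergesum(list(bins.values()))

# Reordering the additions doesn't change the exact rational result.
seed(0)
nums = [Fraction(randint(-1000, 1000), randint(1, 1000)) for _ in range(500)]
assert binmulsum(nums) == sum(nums)
```

Whether this wins would depend on how many distinct denominators survive
the divisibility test, which is exactly what would need measuring.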
On Fri, Aug 16, 2013 at 11:51 AM, Oscar Benjamin <oscar.j.benjamin at gmail.com> wrote:
> The discussion around having a sum() function in the statistics module
> reminded me that, despite there being two such functions in the stdlib
> (and more in the third-party libraries I use), I have previously
> rolled my own sum() function. The reason is that the stdlib does not
> currently contain a function that can sum Fractions in linear time,
> even though it is possible to implement one that achieves close to
> linear performance for many inputs. I propose that there could be a
> sum() function or classmethod in the fractions module for this
> purpose. I would just raise this on the tracker, but seeing how many
> feathers were ruffled by statistics.sum I thought I'd suggest it here
> first.
>
> To demonstrate the problem I'll show how a quick and dirty mergesum()
> can out-perform sum():
>
> $ cat tmpsum.py
> # tmpsum.py
>
> # Generate data
> from random import randint, seed
> from fractions import Fraction as F
> seed(123456789) # Use the same numbers each time
> nums = [F(randint(-1000, 1000), randint(1, 1000)) for _ in range(100000)]
>
> # 1) mergesum() is more efficient than sum() with Fractions.
> # 2) It assumes associativity of addition since it reorders the sum.
> # 3) It performs the same number of __add__ operations as sum()
> # 4) A more complicated iterator version is possible.
> def mergesum(seq):
>     while len(seq) > 1:
>         new = [a + b for a, b in zip(seq[:-1:2], seq[1::2])]
>         if len(seq) % 2:
>             new.append(seq[-1])
>         seq = new
>     return seq[0]
>
> # Just a quick test
> assert mergesum(nums[:101]) == sum(nums[:101])
>
> So now let's time sum() with these numbers:
>
> $ python -m timeit -s 'from tmpsum import mergesum, nums; nums=nums[:10]' 'sum(nums)'
> 1000 loops, best of 3: 206 usec per loop
> $ python -m timeit -s 'from tmpsum import mergesum, nums; nums=nums[:100]' 'sum(nums)'
> 100 loops, best of 3: 6.24 msec per loop
> $ python -m timeit -s 'from tmpsum import mergesum, nums; nums=nums[:1000]' 'sum(nums)'
> 10 loops, best of 3: 320 msec per loop
> $ python -m timeit -s 'from tmpsum import mergesum, nums; nums=nums[:10000]' 'sum(nums)'
> 10 loops, best of 3: 6.43 sec per loop
>
> Each time we increase the size of the input by a factor of 10 the
> computation time increases by a factor of about 30. This is caused by
> computing the gcd() to normalise the result after each increment to
> the total. As the numerator and denominator of the subtotal get
> larger, the time taken to compute the gcd() after each increment
> increases. The result is that the algorithm overall costs somewhere
> between linear and quadratic time [from the above maybe it's
> O(n**(3/2))].
>
> Now let's see how mergesum() performs:
>
> $ python -m timeit -s 'from tmpsum import mergesum, nums; nums=nums[:10]' 'mergesum(nums)'
> 10000 loops, best of 3: 186 usec per loop
> $ python -m timeit -s 'from tmpsum import mergesum, nums; nums=nums[:100]' 'mergesum(nums)'
> 100 loops, best of 3: 2.16 msec per loop
> $ python -m timeit -s 'from tmpsum import mergesum, nums; nums=nums[:1000]' 'mergesum(nums)'
> 10 loops, best of 3: 24.6 msec per loop
> $ python -m timeit -s 'from tmpsum import mergesum, nums; nums=nums[:10000]' 'mergesum(nums)'
> 10 loops, best of 3: 256 msec per loop
> $ python -m timeit -s 'from tmpsum import mergesum, nums; nums=nums[:100000]' 'mergesum(nums)'
> 10 loops, best of 3: 2.59 sec per loop
>
> (I didn't time sum() with a 100000 input to compare with that last run).
>
> Notice that mergesum() out-performs sum() for all input sizes and that
> the time scaling is much closer to linear, i.e. 10x the input takes
> 10x the time. It works by summing adjacent pairs of numbers, halving
> the size of the list each time. This has the benefit that the
> numerator and denominator in most of the additions are much smaller,
> so their gcd() is computed faster. Only the final additions need to
> use the really big numerator and denominator. The performance of both
> sum() functions is sensitive to the distribution of inputs, and in
> particular the distribution of denominators, but in my own usage a
> merge-sum algorithm has always been faster.
>
> I have found this useful for myself (using even larger lists of
> numbers) when using the fractions module and I propose that something
> similar could be added to the fractions module.
>
>
> Oscar
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
--
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons. Intellectual property is
to the 21st century what the slave trade was to the 16th.