[Python-ideas] Yet another sum function (fractions.sum)

Oscar Benjamin oscar.j.benjamin at gmail.com
Wed Aug 21 00:26:23 CEST 2013


On 20 August 2013 19:14, David Mertz <mertz at gnosis.cx> wrote:
> Oscar's improvement to my binsum() idea, using Stefan's excellent point that
> we should just use int.__add__ on the numerators, is quite elegant.  I.e.:
>
> def binsum(iterable):
>     bins = defaultdict(int)
>     for num in iterable:
>         bins[num.denominator] += num.numerator
>     return mergesum([F(n, d) for d, n in bins.items()])
>
> Moreover, looking at Oscar's data, it looks like the improved binsum()
> ALWAYS beats rationalsum()[*]
>
> [*] OK, technically there are three cases where this isn't true: d1000/n=100
> and d10/n=1000 where it is about one percent slower (although I think that
> is in the noise of timing it); and the "pathological" case of calculating
> 'e', where no denominators repeat and where rationalsum() beats everything
> else by a large margin at the asymptote (or maybe imergesum() does, but it
> is because they behave the same here).

There's nothing pathological about at that case. It's very common to
have series where the denominators all differ. What you said got me
thinking, though, about why it underperformed in that case and I
realised that it's because binsum reorders the numbers before passing
them to mergesum and therefore cannot take advantage of the natural
ordering in the series.

So with all our powers combined I created binsortmergesum (bsms) and
it looks like this:

def bsms(iterable):
    bins = defaultdict(int)
    for num in iterable:
        bins[num.denominator] += num.numerator
    return mergesum([F(n, d) for d, n in sorted(bins.items())])

The new timing results show that bsms is slightly poorer than binsum
in the cases where it does well but does much better in the case where
it did badly. So here's the new timing results:

--------------
bigint
---------------
     n | sum    | merges | imerge | binsum | sortsu | ration | bsms
------------------------------------------------------------
    10 | 2      | 20     | 27     | 28     | 28     | 50     | 31
   100 | 21     | 73     | 208    | 93     | 102    | 126    | 96
  1000 | 212    | 506    | 2162   | 749    | 762    | 943    | 746
 10000 | 2283   | 5644   | 21975  | 7407   | 8503   | 9204   | 7463
100000 | 23940  | 63465  | 2.3e+5 | 75810  | 93666  | 91938  | 74516


--------------
float
---------------
     n | sum    | merges | imerge | binsum | sortsu | ration | bsms
------------------------------------------------------------
    10 | 435    | 417    | 430    | 377    | 426    | 393    | 373
   100 | 4437   | 4298   | 4549   | 1220   | 4351   | 1252   | 1231
  1000 | 45889  | 43373  | 45424  | 3520   | 44004  | 3743   | 3578
 10000 | 4.6e+5 | 4.7e+5 | 4.5e+5 | 23759  | 4.4e+5 | 25135  | 23601
100000 | 4.8e+6 | 4.4e+6 | 4.5e+6 | 2.2e+5 | 4.4e+6 | 2.3e+5 | 2.2e+5


--------------
d1000
---------------
     n | sum    | merges | imerge | binsum | sortsu | ration | bsms
------------------------------------------------------------
    10 | 339    | 296    | 304    | 497    | 322    | 528    | 505
   100 | 9747   | 3554   | 3872   | 5065   | 3655   | 5197   | 5155
  1000 | 4.2e+5 | 39259  | 42661  | 35199  | 37243  | 36470  | 35317
 10000 | 9.3e+6 | 4.1e+5 | 4.3e+5 | 80244  | 2.9e+5 | 81798  | 80658
100000 | 1e+8   | 4.2e+6 | 4.4e+6 | 2.6e+5 | 2.7e+6 | 6e+5   | 2.6e+5


--------------
d10
---------------
     n | sum    | merges | imerge | binsum | sortsu | ration | bsms
------------------------------------------------------------
    10 | 214    | 219    | 226    | 204    | 236    | 232    | 206
   100 | 2286   | 2245   | 2404   | 514    | 2353   | 563    | 519
  1000 | 23485  | 22776  | 24744  | 2130   | 23284  | 2347   | 2119
 10000 | 2.4e+5 | 2.3e+5 | 2.5e+5 | 18265  | 2.3e+5 | 20109  | 18098
100000 | 2.4e+6 | 2.3e+6 | 2.5e+6 | 1.8e+5 | 2.3e+6 | 2e+5   | 1.8e+5


--------------
e
---------------
     n | sum    | merges | imerge | binsum | sortsu | ration | bsms
------------------------------------------------------------
    10 | 239    | 230    | 239    | 387    | 247    | 413    | 386
   100 | 15221  | 4330   | 4794   | 14786  | 4483   | 6431   | 6361
  1000 | 1.6e+7 | 1.4e+6 | 1.5e+6 | 1.5e+7 | 1.5e+6 | 1.6e+6 | 1.8e+6


Oscar


More information about the Python-ideas mailing list