[Python-Dev] Fwd: summing a bunch of numbers (or "whatevers")

Jack Jansen Jack.Jansen@oratrix.com
Sun, 20 Apr 2003 01:03:40 +0200


On zaterdag, apr 19, 2003, at 23:43 Europe/Amsterdam, Alex Martelli 
wrote:
> For the Nth time, today somebody asked in c.l.py about how best to sum
> a list of numbers.  As usual, many suggested reduce(lambda x,y:x+y, L),
> others reduce(int.__add__,L), others reduce(operator.add,L), etc, and 
> some
> (me included) a simple
>     total = 0
>     for x in L:
>         total = total + x
>
> The usual performance measurements were unchained (easier than ever
> thanks to timeit.py of course;-), and the paladins of reduce were once 
> again
> dismayed by the fact that the best reduce can do (that best is 
> obtained with
> operator.add) is mediocre (e.g. on my box with L=range(999), reduce 
> takes
> 330 usec, and the simple for loop takes 247).
>
[...]
> Now, I think the obvious approach would be to have a function sum,
> callable with any non-empty homogeneous sequence (sequence of
> items such that + can apply between them), returning the sequence's
> summation -- now THAT might help for simplicity, clarity AND power.
>
> So, I tried coding that up -- just 40 lines of C... it runs twice as 
> fast
> as the plain loop, for summing up range(999), and just as fast as 
> ''.join
> for summing up map(str, range(999)) [for the simple reason that I
> special-case this -- when the first item is a PyBaseString_Type, I
> delegate to ''.join].

Do you have any idea why your sum function is, uhm, three times faster 
than
the reduce(operator.add) version? Is the implementation of reduce doing
something silly, or are there shortcuts you can take that reduce() 
can't?

I'm asking because I think I would prefer reduce to give the speed you 
want.
That way, we won't have people come asking for a prod() function to 
match
sum(), etc.
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma 
Goldman -