Hello, python-ideas. Trying to cover everything in one shot, so the message is long, sorry.

sum() is a great function. It is the "obvious way" to add things. Unfortunately, sometimes it's slower than it could be. The problem is that code like:

    sum([[1,2,3]]*1000000, [])

takes forever to complete. Let's fix that!

This problem has been there since sum() was introduced, but it is almost unknown among Python beginners. When people look at the sum(seq, start=0) signature they most probably expect it to work like this:

    def sum(seq, start=0):
        for item in seq:
            start += item
        return start

But it does not, because in cases like:

    empty = []
    result = sum(list_of_lists, empty)

such an implementation would modify the contents of "empty". This use case is known: it is checked in test_builtin.py and explicitly mentioned in comments inside the sum() code. So sum() actually works like this:

    def sum(seq, start=0):
        for item in seq:
            start = start + item
        return start

It creates a copy of the partial result on every "start + item". What I suggest is: instead of making a copy for every item, make just one copy and reuse it as many times as needed. For example:

    def sum(seq, start=0):
        start = start + seq[0]
        for item in seq[1:]:
            start += item
        return start

A patch implementing this idea is attached to issue18305. The patch is simple; it just rearranges existing code. It should work for Python 2.7, 3.3, and hg tip. Apart from sum() becoming faster, there should be no behavior change. Can this implementation break anything?

Advantages:
* Performance boost is around 200000% [1], and nothing else becomes slower.

Disadvantages (inspired by Terry J. Reedy):
* Other Pythons (PyPy, Jython, older CPython versions) do not have this optimisation; people will move code that depends on the internal optimization to Pythons that do not have it, and they will complain about their programs 'freezing'.
* It discourages people from carefully thinking about whether they actually need a concrete list or merely an iterator over a virtual list.
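The copy-once idea can be sketched in pure Python (the real patch is against the C implementation of sum(); the iterator-based handling of the first element below is my own illustration so the sketch also copes with empty and non-indexable iterables):

```python
def naive_sum(seq, start=0):
    # What people expect sum() to be -- but unsafe: for a mutable
    # start value, "start += item" modifies it in place.
    for item in seq:
        start += item
    return start

def patched_sum(seq, start=0):
    # Copy once on the first addition, then reuse that copy in place.
    it = iter(seq)
    for item in it:
        start = start + item  # one copy of start
        break
    for item in it:
        start += item         # safe: we own the copy now
    return start

empty = []
naive_sum([[1], [2]], empty)
print(empty)                           # mutated: [1, 2]

empty = []
print(patched_sum([[1], [2]], empty))  # [1, 2]
print(empty)                           # unchanged: []
```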
Alternatives to sum() for this use case:
* A list comprehension is 270% slower than patched sum [2].
* itertools.chain is 50%-100% slower than patched sum [3].

Main questions:
* Whether to accept the patch.
* If yes, whether it should go into the new version or into a bugfix release.

Alternatives to this patch:
* Reject the idea as a whole. Intentionally keep sum() slow.
* Accept the idea, but reject this particular implementation and use something different, e.g. a special-case optimization or a different approach (I thought about start=copy.copy(start)).

In my opinion (looking through the Python changelog), performance patches have been accepted for bugfix releases many times before. So as long as this patch does not break anything, it could even go into Python 2.7. I think bad performance is a bug that should be fixed. Rejecting performance fixes just because older versions don't have them would mean no performance improvements to anything, ever.

sum() is a great basic function. It keeps simple code simple. This patch puts it on par with more advanced features like itertools, and allows people to use sum() for simple things and itertools for complex, memory-efficient algorithms.

Questions that may arise if the patch is accepted:
* sum() was rejecting strings because of this bug. If the bug gets fixed, should another patch allow sum() to accept strings?
* Maybe in some distant future, drop the second parameter (or make it None by default) and allow calling sum() on everything, making sum() "the one obvious way" to sum things up? It would be nice if sum() "just worked" for everything (e.g. sum() of an empty sequence would return None, i.e. if there's nothing to sum then nothing is returned). But I think it needs more work for that, because even with this patch sum() is still ~20 times slower than "".join() [4].

That's all. Any suggestions welcome.
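For comparison, the alternatives timed in [2] and [3] look like this in use (list sizes shrunk from the benchmarks; str.join from [4] shown for the string case):

```python
from itertools import chain

lists = [[1, 2, 3]] * 1000

# [2] nested list comprehension
flat_comp = [item for sub in lists for item in sub]

# [3] itertools.chain, both spellings from the benchmark
flat_chain = list(chain.from_iterable(lists))
flat_star = list(chain(*lists))

assert flat_comp == flat_chain == flat_star

# [4] for strings, str.join is the fast idiom that sum() currently refuses
strings = ['a'] * 1000
joined = ''.join(strings)
```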
--
[1] Python 2.7.5

Before patch:
$ ./python -mtimeit --setup="x=[[1,2,3]]*10000" "sum(x,[])"
10 loops, best of 3: 885 msec per loop

After patch:
$ ./python -mtimeit --setup="x=[[1,2,3]]*10000" "sum(x,[])"
1000 loops, best of 3: 524 usec per loop

[2] Python 2.7.5 with patch:
$ ./python -mtimeit --setup="x=[[1,2,3]]*10000" "[i for l in x for i in l]"
1000 loops, best of 3: 1.94 msec per loop

[3] Python 2.7.5 with patch:
$ ./python -mtimeit --setup="x=[[1,2,3]]*10000" --setup="from itertools import chain" "list(chain.from_iterable(x))"
1000 loops, best of 3: 821 usec per loop
$ ./python -mtimeit --setup="x=[[1,2,3]]*10000" --setup="from itertools import chain" "list(chain(*x))"
1000 loops, best of 3: 1.03 msec per loop

[4] Python 2.7.5 with patch and string check removed:
$ ./python -mtimeit --setup="x=['a']*10000" "sum(x,'')"
100 loops, best of 3: 3.98 msec per loop
$ ./python -mtimeit --setup="x=['a']*10000" "''.join(x)"
10000 loops, best of 3: 170 usec per loop