[Python-ideas] Fast sum() for non-numbers

Haoyi Li haoyi.sg at gmail.com
Thu Jul 4 12:01:45 CEST 2013


Random thought: this kind of "different implementations for different
types" is exactly what PEP 443 (http://www.python.org/dev/peps/pep-0443/)
is about; it would save you from having a nasty big chunk of if-elses
within sum() itself, and would let other people incrementally implement
special sums for their own data types without having to muck with the
stdlib code.
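
For example, here's a rough sketch using PEP 443's
functools.singledispatch (slated for Python 3.4). Note that
singledispatch dispatches on the type of the first argument, so the
start value comes first here; typed_sum is a hypothetical name, not a
proposed API:

    from functools import singledispatch  # new in Python 3.4

    @singledispatch
    def typed_sum(start, seq):
        # Generic fallback: plain left-fold with +, like today's sum().
        total = start
        for item in seq:
            total = total + item
        return total

    @typed_sum.register(list)
    def _(start, seq):
        # Specialized O(n) path when start is a list.
        result = list(start)
        for item in seq:
            result.extend(item)
        return result

    typed_sum([], [[1, 2], [3]])   # -> [1, 2, 3]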


On Thu, Jul 4, 2013 at 5:54 PM, Sergey <sergemp at mail.ru> wrote:

> On Jul 04, 2013 Steven D'Aprano wrote:
>
> This message is long, so here's its short summary:
> * Unfortunately list.extend() does not look like the obvious way, and
>   it's slower than the alternatives.
> * The original patch reduces memory use rather than increasing it;
>   it cannot cause a MemoryError.
> * sum() can *always* be fast! (patch and tests)
> * Summing linked lists is O(n), where n is the number of lists to add.
> * using __add__/__iadd__ for containers is a useful feature
>
> > How about the "Obvious Way" to concatenate lists?
> > new = []
> > for x in seq:
> >      new.extend(x)
>
> It's 200% slower than the patched sum(), 50-100% slower than both
> itertools variants, and only 25% faster than a list comprehension [1].
> It's not even mentioned among the most popular answers on list
> flattening:
>
> http://stackoverflow.com/questions/952914/making-a-flat-list-out-of-list-of-lists-in-python
>
> It's slower, less known, and takes many more characters to type; hey,
> it's not even a one-liner! :) What makes you think this is the
> "Obvious Way"?
>
> > -0 on the general idea, -1 on the specific implementation. I'd rather
> > have sum of lists be slow than risk sum of numbers raise MemoryError.
>
> You must be misunderstanding something, or maybe I've explained it
> poorly. Numbers take a different code path in sum(), one that my patch
> does not touch. But even if it did, my patch never makes a copy of the
> original list; it can only reduce the amount of memory used, never
> increase it. There was an alternative idea (which I never implemented)
> of making a copy of the `start` variable, but not of the list.
>
> > sum simply cannot *always* be fast. E.g. summing tuples will still
> > be slow even with your suggestion.
>
> Yes, it can! That's the point of the original idea!
>
> The original patch [2] optimizes lists, because it was easy to do.
> But nothing stops you from optimizing the other types as well. For
> example, this patch [3] optimizes lists, tuples and strings.
>
> Basically it works by storing the temporary result in a list, and then
> converting it to a tuple or string in one shot if needed.
>
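> In pure Python the idea looks roughly like this (a sketch of the
> approach, not the actual C code in the patch; fast_sum_special is a
> hypothetical name):
>
>   def fast_sum_special(seq, start):
>       # Accumulate everything into a temporary list first...
>       tmp = list(start)
>       for item in seq:
>           tmp.extend(item)
>       # ...then convert back to the type of `start` in one shot.
>       if isinstance(start, tuple):
>           return tuple(tmp)
>       if isinstance(start, str):
>           return ''.join(tmp)
>       return tmp
>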
> Using this patch you get the O(n) sum for lists, tuples and strings
> [4]. And combining it with original patch you get the fastest sum()
> possible.
>
> Even at O(n) for strings, sum() is still slower than ''.join(), but
> now only by a constant factor. I can't beat ''.join() because of
> function call overhead: internally join() converts the sequence into
> a tuple, saving a lot of calls at the cost of a lot of extra memory.
> That's why
>   ''.join('' for _ in xrange(100000000))
> eats ~1GB of RAM before giving you an empty string.
>
> > I can't gather any enthusiasm for optimizing sum to speed up
> > concatenation.
>
> No problem, just tell me which approach you think is best and I'll
> try implementing it.
>
> > I'm hostile to any attempt to encourage people to treat sum() as
> > the preferred way to concatenate large amounts of data, because
> > that will surely lead them into bad habits and will end with them
> > trying to sum() a lot of tuples or linked lists or something and
> > getting O(n**2) performance.
>
> Why not just make sum() O(n) for everything? I've already done that
> for lists, tuples and strings. As for linked lists, the point of a
> linked list is fast insertion, so any decent implementation should
> store pointers to both its head and its tail, implement an O(1)
> __iadd__ using the tail pointer, and thus fall under my first patch.
> There's not much sense in a [singly] linked list that has no O(1)
> __iadd__.
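>
> For illustration, a minimal such list might look like this (a sketch;
> Node and LinkedList are hypothetical, not from any patch):
>
>   class Node(object):
>       def __init__(self, value):
>           self.value = value
>           self.next = None
>
>   class LinkedList(object):
>       def __init__(self):
>           self.head = self.tail = None
>
>       def append(self, value):
>           # O(1) insertion at the tail.
>           node = Node(value)
>           if self.tail is None:
>               self.head = node
>           else:
>               self.tail.next = node
>           self.tail = node
>
>       def __iadd__(self, other):
>           # O(1) concatenation: splice `other` on via the tail
>           # pointer (the two lists share nodes afterwards).
>           if other.head is not None:
>               if self.tail is None:
>                   self.head = other.head
>               else:
>                   self.tail.next = other.head
>               self.tail = other.tail
>           return self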
>
> > Yes. If Python used & for concatenation, we wouldn't have to worry
> > about sum(lists or strings or tuples) being quadratic, because people
> > wouldn't call sum on lists, strings or tuples.
>
> Heh. If Python had no sum() we wouldn't have to worry about people
> using it. If Python had no lists we wouldn't have to worry about
> people concatenating them. If there was no Python we wouldn't have
> to worry at all. But the world would be poorer without all these great
> things...
>
> Seriously, I miss __add__ for set(). I needed it when I had a
> dictionary like {x: set(...), ...} and wanted a set of all the values
> from it. I wanted to use sum(dict.values()), which would have been
> easy and obvious, but I couldn't, because set() does not support
> __add__. So I had to write a few lines of loop instead of a few
> characters. :(
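>
> (For the record, the values can be unioned in a single call, though
> it's less discoverable than sum() would have been:
>
>   d = {'x': set([1, 2]), 'y': set([2, 3])}
>   all_values = set().union(*d.values())   # set([1, 2, 3])
>
> where d is a stand-in for the dictionary above.)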
>
> --
>
> [1] Python 2.7.5 with patch, testing list.extend():
> $ ./python -mtimeit --setup="x=[[1,2,3]]*10000" "sum(x,[])"
> 1000 loops, best of 3: 531 usec per loop
>
> $ ./python -mtimeit --setup="x=[[1,2,3]]*10000" "[i for l in x for i in l]"
> 1000 loops, best of 3: 1.94 msec per loop
>
> $ ./python -mtimeit --setup="x=[[1,2,3]]*10000" \
>     --setup="from itertools import chain" "list(chain.from_iterable(x))"
> 1000 loops, best of 3: 820 usec per loop
>
> $ ./python -mtimeit --setup="x=[[1,2,3]]*10000" \
>     --setup="from itertools import chain" "list(chain(*x))"
> 1000 loops, best of 3: 1.03 msec per loop
>
> $ ./python -mtimeit --setup="x=[[1,2,3]]*10000" \
>     "l = []" "for i in x: l.extend(i)"
> 1000 loops, best of 3: 1.53 msec per loop
>
> [2] http://bugs.python.org/file30705/fastsum.patch
>
> [3] http://bugs.python.org/file30769/fastsum-special.patch
>     http://bugs.python.org/issue18305#msg192281
>
> [4] Python 2.7.5, testing new fastsum-special.patch:
> === Lists ===
> No patch:
>   $ ./python -mtimeit --setup="x=[[1,2,3]]*10000" "sum(x,[])"
>   10 loops, best of 3: 885 msec per loop
> fastsum.patch:
>   $ ./python -mtimeit --setup="x=[[1,2,3]]*10000" "sum(x,[])"
>   1000 loops, best of 3: 524 usec per loop
> fastsum-special.patch:
>   $ ./python -mtimeit --setup="x=[[1,2,3]]*10000" "sum(x,[])"
>   1000 loops, best of 3: 298 usec per loop
> Result: 3000 times faster.
>
> === Tuples ===
> No patch:
>   $ ./python -mtimeit --setup="x=[(1,2,3)]*10000" "sum(x,())"
>   10 loops, best of 3: 585 msec per loop
> fastsum.patch:
>   $ ./python -mtimeit --setup="x=[(1,2,3)]*10000" "sum(x,())"
>   10 loops, best of 3: 585 msec per loop
> fastsum-special.patch:
>   $ ./python -mtimeit --setup="x=[(1,2,3)]*10000" "sum(x,())"
>   1000 loops, best of 3: 536 usec per loop
> Result: 1000 times faster.
>
> === Strings ===
> No patch (just string check removed):
>   $ ./python -mtimeit --setup="x=['abc']*100000" "sum(x,'')"
>   10 loops, best of 3: 1.52 sec per loop
> fastsum.patch (+ string check removed):
>   $ ./python -mtimeit --setup="x=['abc']*100000" "sum(x,'')"
>   10 loops, best of 3: 1.52 sec per loop
fastsum-special.patch:
>   $ ./python -mtimeit --setup="x=['abc']*100000" "sum(x,'')"
>   10 loops, best of 3: 27.8 msec per loop
> join:
>   $ ./python -mtimeit --setup="x=['abc']*100000" "''.join(x)"
>   1000 loops, best of 3: 1.66 msec per loop
> Result: 50 times faster, but still slower than join() by a constant
> factor.