[Python-ideas] Intermediate Summary: Fast sum() for non-numbers

David Mertz mertz at gnosis.cx
Sun Jul 14 22:42:48 CEST 2013


On Sun, Jul 14, 2013 at 12:26 PM, Sergey <sergemp at mail.ru> wrote:

> * Sum is not obvious (for everyone) way to add lists, so people
>   should not use it, as there're alternatives, i.e. instead of
>   - sum(list_of_lists, [])
>   one can use:
>   - reduce(operator.iadd, list_of_lists, [])
>   - list(itertools.chain.from_iterable(list_of_lists))
>   - result = []
>     for x in list_of_lists:
>         result.extend(x)
>

It seems to me that in order to make sum() look more attractive, Sergey
presents ugly versions of alternative ways to (efficiently) concatenate
sequences.

One can make these look much nicer, e.g. (assuming there is a 'from
itertools import chain' at the very top of the file, which is the sensible
place to put it):

  # If 'list_of_lists' really is as it is named, there is no need to treat
  # it as a generic iterable.  Moreover, one doesn't usually need to make an
  # actual instantiated list from chain() for most purposes.  Note that
  # chain() takes each iterable as a separate argument, hence the '*'.  So:
  flat = chain(*list_of_lists)

  # If we do start with an iterable of lists, but know it isn't infinite,
  # just use:
  flat = chain(*iter_of_lists)

If it is really needed, of course chain.from_iterable() can be used,
although the only time you'd want that is when the iterable is potentially
infinite.  In that case you *definitely* don't want to make it back into
a list either, just:

  inf_flat = chain.from_iterable(endless_lists)
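
To make the difference concrete, here is a minimal sketch.  The sample data
and the endless_lists() generator are made up for illustration and are not
from Sergey's post.  chain() treats each positional argument as one
iterable, so the star is what spreads the inner lists out, while
chain.from_iterable() fetches the inner lists lazily and so copes with an
endless source:

```python
from itertools import chain, count, islice

list_of_lists = [[1, 2], [3], [4, 5, 6]]   # illustrative data

# Star-unpacking passes each inner list to chain() as a separate argument.
flat = chain(*list_of_lists)
flat_list = list(flat)          # instantiate only because we want to look

def endless_lists():
    """Hypothetical infinite source: [0, 0], [1, 1], [2, 2], ..."""
    for n in count():
        yield [n, n]

# from_iterable() pulls inner lists one at a time, so it never tries to
# consume the whole generator up front (which '*' would have to do).
inf_flat = chain.from_iterable(endless_lists())
first_five = list(islice(inf_flat, 5))
```

Here flat_list holds the six numbers in order, and first_five takes just
the first five items from the endless stream.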

Another approach in one of the links Sergey gave is nice too, and shorter
and more elegant than any of his alternatives:

  flat = []
  map(flat.extend, list_of_lists)

Using map() for a side effect is slightly wrong, but this is short,
readable, and obvious in purpose.
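
One caveat, assuming a Python 3 interpreter, where map() returns a lazy
iterator: the map() line on its own does nothing until something consumes
it.  A small sketch of that behaviour, with made-up data, next to the plain
loop that avoids the issue:

```python
list_of_lists = [[1, 2], [3], [4, 5, 6]]   # illustrative data

flat = []
lazy = map(flat.extend, list_of_lists)   # Python 3: nothing has run yet
before = list(flat)                      # snapshot, still empty

for _ in lazy:                           # consuming the iterator calls extend()
    pass
after = list(flat)                       # now fully populated

# The plain loop has the same effect without leaning on map() at all.
flat2 = []
for x in list_of_lists:
    flat2.extend(x)
```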

On the other hand, as I've said before, when I read:

  flat = sum(list_of_lists, [])

It just looks WRONG!  Yes, I know why it works, because of some quirks of
Python internals.  But it absolutely doesn't *read* like it should mean
what it does or that it should necessarily even work at all.  The word SUM
is self-evidently and intuitively about *adding numbers* and *not* about
"doing something that is technically supported because other things have an
.__add__() method".
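
For readers who have not seen why it works, a short sketch follows.  The
Tag class is hypothetical and exists only to show that sum() will accept
anything that supports '+'.  The one exception I am sure of is str, which
sum() refuses outright:

```python
list_of_lists = [[1, 2], [3], [4, 5, 6]]   # illustrative data

# Evaluated as (([] + [1, 2]) + [3]) + [4, 5, 6], via list.__add__().
flat = sum(list_of_lists, [])

class Tag:
    """Hypothetical non-numeric class whose only trick is supporting '+'."""
    def __init__(self, names):
        self.names = tuple(names)

    def __add__(self, other):
        return Tag(self.names + other.names)

combined = sum([Tag(["a"]), Tag(["b"])], Tag([]))

# Strings are special-cased: sum() raises TypeError and points at join().
try:
    sum(["ab", "cd"], "")
    str_error = None
except TypeError as exc:
    str_error = exc
```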

As various people have observed, if Python used some other operator for
concatenation, we wouldn't be having this discussion at all.  E.g. if we
had:

  concat = [1, 2, 3] . [4, 5, 6]

Then we might have a method called .__concat__() on various collections.
Conceptually that really is what Python is doing now.  It's just that Guido
made the very reasonable decision that the symbol "+" was something users
could intuitively read as meaning concatenation when appropriate, but as
addition in other cases.
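
As it happens, the standard operator module already draws this line:
operator.concat(a, b) is a + b restricted to sequences.  A brief sketch
with illustrative data (note that reducing with concat builds a new list
at every step, so it is no faster than sum()):

```python
import operator
from functools import reduce

list_of_lists = [[1, 2], [3], [4, 5, 6]]   # illustrative data

# concat() spells out "concatenate" even though it uses '+' underneath.
flat = reduce(operator.concat, list_of_lists, [])

# add() accepts numbers, concat() rejects them.
added = operator.add(1, 2)
try:
    operator.concat(1, 2)
    concat_error = None
except TypeError as exc:
    concat_error = exc
```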

I definitely wouldn't prefer some operator other than '+' to concatenate
sequences.  However, I think possibly if I had a time machine I might go
back and change the spelling of .__add__() to .__plus__().  That might more
clearly indicate that we don't really mean "mathematical addition" but
rather simply "what the plus sign does".


-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.