[Python-ideas] Fast sum() for non-numbers

Sergey sergemp at mail.ru
Fri Jul 12 03:34:19 CEST 2013


On Jul 9, 2013 Andrew Barnert wrote:

>> 1. Make sum faster for everything BUT tuples and write in a manual:
>>    ...
>> 2. Implement just one (!) special case for the only type in python
>>   needing it and write:
>>    ...
>> 
>> I chose #2. Tuple is one of the most frequently used types in python,
>> and it's the only type that needs such a special case.
>> 
>> Until someone writes a better solution: Practicality beats purity.
>> That was my motivation.
> 
> #3 is for the docs to say what they currently say--sum is fast for
> numbers--but change the "not strings" to "not sequences", and maybe
> add a note saying _how_ to concatenate sequences (usually join for
> strings and chain for everything else).

Which effectively means doing nothing and agreeing that a slow sum is
best for python, despite the fact that we CAN make it faster. For all
cases, or for most of them; but we could at least start discussing
options, instead of repeating excuses.
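
For context, the documented alternatives mentioned above look like this:

```python
from itertools import chain

# The documented fast path for strings: str.join, O(total length).
s = "".join(["foo", "bar", "baz"])

# And for other sequences: itertools.chain, materialized with the
# desired constructor, also O(total length).
flat_list = list(chain.from_iterable([[1, 2], [3], [4, 5]]))
flat_tuple = tuple(chain.from_iterable([(1,), (2, 3)]))
```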

>> Theoretically it's possible to rewrite a tuple type to internally use
>> list type for storage, and additionally have a `localcount` variable
>> saying how many items from that list belong to this tuple. Then
>> __add__ for such tuple would just create a new tuple with exactly
>> same internal list (with incremented ref.count) extended. This way,
>> technically, __add__ would modify all the tuples around, but due to
>> internal `localcount` variable you won't notice that.
> 
> I was going to point out the side effects of such a change, but
> someone beat me to it.

Yes, as a side-effect it would sometimes use less memory (and it could
also use more memory, although I can't think of any real-world cases
where that would be a problem). But the main goal was met: it would
make tuple O(N) summable. So:

>> Would you like such a patch instead? Would you want to write it? ;)
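
Roughly, such a patch could look like this in pure Python (SharedTuple,
_store, and _localcount are hypothetical names; hashing, slicing, and
the C-level details are ignored):

```python
class SharedTuple:
    """Immutable view over a shared backing list.

    `_localcount` records how many items of `_store` belong to this
    view, so extending the shared store doesn't change existing views.
    """
    def __init__(self, items=()):
        self._store = list(items)
        self._localcount = len(self._store)

    def __add__(self, other):
        if self._localcount == len(self._store):
            # This view owns the end of the store: extend it in place
            # and share it with the result -- amortized O(len(other)).
            self._store.extend(other)
            result = SharedTuple.__new__(SharedTuple)
            result._store = self._store
            result._localcount = len(self._store)
            return result
        # Someone else already extended the store: fall back to a copy.
        return SharedTuple(list(self) + list(other))

    def __iter__(self):
        return iter(self._store[:self._localcount])

    def __len__(self):
        return self._localcount
```

With this, `sum(parts, SharedTuple())` walks each element once, i.e.
it is O(N) in the total length.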

>> It's just this patch only optimizes add, which is ONLY needed for
>> many sequential adds, i.e. for sum(), so I thought that it would be
>> MUCH easier to add it to sum instead.
> 
> And it's even easier to add neither.

It was even easier to not create python at all; then we wouldn't have
to spend our time here discussing it. It's always easy to do nothing.
But sometimes it takes so much time to do something good...

>> Which is just one type — tuple. There are no other summable standard
>> types in python having O(N) __add__, are there?
> 
> Does nobody ever use types from the stdlib, third-party libs, or
> their own code in your world? Builtin types are not generally
> magical. A function that works well for all builtin types, but not
> types in the stdlib, is a recipe for confusion.

What types from the stdlib does it not work for?

>>> So, if you're suggesting that sum can be fast for anything
>>> reasonable, you're just wrong.
>> 
>> I suggested two ways how to do that. First, one can use the approach
>> above, i.e. use mutable type internally. Second, for internal cpython
>> types we have a simpler option to implement optimization directly
>> in sum(). And there may be many others, specific to the types in
>> question.
> 
> Using a mutable type internally will almost always have side
> effects, or at least complicate the implementation.

Yes, optimization may or may not complicate the implementation. Nothing
new here. For example, "optimizing" tuple most probably won't complicate
the implementation; it would just make it different, but faster at __add__.

> What you're suggesting is that theoretically, for some different
> language that placed a very strange and sometimes hard to meet
> requirement on all types, sum could be fast for all types. That
> doesn't mean sum can be fast for all Python types, because Python
> doesn't, and shouldn't, have such a requirement.

No. What I suggested is that sum() could be faster for SOME types.
And I provided a patch for that. But you (or someone else) said
that I couldn't make sum fast for other types, for example for tuples.
So I provided a patch doing that. Then you said that I couldn't do
that for tuples without patching sum. And I explained how it could
be done too.

I never put any requirements.

It's just that you, for some reason, placed a strange requirement on
sum() that it must always remain slow...

> And again, what would be the benefit?

Hm. Benefits of O(N)-summable builtin types?
* No more surprises "Oh, sum() is O(N**2). Why?"
* People can use whatever builtin functions they want with any built-in
  types they want in any way obvious to them, and nobody would tell them
  that it will be too slow. A little slower, maybe, but not too much.
* Programs become faster
* Code becomes simpler and easier to read
* Python becomes a better language
* More people start using python
* Everybody's happy, world peace, etc.
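
The surprise in the first bullet is easy to demonstrate; here is a small
sketch comparing today's sum() with the linear itertools alternative
(timings vary by machine, so none are hardcoded here):

```python
from itertools import chain

lists = [[i] for i in range(1000)]

# sum() builds a brand-new list on every +, so this is O(N**2) overall.
quadratic = sum(lists, [])

# chain.from_iterable touches each element once: O(N) overall.
linear = list(chain.from_iterable(lists))
```

Both produce the same result; only the asymptotics differ.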

>> I don't remember saying that every collection type in a world is
>> O(N) summable, but ok. Would you agree that all summable built-in
>> collection types of python become O(N) summable with "my design"?
> 
> Yes, but if it's impossible to add a new collection type that
> works like tuple, that makes python worse, not better.

Why would it be impossible? In the worst case it would be hard (but not
impossible) to add a new collection type that is as fast as tuple;
so what? It is already hard to do that, nothing new there.

>> I.e. they were not O(N) summable before my patch, but they are O(N)
>> after it. Then why don't you like the patch?
>> 
>> Because somewhere in the world there could theoretically exist some
>> other types that are still not O(N) summable?
> 
> No, because all over the world there already actually exist such
> types, and because it's trivial--and useful--to create a new one.

Could you show me some code where a lot of those custom summable
sequences are added together? Or maybe an example of someone
complaining about sum being slow for them? No examples? Then this is
not a problem, right? ;)

>> Maybe, then we (or
>> their authors) will deal with them later (and I tried to show you
>> the options for your single linked list example).
> 
> And you only came up with options that destroy vital features of
> the type, making it useless for most people who would want to use it.

I did not. Or are you saying that STL developers also destroyed vital
features of your type because they have not implemented std::list the
same way? Has python destroyed those vital features with its list?
Your type is not summable. You asked for a type that is summable.
I suggested one to you, and you did not like it. That's it; nothing is
destroyed.

> But, more importantly, it is very simple to design and implement
> new collection types in Python today. Adding a new requirement that's
> hard enough to reach--one that you haven't been able to pull it off
> for the first example you were given--implies that it would no longer
> be easy.

What requirements are you talking about?

> Then you misunderstood it. Tuples were offered as one example, out
> of many, that won't be fast. Solving that one example by adding a
> special case in the C code doesn't help the general problem

That's why I explained how this can also be done without writing
a special case in sum(). It's just that if you have a goal and two
ways to reach it, one extremely complicated and the other easy, but
otherwise equal, which one will you choose?

>>> But I agree with the sentiment. There is an efficient way to
>>> concatenate a lot of cons lists (basically, just fold backward
>>> instead of forward), but sum is not it.
>> 
>> Hm... If you implement fast __radd__ instead of __add__ sum would
>> use that, wouldn't it? Is that the easy way you were looking for?
> 
> First, how do you propose that sum find out whether adding or
> radding is faster for a given type?
>
> More importantly: that wouldn't actually do anything in this case.
> The key to the optimization is doing the sequence of adds in reverse
> order, not flipping each add.

Yes, I agree, it won't solve the problem right away, since elements
would still be radded left-to-right. Well, you still have at least
two options for your cons-lists: either use a `tail` attribute
of your list, or some global variable. I.e.
  list_of_cons_lists[0].tail = find_tail(list_of_cons_lists[0])
  sum(list_of_cons_lists)
and use `tail` in your __add__ implementation. It's that simple.
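
A minimal sketch of that first option (ConsList and find_tail are
hypothetical names; note that, as discussed above, __add__ mutates the
left operand):

```python
class ConsList:
    """Singly linked list node; `tail` caches the last node of the chain."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next
        self.tail = None  # cached last node, set before summing

    def __add__(self, other):
        # O(1) append when `tail` is cached, instead of walking the chain.
        tail = self.tail if self.tail is not None else find_tail(self)
        tail.next = other
        self.tail = other.tail if other.tail is not None else find_tail(other)
        return self

def find_tail(node):
    # Walk to the last node of the chain.
    while node.next is not None:
        node = node.next
    return node
```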

Or, since you're writing a custom type anyway, and you have to
specify start class for sum() you can use it to hold a tail for you.
Like:
  class ConsListsSum:
    def __init__(self):
      self.head = None  # first node of the accumulated result
      self.tail = None  # last node, so each add only walks `other`
    def __add__(self, other):
      if self.tail is None:
        self.head = self.tail = other
      else:
        self.tail.next = other
      while self.tail.next is not None:
        self.tail = self.tail.next
      return self

  sum(list_of_cons_lists, ConsListsSum()).head

In that case to use sum you won't need __add__ in your list at all.

>> Then, what way you suggest to be preferred? For example how would you
>> do that in py4k? I guess you would prefer sum() to reject lists and
>> tuples as it now rejects strings and use some other function for them
>> instead? Or what? What is YOUR preferred way?
> 
> I've already answered this, but I'll repeat it.
> 
> sum is not the obvious way to concatenate sequences today, and my
> preferred way is for that to stay true. So, I'm: [...]

You're saying what IS NOT your preferred way. But I'm asking what IS
your preferred way. Do you prefer to have a slow sum in python and
people asking why it's slow forever? Do you see that as the best
possible outcome for python?

>> We have 'list' and 'listiterator', 'tuple' and 'tupleiterator', 'set'
>> and 'setiterator'. Nothing unusual here. And no issues about them.
> 
> But they aren't the same kind of thing at all. I don't want to
> explain the differences between what C++ calls iterators and what
> Python calls iterators unless it's necessary. But briefly, a
> std::list::iterator is a mutable reference to a node.

And std::list::const_iterator is not. So?

> Exposing that type means--as I've already said twice--that you end
> up with exactly the same problems you have exposing the node
> directly.

No, I won't, unless I make it mutable too. And I don't have to.

>> Not all of them, but some. I.e. if you used your cons-lists as queue
>> or stack, then deque is a good replacement. 
> 
> Well, yes, but a dynamic array like Python's list is also a
> perfectly good stack. So what?

Nothing. It's just that you said:
  >>>> If you make the lists and nodes separate types, and
  >>>> the nodes private, you have to create yet a third type,
  >>>> like the list::iterator type in C++
So I answered that I don't always have to do that. I may not need
an iterator for some of the cons-list use cases.

>> Really, I don't understand that point. Are you saying, that sum()
>> must remain slow for FREQUENTLY USED standard types just because
>> there MAY BE some other types for which it would still be slow?
> 
> You're twisting the emphasis drastically, but basically yes.

And that's the only reason? What if I solve that too? E.g. what if
there were a common way exposed to all the types in the world?
For example, imagine a special case (or is it a "general case" now)
like this:
  if hasattr(type(start), "__init_concatenable_sequence_from_iterable__"):
    optimize_for = type(start)
    l_result = list()
    l_result.extend(start)
    try:
      while True:
        item = next(it)  # `it` is sum's iterator over the remaining items
        if type(item) != optimize_for:
          # a different type appeared: rebuild `start` from what we
          # collected so far and fall through to the generic loop
          start = optimize_for.__init_concatenable_sequence_from_iterable__(l_result)
          start = start + item
          break
        l_result.extend(item)
    except StopIteration:
      return optimize_for.__init_concatenable_sequence_from_iterable__(l_result)

In that case every type would be able to implement an optional
__init_concatenable_sequence_from_iterable__ method to benefit from
the optimized sum(). If they don't, sum() would use the general code.
And of course it would be trivial to implement it for e.g. tuples.
Is that what you want?
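
For instance, under that (hypothetical) protocol a tuple-like type
could implement the hook in one line; fast_sum below is a standalone
sketch of the special case above, not real sum() internals:

```python
class MyTuple(tuple):
    # Hypothetical hook: rebuild an instance from an already-flattened
    # list of items.
    @classmethod
    def __init_concatenable_sequence_from_iterable__(cls, iterable):
        return cls(iterable)

def fast_sum(seqs, start):
    # `seqs` must be a re-iterable sequence here, to keep the fallback
    # simple; real sum() works on a one-shot iterator.
    hook = getattr(type(start), "__init_concatenable_sequence_from_iterable__", None)
    if hook is None or any(type(s) is not type(start) for s in seqs):
        return sum(seqs, start)  # general O(N**2) path
    l_result = list(start)
    for item in seqs:
        l_result.extend(item)    # amortized O(len(item)) per step
    return hook(l_result)
```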

> Today, sum is not the best way to concatenate sequences. Making it
> work better for some sequences but not others would mean it's still
> not the best way to concatenate sequences, but it would _appear_ to
> be. That's the very definition of an attractive nuisance.

Today python is not the best language in the world, because it still
has bugs. Fixing some bugs but not others would mean that it's still
not the best language in the world, but it would _appear_ to be. So,
let's never fix any bugs?

Don't you think that your logic is flawed? ;)

>>> Using a global variable (or a class attribute, which is the
>>> same thing) means that sum isn't reentrant, or thread-safe, or
>>> generator-safe.
>> 
>> Is it now? No? Then what changes?
> 
> Yes, it is now. So that's what changes.

sum() uses Py_DECREF. Py_DECREF is not thread-safe. It means that
sum() is not thread-safe either. Where am I wrong?

> Again, this is something general and basic--operations that use
> global variables are not reentrant--and I can't tell whether you're
> being deliberately obtuse or whether you really don't understand that.

Well, if you don't want to use a global variable, don't use it. :)


