[Python-ideas] Fast sum() for non-numbers

Ron Adam ron3200 at gmail.com
Tue Jul 9 16:39:38 CEST 2013



On 07/09/2013 03:51 AM, Andrew Barnert wrote:
> From: Stephen J. Turnbull <stephen at xemacs.org>
>
> Sent: Monday, July 8, 2013 11:23 PM
>
>
>> Andrew Barnert writes:
>>
>>> Meanwhile sum is the obvious way to sum things that are obviously
>>> summable (numbers, matrices, etc.), and nothing else.
>>
>> My intuition matches yours, but I find this argument (and the rest of
>> the arguments that say that "generic sum() is unobvious and wrong")
>> logically unsatisfactory.  It would be nice if you could provide a
>> plausible definition of "summable" other than "__add__() is
>> implemented".  I don't have one. :-(
>
>
> As I see it, there are three possibilities.
>
> 1. sum is not appropriate when __add__ means concatenation rather than
> adding. If you'd use PySequence_Concat/sq_concat rather than
> PyNumber_Add/nb_add in porting to the C API, or if you'd use a different
> operator in Python 4 if concatenation stopped being written as __add__,
> then you shouldn't use sum. The problem is usually, you're not writing C
> API code or Python 4, you're writing Python 3, so it's not always
> obvious what the facts are. But I don't think it's ever that hard to
> figure out. If we used & for concatenation, list.__add__ and str.__add__
> would no longer exist, but np.matrix.__add__, datetime.__add__, and
> quaternion.__add__ would.

What if we could specify the degree of specialisation of an operator?

Kind of like a cast operation, but instead, it works on the operator 
instead of the value?

     v = a +(int) b             # v = int.__add__(a, b)
     v = a +(abc.Numbers) b     # ?

     v = a +(str) b             # v = str.__add__(a, b)
     v = a +(abc.Container) b   # ?


The default operation would be...

     v = a +(type(a)) b         #    v = type(a).__add__(a, b)
                                # or v = a.__add__(b)


I don't think ABC's are defined/refined enough to make this work as they 
are, but could they be?   (I'm not that familiar with ABC's yet.)

Ron


> 2. sum is not appropriate iff chain.from_iterable makes sense.* Needing
> a list doesn't make chain unusable here any more than it does with map,
> zip, etc.; just pass the result to list. But "You can't iterate ints" or
> "I'm treating these np.matrix 3-vectors as atomic objects, not
> collections" does mean chain is unusable, so sum is the answer.
>
> 3. sum is not appropriate when 0 doesn't make sense as a start value.
> Summing things means, by default, starting with 0 and adding repeatedly.
> You can provide a non-default start value, but it should be "similar to
> 0" or "compatible with 0" in some way. Note that you can add 0 to an
> int, a float, a quaternion, a numpy.matrix, and all kinds of other types
> that "act like numbers". And that means that testing 0+start or start+0
> is a pretty good test for summable. This is admittedly imperfect,** but
> it's pretty close, and very concrete and simple.
>
> Personally, I'm leaning toward 2. If you come up with a type that is
> addable, and isn't iterable (or does the wrong thing when iterated), why
> not, it's summable. Better to let some corner-case false positives slip
> in than the reject some corner-case false negatives (as with 3), or to
> make them just impossible to decide (as with 1).
>
>
> * This is a little too loose. Surely there are types where __add__ is
> not addition, but also not iterable concatenation, right? But I don't
> think sum is an attractive nuisance there, unlike in the case with
> concatenation. Meanwhile, what about strings? Actually, chain makes
> perfect sense with strings; it's just that usually, you're just going to
> want to pass the iterable to ''.join, and if you're doing that, you can
> just pass the original strings to ''.join. So, no problem there.
>
> ** Most notably, I think adding a sequence of timedelta objects with a
> datetime start makes sense, and you can't add 0 to a datetime. Really,
> what you need here is a way to say that start + 0 * peek(iterable) is
> also acceptable, not just start + 0—and you can justify that more
> rigorously in terms of fields—but that's nearly impossible to implement,
> and way too complicated to explain. So, option 3 will reject some valid
> types. _______________________________________________ Python-ideas
> mailing list Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



More information about the Python-ideas mailing list