I know I have nothing to decide here, since I'm not a contributor, just a silent watcher on this list.
However, I just wanted to point out that I fully agree with Chris Barker's position; I couldn't have stated
it better. Performance should be an interpreter implementation issue, not a language issue.
 
> 2) add a special case for strings that is fast and efficient -- may be as simple as calling "".join() under the hood -- no more code than the exception check.
I would give it a +1, if my opinion counts for anything.
 
Cheers
 
Stefan
 
 
Sent: Tuesday, 12 August 2014 at 21:11
From: "Chris Barker" <chris.barker@noaa.gov>
To: (no recipient)
Cc: "Python Dev" <python-dev@python.org>
Subject: Re: [Python-Dev] sum(...) limitation
On Mon, Aug 11, 2014 at 11:07 PM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
I'm referring to removing the unnecessary information that there's a
better way to do it, and simply raising an error (as in Python 3.2,
say) which is all a RealProgrammer[tm] should ever need!
 
I can't imagine anyone is suggesting that -- disallow it, but don't tell anyone why? 
 
The only thing that is remotely on the table here is:
 
1) remove the special case for strings -- buyer beware -- but consistent and less "ugly"
 
2) add a special case for strings that is fast and efficient -- may be as simple as calling "".join() under the hood -- no more code than the exception check.
 
And I doubt anyone really is pushing for anything but (2); a rough sketch of what that might look like follows below.
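For concreteness, here is a pure-Python sketch of what (2) could look like (the name my_sum is made up, this is not the actual C implementation, and it glosses over bytes and other details a real patch would have to handle):

    def my_sum(iterable, start=0):
        # Sketch only: if the start value is a string, build the result with
        # the linear-time str.join instead of repeated concatenation.
        if isinstance(start, str):
            return start + "".join(iterable)
        total = start
        for item in iterable:
            total = total + item
        return total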
 
Stephen Turnbull wrote:
  IMO we'd also want a homogeneous_iterable ABC
 
Actually, I've thought for years that that would open the door to a lot of optimizations -- but that's a much broader question than sum(). I even brought it up, probably over ten years ago, but no one was the least bit interested -- nor are they now. I know this was a rhetorical suggestion to make the point about what not to do....
 
  Because obviously we'd want the
attractive nuisance of "if you have __add__, there's a default
definition of __sum__" 
 
Now I'm confused -- isn't that exactly what we have now?
 
It's possible that Python could provide some kind of feature that
would allow an optimized sum function for every type that has __add__,
but I think this will take a lot of thinking.
 
Does it need to be every type? As it is, the common ones work fine already, except for strings -- so if we add an optimized string sum(), then we're done.
 
 *Somebody* will do it
(I don't think anybody is +1 on restricting sum() to a subset of types
with __add__). 
 
Uhm, that's exactly what we have now -- you can use sum() with anything that has an __add__, except strings. And by that logic, if we thought there were other inefficient use cases, we'd restrict those too.
 
But users can always define their own classes that have a __sum__ and are really inefficient -- so unless sum() becomes just for a certain subset of built-in types (does anyone want that?), we are back to the current situation:
 
sum() can be used for any type that has an __add__ defined.
 
But naive users are likely to try it with strings, and that's bad, so we want to prevent that, and have a special case check for strings.
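For reference, this is what the special case currently does in CPython (the exact wording of the message may vary between versions):

    >>> sum(['a', 'b', 'c'], '')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: sum() can't sum strings [use ''.join(seq) instead]
    >>> ''.join(['a', 'b', 'c'])
    'abc'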
 
What I fail to see is why it's better to raise an exception and point users to a better way than to simply provide an optimization so that it's a moot issue.
 
The only justification offered here is that it will teach people that summing strings (and some other objects?) is O(N^2) and a bad idea. But:
 
a) Python's primary purpose is practical, not pedagogical (not that it isn't great for that)
 
b) I doubt any naive users learn anything other than "I can't use sum() for strings, I should use "".join()". Will they make the leap to "I shouldn't use string concatenation in a loop, either"? Oh, wait, you can use string concatenation in a loop -- that's been optimized. So will they learn: "some types of objects have poor performance with repeated concatenation and shouldn't be used with sum(). So if I write such a class, and want to sum them up, I'll need to write an optimized version of that code"?
 
I submit that no naive user is going to get any closer to a proper understanding of algorithmic order-of-growth behavior from this small hint -- which leaves no reason to prefer an exception to an optimization.
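For anyone who does want to see the quadratic behavior concretely, here is a small illustration -- functools.reduce stands in for what a generic left-to-right sum() would do (sum() itself refuses strings), the in-place concatenation optimization mentioned above doesn't kick in inside operator.add, and the absolute numbers will of course vary by machine and interpreter:

    import timeit
    from functools import reduce
    import operator

    parts = ['x'] * 20000

    # Each step copies everything accumulated so far: roughly O(N^2) work total.
    t_concat = timeit.timeit(lambda: reduce(operator.add, parts, ''), number=1)

    # str.join sizes the result once and copies each piece once: O(N) work total.
    t_join = timeit.timeit(lambda: ''.join(parts), number=1)

    print(t_concat, t_join)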
 
One other point: perhaps this will lead a naive user into thinking -- "sum() raises an exception if I try to use it inefficiently, so it must be OK to use for anything that doesn't raise an exception" -- that would be a bad lesson to mis-learn....
 
-Chris
 
PS: 
Armin Rigo wrote:
It also improves a
lot the precision of sum(list_of_floats) (though not reaching the same
precision levels of math.fsum()).
 
While we are at it, having the default sum() for floats be fsum() would be nice -- I'd rather the default were better accuracy, lower performance. Folks that really care about performance could call math.fastsum(), or really, use numpy...
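The precision difference is easy to show (math.fastsum is hypothetical -- only math.fsum exists today):

    >>> import math
    >>> sum([0.1] * 10)
    0.9999999999999999
    >>> math.fsum([0.1] * 10)
    1.0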
 
This does turn sum() into a function that does type-based dispatch, but isn't Python full of those already? Do something special for the types you know about, and call the generic dunder method for the rest.
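Extending the earlier string sketch with a float path shows the dispatch pattern -- again purely illustrative, not a concrete proposal, and edge cases (empty iterables, mixed int/float data, subclasses) are glossed over:

    import math

    def smarter_sum(iterable, start=0):
        items = list(iterable)
        # Special-case the types we know about...
        if isinstance(start, str):
            return start + ''.join(items)
        if items and all(isinstance(x, float) for x in items):
            return math.fsum(items) + start
        # ...and fall back to the generic __add__ protocol for everything else.
        total = start
        for item in items:
            total = total + item
        return total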
 
 
 
--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov