[Python-Dev] sum(...) limitation

Thu Aug 14 02:10:34 CEST 2014

On Tue, Aug 12, 2014 at 11:21 PM, Stephen J. Turnbull <stephen at xemacs.org>
wrote:

> Redirecting to python-ideas, so trimming less than I might.

Reasonable enough -- you are introducing some more significant ideas for
changes.

I've said all I have to say about this -- I don't seem to see anything
encouraging from core devs, so I guess that's it.

Thanks for the fun bike-shedding...

-Chris

> Chris Barker writes:
>  > On Mon, Aug 11, 2014 at 11:07 PM, Stephen J. Turnbull <stephen at xemacs.org>
>  > wrote:
>  >
>  > > I'm referring to removing the unnecessary information that there's a
>  > >  better way to do it, and simply raising an error (as in Python 3.2,
>  > > say) which is all a RealProgrammer[tm] should ever need!
>  > >
>  >
>  > I can't imagine anyone is suggesting that -- disallow it, but don't tell
>  > anyone why?
>
> As I said, it's a regression.  That's exactly the behavior in Python 3.2.
>
>  > The only thing that is remotely on the table here is:
>  >
>  > 1) remove the special case for strings -- buyer beware -- but consistent
>  > and less "ugly"
>
> It's only consistent if you believe that Python has strict rules for
> use of various operators.  It doesn't, except as far as they are
> constrained by precedence.  For example, I have an application where I
> add bytestrings bytewise modulo N <= 256, and concatenate them.  In
> fact I use function call syntax, but the obvious operator syntax is
> '+' for the bytewise addition, and '*' for the concatenation.
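
A minimal sketch of the operator spelling described above. The class name
ModBytes and its details are hypothetical, not the quoted application's code;
it only illustrates that '+' need not mean concatenation for a bytes-like type.

```python
class ModBytes:
    """Hypothetical wrapper: '+' adds bytewise modulo n, '*' concatenates."""

    def __init__(self, data, n=256):
        if not 1 <= n <= 256:
            raise ValueError("n must be in 1..256")
        self.data = bytes(data)
        self.n = n

    def __add__(self, other):
        # Bytewise addition modulo n; operands must agree in length and modulus.
        if not isinstance(other, ModBytes):
            return NotImplemented
        if len(self.data) != len(other.data) or self.n != other.n:
            raise ValueError("operands must match in length and modulus")
        summed = ((a + b) % self.n for a, b in zip(self.data, other.data))
        return ModBytes(summed, self.n)

    def __mul__(self, other):
        # Concatenation, spelled '*' in this sketch.
        if not isinstance(other, ModBytes):
            return NotImplemented
        if self.n != other.n:
            raise ValueError("operands must share a modulus")
        return ModBytes(self.data + other.data, self.n)


a = ModBytes([250, 3])
b = ModBytes([10, 4])
added = (a + b).data            # bytewise sum modulo 256
joined = (a * b).data           # concatenation
# sum() over such objects follows '+', so it adds bytewise.
summed = sum([a, b], ModBytes([0, 0])).data
```

With a type like this, sum() would perform bytewise addition rather than
concatenation, which is the sense in which '+' carries no fixed meaning.
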
>
> It's not in the Zen, but I believe in the maxim "If it's worth doing,
> it's worth doing well."  So for me, 1) is out anyway.
>
>  > 2) add a special case for strings that is fast and efficient -- may be as
>  > simple as calling "".join() under the hood -- no more code than the
>  > exception check.
>
> Sure, but what about all the other immutable containers with __add__
> methods?  What about mappings with key-wise __add__ methods whose
> values might be immutable but have __add__ methods?  Where do you stop
> with the special-casing?  I consider this far more complex and ugly
> than the simple "sum() is for numbers" rule (and even that is way too
> complex considering accuracy of summing floats).
>
>  > And I doubt anyone really is pushing for anything but (2)
>
> I know that, but I think it's the wrong solution to the problem (which
> is genuine IMO).  The right solution is something generic, possibly a
> __sum__ method.  The question is whether that leads to too much work
> to be worth it (eg, "homogeneous_iterable").
>
>  > > Because obviously we'd want the attractive nuisance of "if you
>  > > have __add__, there's a default definition of __sum__"
>  >
>  > now I'm confused -- isn't that exactly what we have now?
>
> Yes and my feeling (backed up by arguments that I admit may persuade
> nobody but myself) is that what we have now kinda sucks[tm].  It
> seemed like a good idea when I first saw it, but then, my apps don't
> scale to where the pain starts in my own usage.
>
>  > > It's possible that Python could provide some kind of feature that
>  > > would allow an optimized sum function for every type that has
>  > > __add__, but I think this will take a lot of thinking.
>  >
>  > does it need to be every type? As it is the common ones work fine already
>  > except for strings -- so if we add an optimized string sum() then we're
>  > done.
>
> I didn't say provide an optimized sum(), I said provide a feature
> enabling people who want to optimize sum() to do so.  So yes, it needs
> to be every type (the optional __sum__ method is a proof of concept,
> modulo it actually being implementable ;-).
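
One way the optional __sum__ idea could look, as a rough sketch only. No
__sum__ protocol exists in Python; generic_sum and the Text class are
hypothetical names made up for illustration.

```python
def generic_sum(iterable, start=0):
    """Hypothetical sum() that defers to an optional __sum__ hook."""
    # Look the hook up on the type, as is done for other dunder methods.
    hook = getattr(type(start), "__sum__", None)
    if hook is not None:
        return hook(start, iterable)
    # Fallback: the repeated-'+' behaviour that sum() has today.
    total = start
    for item in iterable:
        total = total + item
    return total


class Text(str):
    """Illustrative str subclass that opts in to the hypothetical hook."""

    def __sum__(self, iterable):
        # One linear-time join instead of repeated concatenation.
        return Text(self + "".join(iterable))


numbers = generic_sum([1, 2, 3])               # no hook: plain addition
lists = generic_sum([[1], [2]], [])            # no hook: repeated '+'
text = generic_sum(["a", "b"], Text("x"))      # hook: single join
```

Types that define nothing keep the current behaviour; only a type that
supplies the hook gets an optimized path.
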
>
>  > > *Somebody* will do it (I don't think anybody is +1 on restricting
>  > > sum() to a subset of types with __add__).
>  >
>  > uhm, that's exactly what we have now
>
> Exactly.  Who's arguing that the sum() we have now is a ticket to
> Paradise?  I'm just saying that there's probably somebody out there
> negative enough on the current situation to come up with an answer
> that I think is general enough (and I suspect that python-dev
> consensus is that demanding, too).
>
>  > sum() can be used for any type that has an __add__ defined.
>
> I'd like to see that be mutable types with __iadd__.
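
A sketch of what summing via __iadd__ might mean for mutable types. The
helper name inplace_sum is hypothetical; it copies start so the caller's
object is not mutated, then accumulates with operator.iadd.

```python
import copy
import operator


def inplace_sum(iterable, start):
    """Hypothetical sum() variant that accumulates with '+='."""
    total = copy.copy(start)  # shallow copy: leave the caller's start alone
    for item in iterable:
        # For lists this extends in place; types without __iadd__
        # fall back to ordinary '+'.
        total = operator.iadd(total, item)
    return total


start = []
flat = inplace_sum([[1], [2, 3], [4]], start)
```

For lists each step extends the accumulator instead of building a new list,
so the total work stays proportional to the number of elements.
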
>
>  > What I fail to see is why it's better to raise an exception and
>  > point users to a better way, than to simply provide an optimization
>  > so that it's a moot issue.
>
> Because inefficient sum() is an attractive nuisance, easy to overlook,
> and likely to bite users other than the author.
>
>  > The only justification offered here is that it will teach people that summing
>  > strings (and some other objects?)
>
> Summing tuples works (with appropriate start=tuple()).  Haven't
> benchmarked, but I bet that's O(N^2).
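
For reference, the tuple case looks like this. The sketch shows only that
it works, plus one linear alternative; it is not a benchmark. The start
value is passed positionally, since older versions of sum() do not accept
it as a keyword.

```python
from itertools import chain

pairs = [(1, 2), (3,), (4, 5, 6)]

# Works, but every '+' builds a new tuple and copies what has been
# accumulated so far.
by_sum = sum(pairs, ())

# Alternative that flattens once and builds a single tuple.
by_chain = tuple(chain.from_iterable(pairs))
```
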
>
>  > is order(N^2) and a bad idea. But:
>  >
>  > a) Python's primary purpose is practical, not pedagogical (not that it
>  > isn't great for that)
>
> My argument is that in practical use sum() is a bad idea, period,
> until you book up on the types and applications where it *does* work.
> N.B. It doesn't even work properly for numbers (inaccurate for floats).
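
A small illustration of the float accuracy point. It uses an explicit
left-to-right loop rather than sum() itself, since what the builtin does
for floats may differ between Python versions.

```python
import math

values = [0.1] * 10

# Plain left-to-right addition accumulates rounding error.
naive = 0.0
for v in values:
    naive += v

# math.fsum tracks the lost low-order bits and rounds once at the end.
exact = math.fsum(values)
```

Here naive ends up slightly below 1.0 while exact is 1.0.
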
>
>  > b) I doubt any naive users learn anything other than "I can't use sum() for
>  > strings, I should use "".join()".
>
> For people who think that special-casing strings is a good idea, I
> think this is about as much benefit as you can expect.  Why go
> farther?<0.5 wink/>
>
>  > I submit that no naive user is going to get any closer to a proper
>  > understanding of algorithmic Order behavior from this small hint. Which
>  > leaves no reason to prefer an Exception to an optimization.
>
> TOOWTDI.  str.join is in pretty much every code base by now, and
> tutorials and FAQs recommending its use and severely deprecating sum
> for strings are legion.
>
>  > One other point: perhaps this will lead a naive user into thinking --
>  > "sum() raises an exception if I try to use it inefficiently, so it must be
>  > OK to use for anything that doesn't raise an exception" -- that would be a
>  > bad lesson to mis-learn....
>
> That assumes they know about the start argument.  I think most naive
> users will just try to sum a bunch of tuples, and get the "can't add
> 0, tuple" Exception and write a loop.  I suspect that many of the
> users who get the "use str.join" warning along with the Exception are
> unaware of the start argument, too.  They expect sum(iter_of_str) to
> magically add the strings.  Ie, when in 3.2 they got the
> uninformative "can't add 0, str" message, they did not immediately go
> "d'oh" and insert ", start=''" in the call to sum, they wrote a loop.
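
The two failure modes discussed above can be reproduced as follows. The
helper try_sum is made up for the sketch, and the exact wording of the
messages depends on the interpreter version, so none is shown.

```python
words = ["spam", "eggs"]


def try_sum(*args):
    """Return sum(*args), or the TypeError text if it is refused."""
    try:
        return sum(*args)
    except TypeError as exc:
        return "TypeError: {}".format(exc)


no_start = try_sum(words)        # default start of 0: int + str fails
with_start = try_sum(words, "")  # str start: refused, with a pointer to join
joined = "".join(words)          # the recommended spelling
```
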
>
>  > while we are at it, having the default sum() for floats be fsum()
>  > would be nice
>
> How do you propose to implement that, given math.fsum is perfectly
> happy to sum integers?  You can't just check one or a few leading
> elements for floatiness.  I think you have to dispatch on type(start),
> but then sum(iter_of_floats) DTWT.  So I would suggest changing the
> signature to sum(it, start=0.0).  This would probably be acceptable to
> most users with iterables of ints, but does imply some performance hit.
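
A sketch of dispatching on type(start), as suggested above. The function
name sum_by_start is hypothetical; it only shows the trade-off and is not a
proposal for how the builtin would be written.

```python
import math
from itertools import chain


def sum_by_start(iterable, start=0):
    """Hypothetical sum() that picks fsum when start is a float."""
    if isinstance(start, float):
        # Float start: use the accurate summation for everything.
        return math.fsum(chain([start], iterable))
    # Any other start: the generic repeated-'+' path.
    return sum(iterable, start)


floats = sum_by_start([0.1] * 10, 0.0)   # dispatched to fsum
ints = sum_by_start([1, 2, 3])           # default start=0: generic path
ints_as_float = sum_by_start([1, 2, 3], 0.0)
```

With the default start of 0, an iterable of floats still takes the generic
path, which is the "DTWT" case. Making 0.0 the default fixes that, at the
cost of integer sums coming back as floats.
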
>
>  > This does turn sum() into a function that does type-based dispatch,
>  > but isn't python full of those already? do something special for
>  > the types you know about, call the generic dunder method for the
>  > rest.
>
> AFAIK Python is moving in the opposite direction: if there's a common
> need for dispatching to type-specific implementations of a method,
> define a standard (not "generic") dunder for the purpose, and have the
> builtin (or operator, or whatever) look up (not "call") the
> appropriate instance in the usual way, then call it.  If there's a
> useful generic implementation, define an ABC to inherit from that
> provides that generic implementation.
>
>

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov