[Python-Dev] sum(...) limitation
chris.barker at noaa.gov
Thu Aug 14 02:10:34 CEST 2014
On Tue, Aug 12, 2014 at 11:21 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Redirecting to python-ideas, so trimming less than I might.
reasonable enough -- you are introducing some more significant ideas for
I've said all I have to say about this -- I don't seem to see anything
encouraging from core devs, so I guess that's it.
Thanks for the fun bike-shedding...
> Chris Barker writes:
> > On Mon, Aug 11, 2014 at 11:07 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> > > I'm referring to removing the unnecessary information that there's a
> > > better way to do it, and simply raising an error (as in Python 3.2,
> > > say) which is all a RealProgrammer[tm] should ever need!
> > >
> > I can't imagine anyone is suggesting that -- disallow it, but don't tell
> > anyone why?
> As I said, it's a regression. That's exactly the behavior in Python 3.2.
> > The only thing that is remotely on the table here is:
> > 1) remove the special case for strings -- buyer beware -- but consistent
> > and less "ugly"
> It's only consistent if you believe that Python has strict rules for
> use of various operators. It doesn't, except as far as they are
> constrained by precedence. For example, I have an application where I
> add bytestrings bytewise modulo N <= 256, and concatenate them. In
> fact I use function call syntax, but the obvious operator syntax is
> '+' for the bytewise addition, and '*' for the concatenation.
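
To illustrate the point that operator meaning is a per-type design choice, here is a minimal sketch of such a type. The class name ModBytes and its constructor are hypothetical; the application described above uses function-call syntax and its code is not shown in this thread.

```python
class ModBytes:
    """Hypothetical wrapper: '+' is bytewise addition mod n, '*' is concatenation."""

    def __init__(self, data, n=256):
        if not 1 <= n <= 256:
            raise ValueError("n must be in 1..256")
        self.data = bytes(data)
        self.n = n

    def __add__(self, other):
        # Bytewise addition modulo n; both operands must have equal length.
        if len(self.data) != len(other.data):
            raise ValueError("length mismatch")
        summed = ((a + b) % self.n for a, b in zip(self.data, other.data))
        return ModBytes(summed, self.n)

    def __mul__(self, other):
        # Concatenation, deliberately spelled '*' rather than '+'.
        return ModBytes(self.data + other.data, self.n)


a = ModBytes(b"\x01\xff")
b = ModBytes(b"\x02\x02")
added = (a + b).data          # bytewise sum modulo 256
concatenated = (a * b).data   # plain concatenation
```

With a type like this, sum() would perform bytewise addition rather than concatenation, so "sum() concatenates sequences" is not a rule that can be assumed for every type with __add__.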
> It's not in the Zen, but I believe in the maxim "If it's worth doing,
> it's worth doing well." So for me, 1) is out anyway.
> > 2) add a special case for strings that is fast and efficient -- may be
> > as simple as calling "".join() under the hood -- no more code than the
> > exception check.
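
A rough sketch of what option (2) could look like in pure Python follows. The function name sum_with_str_fast_path is hypothetical, and this is not how the builtin is implemented; it only shows that the string check can dispatch to "".join() instead of raising.

```python
def sum_with_str_fast_path(iterable, start=0):
    """Hypothetical variant of sum() that joins strings instead of raising."""
    if isinstance(start, str):
        # Fast path: a single linear-time join, assuming every item is a str.
        return start + "".join(iterable)
    # Everything else falls through to the builtin.
    return sum(iterable, start)


joined = sum_with_str_fast_path(["a", "b", "c"], "")
total = sum_with_str_fast_path([1, 2, 3])
```

Because the check is on start, a call without an explicit string start would still fail with the usual "int + str" TypeError.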
> Sure, but what about all the other immutable containers with __add__
> methods? What about mappings with key-wise __add__ methods whose
> values might be immutable but have __add__ methods? Where do you stop
> with the special-casing? I consider this far more complex and ugly
> than the simple "sum() is for numbers" rule (and even that is way too
> complex considering accuracy of summing floats).
> > And I doubt anyone really is pushing for anything but (2)
> I know that, but I think it's the wrong solution to the problem (which
> is genuine IMO). The right solution is something generic, possibly a
> __sum__ method. The question is whether that leads to too much work
> to be worth it (eg, "homogeneous_iterable").
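
One way to picture the __sum__ idea is sketched below. No such protocol exists in Python; generic_sum, the __sum__ hook, and the Text class are all hypothetical names used only to show the shape of the proposal.

```python
def generic_sum(iterable, start=0):
    """Hypothetical sum() that defers to an optional __sum__ hook on type(start)."""
    hook = getattr(type(start), "__sum__", None)
    if hook is not None:
        return hook(start, iterable)
    # Default definition for any type that only has __add__.
    total = start
    for item in iterable:
        total = total + item
    return total


class Text(str):
    """Hypothetical str subclass that opts in to a linear-time sum."""

    def __sum__(self, iterable):
        return Text(self + "".join(iterable))


joined = generic_sum([Text("a"), Text("b")], Text(""))
numbers = generic_sum([1, 2, 3])
lists = generic_sum([[1], [2]], [])
```

The fallback loop is the "default definition of __sum__" for types with __add__; it keeps the quadratic behavior for lists unless the type supplies its own hook.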
> > > Because obviously we'd want the attractive nuisance of "if you
> > > have __add__, there's a default definition of __sum__"
> > now I'm confused -- isn't that exactly what we have now?
> Yes and my feeling (backed up by arguments that I admit may persuade
> nobody but myself) is that what we have now kinda sucks[tm]. It
> seemed like a good idea when I first saw it, but then, my apps don't
> scale to where the pain starts in my own usage.
> > > It's possible that Python could provide some kind of feature that
> > > would allow an optimized sum function for every type that has
> > > __add__, but I think this will take a lot of thinking.
> > does it need to be every type? As it is the common ones work fine
> > except for strings -- so if we add an optimized string sum() then we're
> > done.
> I didn't say provide an optimized sum(), I said provide a feature
> enabling people who want to optimize sum() to do so. So yes, it needs
> to be every type (the optional __sum__ method is a proof of concept,
> modulo it actually being implementable ;-).
> > > *Somebody* will do it (I don't think anybody is +1 on restricting
> > > sum() to a subset of types with __add__).
> > uhm, that's exactly what we have now
> Exactly. Who's arguing that the sum() we have now is a ticket to
> Paradise? I'm just saying that there's probably somebody out there
> negative enough on the current situation to come up with an answer
> that I think is general enough (and I suspect that python-dev
> consensus is that demanding, too).
> > sum() can be used for any type that has an __add__ defined.
> I'd like to see that be mutable types with __iadd__.
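
A minimal sketch of a sum based on in-place addition, assuming the start value can be shallow-copied. The name sum_inplace is hypothetical.

```python
import copy
import operator


def sum_inplace(iterable, start):
    """Hypothetical sum() for mutable types: accumulate with += on a copy."""
    total = copy.copy(start)  # avoid mutating the caller's start value
    for item in iterable:
        # operator.iadd(a, b) is equivalent to a += b, so __iadd__ is used
        # when the type defines it.
        total = operator.iadd(total, item)
    return total


start = []
flat = sum_inplace([[1, 2], [3], [4, 5]], start)
```

For lists this extends one accumulator in place instead of building a new list at every step.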
> > What I fail to see is why it's better to raise an exception and
> > point users to a better way, than to simply provide an optimization
> > so that it's a moot issue.
> Because inefficient sum() is an attractive nuisance, easy to overlook,
> and likely to bite users other than the author.
> > The only justification offered here is that will teach people that
> > strings (and some other objects?)
> Summing tuples works (with appropriate start=tuple()). Haven't
> benchmarked, but I bet that's O(N^2).
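
For reference, a small example of summing tuples, next to a flatten-once alternative using itertools.chain. No timings are shown here; the comment about cost follows from each '+' building a new tuple.

```python
from itertools import chain

tuples = [(1, 2), (3,), (4, 5)]

# Works, but every '+' copies the accumulated tuple into a new one,
# so the total copying grows quadratically with the number of items.
via_sum = sum(tuples, ())

# Alternative: flatten once, then build a single tuple.
via_chain = tuple(chain.from_iterable(tuples))
```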
> > is order(N^2) and a bad idea. But:
> > a) Python's primary purpose is practical, not pedagogical (not that it
> > isn't great for that)
> My argument is that in practical use sum() is a bad idea, period,
> until you book up on the types and applications where it *does* work.
> N.B. It doesn't even work properly for numbers (inaccurate for floats).
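
The float inaccuracy is easy to reproduce. The sketch below uses an explicit left-to-right loop rather than the builtin, since some newer CPython releases compensate for rounding inside sum() itself.

```python
import math

values = [0.1] * 10

# Plain left-to-right accumulation, as a simple summing loop would do.
naive = 0.0
for v in values:
    naive += v

# math.fsum tracks intermediate partial sums to avoid losing precision.
exact = math.fsum(values)

print(naive == 1.0)  # False
print(exact == 1.0)  # True
```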
> > b) I doubt any naive users learn anything other than "I can't use sum()
> > on strings, I should use "".join()".
> For people who think that special-casing strings is a good idea, I
> think this is about as much benefit as you can expect. Why go
> farther?<0.5 wink/>
> > I submit that no naive user is going to get any closer to a proper
> > understanding of algorithmic Order behavior from this small hint. Which
> > leaves no reason to prefer an Exception to an optimization.
> TOOWTDI. str.join is in pretty much every code base by now, and
> tutorials and FAQs recommending its use and severely deprecating sum
> for strings are legion.
> > One other point: perhaps this will lead a naive user into thinking --
> > "sum() raises an exception if I try to use it inefficiently, so it must
> > be OK to use for anything that doesn't raise an exception" -- that would
> > be a bad lesson to mis-learn....
> That assumes they know about the start argument. I think most naive
> users will just try to sum a bunch of tuples, and get the "can't add
> 0, tuple" Exception and write a loop. I suspect that many of the
> users who get the "use str.join" warning along with the Exception are
> unaware of the start argument, too. They expect sum(iter_of_str) to
> magically add the strings. Ie, when in 3.2 they got the
> uninformative "can't add 0, str" message, they did not immediately go
> "d'oh" and insert ", start=''" in the call to sum, they wrote a loop.
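
The two failure modes discussed above can be seen side by side. The exact message text varies between Python versions, so the sketch captures the messages rather than quoting them.

```python
words = ["a", "b", "c"]

try:
    sum(words)        # implicit start=0, so the first step is 0 + "a"
except TypeError as exc:
    first_error = str(exc)

try:
    sum(words, "")    # an explicit str start triggers the special-case check
except TypeError as exc:
    second_error = str(exc)

joined = "".join(words)  # the recommended spelling

print(first_error)
print(second_error)
```

Only the second call produces the message that points at str.join; the first is the generic "can't add int and str" error.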
> > while we are at it, having the default sum() for floats be fsum()
> > would be nice
> How do you propose to implement that, given math.fsum is perfectly
> happy to sum integers? You can't just check one or a few leading
> elements for floatiness. I think you have to dispatch on type(start),
> but then sum(iter_of_floats) DTWT. So I would suggest changing the
> signature to sum(it, start=0.0). This would probably be acceptable to
> most users with iterables of ints, but does imply some performance hit.
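
A sketch of the dispatch-on-type(start) idea with a float default, written as a wrapper. The name sum_float_default is hypothetical and this is not a proposal for the actual signature of the builtin.

```python
import math
from itertools import chain


def sum_float_default(iterable, start=0.0):
    """Hypothetical sum() with a float default start, dispatching on type(start)."""
    if isinstance(start, float):
        # Accurate path; math.fsum also accepts ints but always returns a float.
        return math.fsum(chain([start], iterable))
    return sum(iterable, start)


floats = sum_float_default([0.1] * 10)   # fsum path
ints_default = sum_float_default([1, 2, 3])      # fsum path, float result
ints_explicit = sum_float_default([1, 2, 3], 0)  # builtin path, int result
```

The visible cost for iterables of ints is that the default call returns a float (6.0) unless the caller passes start=0 explicitly.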
> > This does turn sum() into a function that does type-based dispatch,
> > but isn't python full of those already? do something special for
> > the types you know about, call the generic dunder method for the
> > rest.
> AFAIK Python is moving in the opposite direction: if there's a common
> need for dispatching to type-specific implementations of a method,
> define a standard (not "generic") dunder for the purpose, and have the
> builtin (or operator, or whatever) look up (not "call") the
> appropriate instance in the usual way, then call it. If there's a
> useful generic implementation, define an ABC to inherit from that
> provides that generic implementation.
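
The pattern described above (a standard dunder looked up on the type, plus an ABC that supplies a generic implementation) might look roughly like this. Summable, __sum__, Vec and dunder_sum are hypothetical names; nothing like them exists in the standard library.

```python
from abc import ABC
from functools import reduce
import operator


class Summable(ABC):
    """Hypothetical ABC supplying a generic __sum__ built on '+'."""

    def __sum__(self, iterable):
        # Generic implementation: repeated __add__, starting from self.
        return reduce(operator.add, iterable, self)


class Vec(Summable):
    """Hypothetical 2-D vector that inherits the generic __sum__."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        return Vec(self.x + other.x, self.y + other.y)


def dunder_sum(iterable, start):
    # Look the method up on the type, as for other special methods, then call it.
    return type(start).__sum__(start, iterable)


result = dunder_sum([Vec(1, 2), Vec(3, 4)], Vec(0, 0))
```

A type with a faster strategy would override __sum__; the builtin itself would contain no per-type special cases.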
Christopher Barker, Ph.D.
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov