<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Tue, Aug 12, 2014 at 11:21 PM, Stephen J. Turnbull <span dir="ltr"><<a href="mailto:stephen@xemacs.org" target="_blank">stephen@xemacs.org</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Redirecting to python-ideas, so trimming less than I might.</blockquote><div><br></div><div>reasonable enough -- you are introducing some more significant ideas for changes.</div>
<div> </div><div>I've said all I have to say about this -- I don't seem to see anything encouraging from core devs, so I guess that's it.</div><div><br></div><div>Thanks for the fun bike-shedding...</div><div>
<br></div><div>-Chris</div><div><br></div><div><br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="">
Chris Barker writes:<br>
> On Mon, Aug 11, 2014 at 11:07 PM, Stephen J. Turnbull <<a href="mailto:stephen@xemacs.org">stephen@xemacs.org</a>><br>
> wrote:<br>
><br>
> > I'm referring to removing the unnecessary information that there's a<br>
> > better way to do it, and simply raising an error (as in Python 3.2,<br>
> > say) which is all a RealProgrammer[tm] should ever need!<br>
> ><br>
><br>
> I can't imagine anyone is suggesting that -- disallow it, but don't tell<br>
> anyone why?<br>
<br>
</div>As I said, it's a regression. That's exactly the behavior in Python 3.2.<br>
<div class=""><br>
> The only thing that is remotely on the table here is:<br>
><br>
> 1) remove the special case for strings -- buyer beware -- but consistent<br>
> and less "ugly"<br>
<br>
</div>It's only consistent if you believe that Python has strict rules for<br>
use of various operators. It doesn't, except as far as they are<br>
constrained by precedence. For example, I have an application where I<br>
add bytestrings bytewise modulo N <= 256, and concatenate them. In<br>
fact I use function call syntax, but the obvious operator syntax is<br>
'+' for the bytewise addition, and '*' for the concatenation.<br>
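To make the operator choice above concrete, here is a minimal sketch. The ModBytes class, its attribute names, and the equal-length assumption are all hypothetical; the application described above uses function call syntax, and its real code is not shown in this thread.

```python
class ModBytes:
    """Hypothetical wrapper: '+' adds bytewise modulo n, '*' concatenates."""

    def __init__(self, data, n=256):
        self.data = bytes(data)
        self.n = n  # modulus, assumed to be at most 256

    def __add__(self, other):
        # Bytewise addition modulo n; operands are assumed to have equal length.
        summed = ((a + b) % self.n for a, b in zip(self.data, other.data))
        return ModBytes(summed, self.n)

    def __mul__(self, other):
        # Concatenation, spelled '*' because '+' is already taken.
        return ModBytes(self.data + other.data, self.n)


x = ModBytes(b"\x01\xff")
y = ModBytes(b"\x02\x02")
print((x + y).data)  # b'\x03\x01'
print((x * y).data)  # b'\x01\xff\x02\x02'
```

With a type like this, sum() over ModBytes values would mean modular addition, so "sum means concatenate for sequences" is not a rule the language enforces.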
<br>
It's not in the Zen, but I believe in the maxim "If it's worth doing,<br>
it's worth doing well." So for me, 1) is out anyway.<br>
<div class=""><br>
> 2) add a special case for strings that is fast and efficient -- may be as<br>
> simple as calling "".join() under the hood --no more code than the<br>
> exception check.<br>
<br>
</div>Sure, but what about all the other immutable containers with __add__<br>
methods? What about mappings with key-wise __add__ methods whose<br>
values might be immutable but have __add__ methods? Where do you stop<br>
with the special-casing? I consider this far more complex and ugly<br>
than the simple "sum() is for numbers" rule (and even that is way too<br>
complex considering accuracy of summing floats).<br>
<div class=""><br>
> And I doubt anyone really is pushing for anything but (2)<br>
<br>
</div>I know that, but I think it's the wrong solution to the problem (which<br>
is genuine IMO). The right solution is something generic, possibly a<br>
__sum__ method. The question is whether that leads to too much work<br>
to be worth it (eg, "homogeneous_iterable").<br>
<div class=""><br>
> > Because obviously we'd want the attractive nuisance of "if you<br>
> > have __add__, there's a default definition of __sum__"<br>
><br>
> now I'm confused -- isn't that exactly what we have now?<br>
<br>
</div>Yes and my feeling (backed up by arguments that I admit may persuade<br>
nobody but myself) is that what we have now kinda sucks[tm]. It<br>
seemed like a good idea when I first saw it, but then, my apps don't<br>
scale to where the pain starts in my own usage.<br>
<div class=""><br>
> > It's possible that Python could provide some kind of feature that<br>
> > would allow an optimized sum function for every type that has<br>
> > __add__, but I think this will take a lot of thinking.<br>
><br>
> does it need to be every type? As it is the common ones work fine already<br>
> except for strings -- so if we add an optimized string sum() then we're<br>
> done.<br>
<br>
</div>I didn't say provide an optimized sum(), I said provide a feature<br>
enabling people who want to optimize sum() to do so. So yes, it needs<br>
to be every type (the optional __sum__ method is a proof of concept,<br>
modulo it actually being implementable ;-).<br>
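One way the optional __sum__ idea might look is sketched below. This is not an existing Python feature: generic_sum, the __sum__ hook, its signature, and the Text class are all hypothetical names invented for illustration.

```python
def generic_sum(iterable, start=0):
    """Hypothetical sum(): defer to type(start).__sum__ when the type has one."""
    hook = getattr(type(start), "__sum__", None)
    if hook is not None:
        return hook(start, iterable)
    # Fallback: the usual repeated __add__.
    total = start
    for item in iterable:
        total = total + item
    return total


class Text(str):
    """Hypothetical str subclass that opts in to an efficient sum."""

    def __sum__(self, iterable):
        # Linear-time concatenation instead of repeated '+'.
        return Text(self + "".join(iterable))


print(generic_sum([1, 2, 3]))             # 6
print(generic_sum(["a", "b"], Text("")))  # ab
```

The design point is that the builtin knows nothing about strings; any type can supply its own efficient implementation.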
<div class=""><br>
> > *Somebody* will do it (I don't think anybody is +1 on restricting<br>
> > sum() to a subset of types with __add__).<br>
><br>
> uhm, that's exactly what we have now<br>
<br>
</div>Exactly. Who's arguing that the sum() we have now is a ticket to<br>
Paradise? I'm just saying that there's probably somebody out there<br>
negative enough on the current situation to come up with an answer<br>
that I think is general enough (and I suspect that python-dev<br>
consensus is that demanding, too).<br>
<div class=""><br>
> sum() can be used for any type that has an __add__ defined.<br>
<br>
</div>I'd like to see that be mutable types with __iadd__.<br>
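A rough sketch of what summing via __iadd__ could mean for a mutable type, assuming a shallow copy of start is an acceptable way to protect the caller's object. The function name inplace_sum is hypothetical.

```python
import copy


def inplace_sum(iterable, start):
    """Hypothetical variant of sum() that accumulates with += on a mutable start."""
    total = copy.copy(start)  # shallow copy so the caller's start is not mutated
    for item in iterable:
        total += item         # uses __iadd__ when the type defines it
    return total


chunks = [[1, 2], [3], [4, 5]]
start = []
result = inplace_sum(chunks, start)
print(result)  # [1, 2, 3, 4, 5]
print(start)   # []
```

For lists, each += extends the accumulator in place, so the total work is proportional to the number of elements rather than quadratic.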
<div class=""><br>
> What I fail to see is why it's better to raise an exception and<br>
> point users to a better way, than to simply provide an optimization<br>
> so that it's a moot issue.<br>
<br>
</div>Because inefficient sum() is an attractive nuisance, easy to overlook,<br>
and likely to bite users other than the author.<br>
<div class=""><br>
> The only justification offered here is that will teach people that summing<br>
> strings (and some other objects?)<br>
<br>
</div>Summing tuples works (with appropriate start=tuple()). Haven't<br>
benchmarked, but I bet that's O(N^2).<br>
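For reference, a small sketch of the tuple case. The variable names are illustrative; itertools.chain is shown only as one common linear-time alternative, not as something proposed in this thread.

```python
from itertools import chain

pairs = [(1, 2), (3,), (4, 5)]

try:
    sum(pairs)  # default start is 0, so this attempts 0 + (1, 2)
    failed = False
except TypeError:
    failed = True

via_sum = sum(pairs, ())                       # builds a new, longer tuple at every step
via_chain = tuple(chain.from_iterable(pairs))  # copies each element once

print(failed)     # True
print(via_sum)    # (1, 2, 3, 4, 5)
print(via_chain)  # (1, 2, 3, 4, 5)
```

Because tuples are immutable, each intermediate total in the sum() version is a fresh copy of everything accumulated so far, which is where the quadratic behavior would come from.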
<div class=""><br>
> is order(N^2) and a bad idea. But:<br>
><br>
> a) Python's primary purpose is practical, not pedagogical (not that it<br>
> isn't great for that)<br>
<br>
</div>My argument is that in practical use sum() is a bad idea, period,<br>
until you book up on the types and applications where it *does* work.<br>
N.B. It doesn't even work properly for numbers (inaccurate for floats).<br>
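The float inaccuracy can be seen with plain left-to-right addition, written here as an explicit loop so the result does not depend on how a particular interpreter version implements the builtin sum() (newer CPython releases are understood to have improved its float accuracy).

```python
import math

values = [0.1] * 10

total = 0.0
for v in values:
    total += v  # naive left-to-right float addition

print(total == 1.0)              # False: rounding error has accumulated
print(math.fsum(values) == 1.0)  # True: fsum tracks the lost low-order bits
```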
<div class=""><br>
> b) I doubt any naive users learn anything other than "I can't use sum() for<br>
> strings, I should use "".join()".<br>
<br>
</div>For people who think that special-casing strings is a good idea, I<br>
think this is about as much benefit as you can expect. Why go<br>
farther?<0.5 wink/><br>
<div class=""><br>
> I submit that no naive user is going to get any closer to a proper<br>
> understanding of algorithmic Order behavior from this small hint. Which<br>
> leaves no reason to prefer an Exception to an optimization.<br>
<br>
</div>TOOWTDI. str.join is in pretty much every code base by now, and<br>
tutorials and FAQs recommending its use and severely deprecating sum<br>
for strings are legion.<br>
<div class=""><br>
> One other point: perhaps this will lead a naive user into thinking --<br>
> "sum() raises an exception if I try to use it inefficiently, so it must be<br>
> OK to use for anything that doesn't raise an exception" -- that would be a<br>
> bad lesson to mis-learn....<br>
<br>
</div>That assumes they know about the start argument. I think most naive<br>
users will just try to sum a bunch of tuples, and get the "can't add<br>
0, tuple" Exception and write a loop. I suspect that many of the<br>
users who get the "use str.join" warning along with the Exception are<br>
unaware of the start argument, too. They expect sum(iter_of_str) to<br>
magically add the strings. Ie, when in 3.2 they got the<br>
uninformative "can't add 0, str" message, they did not immediately go<br>
"d'oh" and insert ", start=''" in the call to sum, they wrote a loop.<br>
<div class=""><br>
> while we are at it, having the default sum() for floats be fsum()<br>
> would be nice<br>
<br>
</div>How do you propose to implement that, given math.fsum is perfectly<br>
happy to sum integers? You can't just check one or a few leading<br>
elements for floatiness. I think you have to dispatch on type(start),<br>
but then sum(iter_of_floats) DTWT. So I would suggest changing the<br>
signature to sum(it, start=0.0). This would probably be acceptable to<br>
most users with iterables of ints, but does imply some performance hit.<br>
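A minimal sketch of dispatching on type(start), as discussed above. The name sum_by_start is hypothetical, and this is only one possible reading of the suggestion, not a proposed implementation.

```python
import math
from itertools import chain


def sum_by_start(iterable, start=0):
    """Hypothetical: choose the algorithm from type(start), not from the items."""
    if isinstance(start, float):
        # Float start: use the accurate algorithm, folding start into the input.
        return math.fsum(chain([start], iterable))
    return sum(iterable, start)


as_float = math.fsum([1, 2, 3])        # ints are accepted; the result is a float
exact = sum_by_start([0.1] * 10, 0.0)  # float start selects fsum
ints = sum_by_start([1, 2, 3])         # default int start keeps the builtin path

print(as_float)  # 6.0
print(exact)     # 1.0
print(ints)      # 6
```

Note that with the default start of 0, an iterable of floats still takes the ordinary path, which is the "DTWT" problem described above; changing the default to 0.0 would route everything through the slower, float-returning path.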
<div class=""><br>
> This does turn sum() into a function that does type-based dispatch,<br>
> but isn't python full of those already? do something special for<br>
> the types you know about, call the generic dunder method for the<br>
> rest.<br>
<br>
</div>AFAIK Python is moving in the opposite direction: if there's a common<br>
need for dispatching to type-specific implementations of a method,<br>
define a standard (not "generic") dunder for the purpose, and have the<br>
builtin (or operator, or whatever) look up (not "call") the<br>
appropriate instance in the usual way, then call it. If there's a<br>
useful generic implementation, define an ABC to inherit from that<br>
provides that generic implementation.<br>
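An existing instance of this pattern is collections.abc.Sequence: len() looks up the standard __len__ dunder, and the ABC supplies generic implementations of other methods on top of __len__ and __getitem__. The Countdown class below is a hypothetical example type.

```python
from collections.abc import Sequence


class Countdown(Sequence):
    """Hypothetical sequence n-1, ..., 0 defining only the two required dunders."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, index):
        if index not in range(self.n):
            raise IndexError(index)
        return self.n - 1 - index


c = Countdown(3)
print(len(c))             # 3, via the type's own __len__
print(list(c))            # [2, 1, 0], via the ABC's generic __iter__
print(2 in c)             # True, via the ABC's generic __contains__
print(c.index(0))         # 2, via the ABC's generic index()
print(list(reversed(c)))  # [0, 1, 2], via the ABC's generic __reversed__
```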
<br>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><br>Christopher Barker, Ph.D.<br>Oceanographer<br><br>Emergency Response Division<br>NOAA/NOS/OR&R (206) 526-6959 voice<br>7600 Sand Point Way NE (206) 526-6329 fax<br>
Seattle, WA 98115 (206) 526-6317 main reception<br><br><a href="mailto:Chris.Barker@noaa.gov" target="_blank">Chris.Barker@noaa.gov</a>
</div></div>