On 12 Aug 2014 03:03, "Chris Barker - NOAA Federal" <chris.barker@noaa.gov> wrote:
>
> My confusion is still this:
>
> Repeated summation of strings has been optimized in cpython even
> though it's not the recommended way to solve that problem.
The quadratic behaviour of repeated str summation is a subtle, silent error. It *is* controversial that CPython silently optimises some cases of it away, since it can cause problems when porting affected code to other interpreters that don't use refcounting and thus have a harder time implementing such a trick.
It's considered worth the cost, since it dramatically improves the performance of common naive code in a way that doesn't alter the semantics.
> So why not special case optimize sum() for strings? We are already
> special-case strings to raise an exception.
>
> It seems pretty pedantic to say: we cod make this work well, but we'd
> rather chide you for not knowing the "proper" way to do it.
Yes, that's exactly what this is - a nudge towards the right way to concatenate strings without incurring quadratic behaviour. We *want* people to learn that distinction, not sweep it under the rug. That's the other reason the implicit optimisation is controversial - it hides an important difference in algorithmic complexity from users.
> Practicality beats purity?
Teaching users the difference between linear time operations and quadratic ones isn't about purity, it's about passing along a fundamental principle of algorithm scalability.
We do it specifically for strings because they *do* have an optimised algorithm available that we can point users towards, and concatenating multiple strings is common.
Other containers don't tend to be concatenated like that in the first place, so there's no such check pushing other iterables towards itertools.chain.
Regards,
Nick.
>
> -Chris
>
>
>
>
> > Although that's not the whole story: in
> > practice even numerical sums get split into multiple functions because
> > floating point addition isn't associative, and so needs careful
> > treatment to preserve accuracy. At that point I'm strongly +1 on
> > abandoning attempts to "rationalize" summation.
> >
> > I'm not sure how I'd feel about raising an exception if you try to sum
> > any iterable containing misbehaved types like float. But not only
> > would that be a Python 4 effort due to backward incompatibility, but
> > it sorta contradicts the main argument of proponents ("any type
> > implementing __add__ should be sum()-able").
> >
> > _______________________________________________
> > Python-Dev mailing list
> > Python-Dev@python.org
> > https://mail.python.org/mailman/listinfo/python-dev
> > Unsubscribe: https://mail.python.org/mailman/options/python-dev/chris.barker%40noaa.gov
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com