[Python-ideas] Add nullifier argument to functools.reduce?

Andrew Barnert abarnert at yahoo.com
Sun Aug 24 22:47:46 CEST 2014


I think it would be both simpler and more useful to define accumulate_with_attractors, and then define reduce_with_attractors as a wrapper around that.

Assuming you're doing a lot of this kind of stuff, you're either going to be using a library like more_itertools or pytoolz or your own custom library, so I'll assume you already have a "takewhileplus" that's like takewhile but also yields the first failing value, and an "ilast" that returns the last value in an iterable and raises a TypeError if empty (maybe with an optional default argument, but I'm not going to use it).

_sentinel = object()

def accumulate_with_attractors(iterable, func=operator.add,
                               stop_on=(), end_if=None):
    yield from takewhileplus(lambda x: x not in stop_on and not (end_if and end_if(x)),
                             itertools.accumulate(iterable, func))

def reduce_with_attractors(func, iterable, start=_sentinel,
                           stop_on=(), end_if=None):
    if start is not _sentinel:
        iterable = itertools.chain([start], iterable)
    return ilast(accumulate_with_attractors(iterable, func, stop_on, end_if))

You can easily optimize this in a few ways. You can use a takeuntilplus so the lambda doesn't have to negate its condition, you can use stop_on.__contains__ if end_if isn't passed, etc. With those optimizations, using C implementations of takeuntilplus and ilast, this takes about 40% as long as your version.

With no optimizations, using Python takewhileplus and last, it takes about 140% as long. But I think the added simplicity, the fewer edge cases to get wrong, and the fact that you get a usable accumulate_with_attractors out of it makes it a worthwhile tradeoff even at 40% slower.


On Sunday, August 24, 2014 10:27 AM, David Mertz <mertz at gnosis.cx> wrote:


>
>
>So here's a version of my function that addresses Steven's points, and also adds what I think is more useful behavior.  A couple points about my short--but more than 3-line function.
>
>* The notion of "attractor" here still follows the "we're all adults" philosophy. That is, the 'stop_on' and 'end_if' arguments may very well not actually specify genuine attractors, but simply "satisfaction" values.  In the examples below these are completely unlike attractors, but this could be useful also for something like numeric approximation algorithms where some non-final reduction value is considered "good enough" anyway.
>
>* To be more like the reduce() builtin, I raise TypeError on an empty iterable with no 'start' value.
>
>* Although it's not as cute in using itertools as much, but simply a loop-with-test, I got rid of the "golf" artifact of potentially allocating a large, useless list.
>
>* Most interesting, I think, I separated a 'stop_on' collection of values from a 'end_if' predicate.  This lets us generalize attractors, I believe.  For a simple cyclic attractor, you could list several value in the 'stop_on' collection.  But even for a strange attractor (or simply a complex one, or e.g. an attractor to a continuous set of values--think an attractor to a circular, real-valued orbit), the predicate could, in principle, capture the fact we reached the attractor.
>
>#################################################
>% cat reduce_with_attractor.py
>#!/usr/bin/env python3
>from itertools import *
>from operator import add
>
>def reduce_with_attractors(func, it, start=None, stop_on=(), end_if=None):
>    it = iter(it)
>    try:
>        start = start if start!=None else next(it)
>    except StopIteration:
>        raise TypeError
>    for x in accumulate(chain([start],it), func):
>        if x in stop_on or (end_if and end_if(x)):
>            break
>    return x
>
>print(reduce_with_attractors(add, range(100), stop_on=(2, 6, 17)))
>print(reduce_with_attractors(add, range(100), stop_on=('foo','bar')))
>print(reduce_with_attractors(add, range(100), end_if=lambda x: x>100))
>#################################################
>
>
>On Sat, Aug 23, 2014 at 7:19 PM, David Mertz <mertz at gnosis.cx> wrote:
>>
>> On Sat, Aug 23, 2014 at 6:53 PM, Steven D'Aprano <steve at pearwood.info> wrote:
>>>
>>> On Sat, Aug 23, 2014 at 11:43:42AM -0700, David Mertz wrote:
>>>
>>> >   def reduce_with_attractor(func, it, start=None, end_if=None):
>>> >       it = iter(it)
>>> >       start = start if start!=None else it.__next__()
>>> >       return list(takewhile(lambda x: x!=end_if,
>>> >                             accumulate(chain([start],it), func)))[-1]
>>>
>>> A couple of points:
>>>
>>> - Don't use it.__next__(), use the built-in next(it).
>>
>>
>> Yeah, good point.
>>  
>>>
>>> - To match the behaviour of reduce, you need to catch the StopIteration
>>> and raise TypeError.
>>
>>
>> Oh, OK.  Hadn't thought of that.
>>  
>>>
>>> - Your implementation eagerly generates a potentially enormous list of
>>> intermediate results, which strikes me as somewhat unfortunate for a
>>> functional tool like reduce. In other words, for some inputs, this is
>>> going to perform like a dog, generating a HUGE list up front, then
>>> throwing it all away except for the final value.
>>
>>
>> I know.  I realized this flaw right away.  I was trying to be cute and fit it in my promised 3 lines.  It would be better to put it in a loop to realize the successive values, of course--but would take an extra line or two.  Maybe there's a way to squeeze it in one line with itertools rather than a regular loop though.
>>  
>>>
>>> > This gives you the accumulation up-to-but-not-including the attractor.  I
>>> > guess the OP wanted to return the attractor itself (although that seems
>>> > slightly less useful to me).
>>>
>>> Not really. His use-case seems to be to short-cut a lot of unnecessary
>>> calculations, e.g. suppose you write product() as reduce(operator.mul,
>>> iterable). In the event that the product reaches zero, you can[1]
>>> short-circuit the rest of the iterable and just return 0:
>>>
>>> product([1, 2, 3, 0] + [4]*1000000)
>>>
>>> ought to reduce 0, not 6, and the intent is for it to do so *quickly*,
>>> ignoring the 4s at the end of the list.
>>
>>
>> I guess that's true.  Although I can certainly imagine being interested not only in the final attractor, but *that* it reached an attractor and ended early.  Not sure how best to signal that.  Actually, now that I think of it, it would be kinda nice to make the function 'reduce_with_attractorS()' instead, and allow specification of multiple attractors.
>>
>> I welcome your improved version of the code :-).  Feel free to take a whole 10 lines to do it right.
>>  
>>>
>>> [1] Actually you can't. 0 is no longer an attractor in the presence of
>>> INF or NAN.
>>
>>
>> I was sort of thinking of a "we're all adults here" attitude.  That is, the "attractor" might not really be a genuine attractor, but we still trust the caller to say it is.  I.e. my function would accept this call:
>>
>>     reduce_with_attractor(operator.mul, range(1,1e6), end_if=6))
>>
>> I'm making the claim that reaching '6' is a stopping point... which, well it is.  No, it's not an actual attractor, but maybe a caller really does want to stop iterating if it gets to that value anyway.  Hence 'end_if' is actually an accurate name.
>>
>> --
>> Keeping medicines from the bloodstreams of the sick; food
>> from the bellies of the hungry; books from the hands of the
>> uneducated; technology from the underdeveloped; and putting
>> advocates of freedom in prisons.  Intellectual property is
>> to the 21st century what the slave trade was to the 16th.
>
>
>
>
>
>--
>Keeping medicines from the bloodstreams of the sick; food
>from the bellies of the hungry; books from the hands of the
>uneducated; technology from the underdeveloped; and putting
>advocates of freedom in prisons.  Intellectual property is
>to the 21st century what the slave trade was to the 16th.
>
>_______________________________________________
>Python-ideas mailing list
>Python-ideas at python.org
>https://mail.python.org/mailman/listinfo/python-ideas
>Code of Conduct: http://python.org/psf/codeofconduct/
>
>


More information about the Python-ideas mailing list