[Python-ideas] Way to repeat other than "for _ in range(x)"

Nick Coghlan ncoghlan at gmail.com
Fri Mar 31 01:12:20 EDT 2017


On 31 March 2017 at 03:06, Markus Meskanen <markusmeskanen at gmail.com> wrote:
> And like I said before, for loop is just another way of doing while loop,
> yet nobody's complaining. There's nothing wrong with having two different
> ways of doing the same thing, as long as one of them is never the better
> way. If we add `repeat`, there's never a reason to use `for _ in range`
> anymore.
>
> What comes to your custom class solution, it's uglier, harder to follow, and
> way slower than just doing:
>
> d = [[0]*5 for _ in range(10)]
>
> While the proposed method would be faster, shorter, and cleaner.

Well, no, as regularly doing this suggests someone is attempting to
write C-in-Python rather than writing Python-in-Python.

While C is certainly Python's heritage (especially back in the days
before the iterator protocol, when "for i in range(len(container)):"
was still a recommended idiom), writing Python code using C idioms
isn't even close to being the recommended way of doing things today.

So when you say "I use the 'expr for __ in range(count)' pattern a
lot", we hear "I don't typically exploit first class functions and the
iterator protocol to their full power".

And that's fine as far as it goes - 'expr for __ in range(count)' is
perfectly acceptable code, and there's nothing wrong with it.

However, what it *doesn't* provide is adequate justification for
adding an entirely new construct to the language - given other
iterator protocol and first class function based tools like
enumerate(), itertools.repeat(), zip(), map(), etc, we don't want to
add a new non-composable form of iteration purely for the "repeat this
operation a known number of times" case.

To elaborate on that point, note that any comprehension can always be
reformulated as an iteration over a sequence of callables, in this
case:

    init = ([0]*5).copy
    d = [init() for init in (init,)*10]

(This is actually ~20% faster on my machine than the original version
with the dummy variable, since it moves the sequence repetition step
outside the loop and hence only does it once rather than 10 times)
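
A rough way to check a claim like this on your own machine is sketched below. The function names and the iteration count are illustrative assumptions, not part of the original measurement, and wrapping each version in a function adds call overhead, so the ratio you see may well differ from the figure quoted above.

```python
import timeit

def with_dummy_variable():
    # Original version: builds a fresh [0]*5 on every iteration.
    return [[0] * 5 for _ in range(10)]

def with_bound_copy():
    # Reformulated version: build [0]*5 once, then call its bound
    # copy() method ten times.
    init = ([0] * 5).copy
    return [init() for init in (init,) * 10]

for fn in (with_dummy_variable, with_bound_copy):
    # number=20_000 is an arbitrary choice to keep the run short.
    elapsed = timeit.timeit(fn, number=20_000)
    print(f"{fn.__name__}: {elapsed:.4f}s")
```

Both functions return equal results made of ten separate list objects; only the cost of producing each inner list differs.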

And that can be factored out into a "repeat_call" helper function,
with itertools.repeat making it easy to avoid actually creating a
tuple:

    from itertools import repeat

    def repeat_call(callable, n):
        for c in repeat(callable, n):
            yield c()

(You can also avoid the itertools dependency by using the dummy
variable formulation inside "repeat_call" without the difference being
visible to external code)

At that point, regardless of the internal implementation details of
`repeat_call`, the original example would just look like:

    d = list(repeat_call(([0]*5).copy, 10))

To say "give me a list containing 10 distinct lists, each containing 5 zeroes".
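
The word "distinct" matters here. As a small illustration (the names `distinct` and `aliased` are made up for this sketch), compare the result with plain sequence repetition, which stores ten references to the same inner list:

```python
from itertools import repeat

def repeat_call(func, n):
    """Yield the result of calling func() n times."""
    for f in repeat(func, n):
        yield f()

distinct = list(repeat_call(([0] * 5).copy, 10))
aliased = [[0] * 5] * 10  # ten references to one and the same list

distinct[0][0] = 1
aliased[0][0] = 1

print(distinct[1][0])  # 0: the other rows are unaffected
print(aliased[1][0])   # 1: every "row" is the same object
```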

So *if* we were to add anything to the language here, it would be to
add `itertools.repeat_call` as a new iteration primitive, since it
isn't entirely straightforward to construct that operation out of the
existing primitives, with itertools.starmap coming closest:

    from itertools import repeat, starmap

    def repeat_call(callable, n):
        yield from starmap(callable, repeat((), n))

But the explicit for loop being clearest:

    def repeat_call(callable, n):
        for __ in range(n):
            yield callable()
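
As a quick sanity check that the three formulations in this post behave the same way, the sketch below runs each of them against a stateful callable. The suffixed function names and the use of `itertools.count` are assumptions made for the example only.

```python
from itertools import count, repeat, starmap

def repeat_call_repeat(func, n):
    # Formulation based on itertools.repeat.
    for f in repeat(func, n):
        yield f()

def repeat_call_starmap(func, n):
    # Formulation based on itertools.starmap: func is called with
    # an empty argument tuple n times.
    yield from starmap(func, repeat((), n))

def repeat_call_loop(func, n):
    # Formulation based on an explicit for loop.
    for _ in range(n):
        yield func()

results = {}
for impl in (repeat_call_repeat, repeat_call_starmap, repeat_call_loop):
    counter = count()
    # counter.__next__ returns 0, 1, 2, ... on successive calls, so
    # the output shows that func really is called once per item.
    results[impl.__name__] = list(impl(counter.__next__, 4))

print(results)
```

All three are generators, so nothing is called until the result is iterated.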

Cheers,
Nick.

P.S. The common problems shared by all of the `repeat_call`
formulations in this post are that they don't set __length_hint__
appropriately, and hence lose efficiency when using them to build
containers, and also they don't have nice representations the way
other itertools objects do.
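
A minimal sketch of what addressing both problems might look like is given below, assuming a hand-written iterator class. `RepeatCall` is a hypothetical name, not an existing itertools API, and the repr format is just one plausible choice.

```python
from operator import length_hint

class RepeatCall:
    """Hypothetical iterator that calls func() a fixed number of times."""

    def __init__(self, func, n):
        self._func = func
        self._remaining = n

    def __iter__(self):
        return self

    def __next__(self):
        if self._remaining <= 0:
            raise StopIteration
        self._remaining -= 1
        return self._func()

    def __length_hint__(self):
        # Lets consumers such as list() presize their storage.
        return max(self._remaining, 0)

    def __repr__(self):
        return f"{type(self).__name__}({self._func!r}, {self._remaining})"

it = RepeatCall(list, 3)
print(repr(it))         # RepeatCall(<class 'list'>, 3)
print(length_hint(it))  # 3
print(list(it))         # [[], [], []]
```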

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

