[Python-ideas] Loop manager syntax

Andrew Barnert abarnert at yahoo.com
Tue Jul 28 19:02:09 CEST 2015


On Jul 28, 2015, at 15:28, Todd <toddrjen at gmail.com> wrote:
> 
> Following the discussion of the new "async" keyword, I think it would be useful to provide a generic way to alter the behavior of loops.  My idea is to allow a user to take control over the operation of a "for" or "while" loop.  
> 
> The basic idea is similar to context managers, where an object implementing certain magic methods, probably "__for__" and "__while__", could be placed in front of a "for" or "while" statement, respectively.  This class would then be put in charge of carrying out the loop.  Due to the similarity to context managers, I am tentatively calling this a "loop manager".
> 
> What originally prompted this idea was parallelization.  For example the "multiprocessing.Pool" class could act as a "for" loop manager, allowing you to do something like this:
> 
>  >>> from multiprocessing import Pool
>  >>>
>  >>> Pool() for x in range(20):
>  ...    do_something
>  ...
>  >>>
> 
> The body of the "for" loop would then be run in parallel.

First, this code would create a Pool, use it, and leak it. And yes, sure, you could wrap this all in a with statement, but then the apparent niceness that seems to motivate the idea disappears.
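
For comparison, here's roughly how the motivating example is already spelled today, with the pool's lifetime handled by a with statement (treating do_something as a picklable function, since that's what Pool requires):

    from multiprocessing import Pool

    def do_something(x):
        return x * x

    if __name__ == '__main__':
        # the pool is created, used, and cleaned up explicitly, and the
        # "loop body" has to be a function that can be pickled and sent to
        # the worker processes, not an arbitrary inline suite
        with Pool() as pool:
            results = pool.map(do_something, range(20))
        print(results)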

Second, what does the Pool.__for__ method get called with here? There's an iterable, a variable name that it has to somehow assign to in the calling function's scope, and some code (in what form exactly?) that it has to execute in that calling function's scope.

You could do something like this for the most trivial __for__ method:

    def __for__(self, scope: ScopeType, iterable: Iterable, name: str, code: CodeType):
        for x in iterable:
            scope.assign(name, x)
            try:
                exec(code, scope)
            except LoopBreak:
                break
            except LoopContinue:
                continue
            except LoopYield as y:
                ...  # somehow make the calling function yield?!
            except LoopReturn as r:
                ...  # somehow make the calling function return?!

It would take a nontrivial change to the compiler to compile the body of the loop into a separate code object, but with assignments still counted in the outer scope's locals list, yield expressions still making the outer function into a generator function, etc. You'd need to invent this new scope object type (just passing local, nonlocal, global dicts won't work because you can have assignments inside a loop body). Making yield, yield from, and return act on the calling function is bad enough, but for the first two, you need some way to also resume into the loop code later. 
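
Here's a minimal demonstration (plain CPython, no new machinery) of why just handing exec a dict isn't enough -- the assignment lands in the dict, not back in the function's own locals:

    def demo():
        x = 1
        # exec() only sees a snapshot of the local namespace; the assignment
        # goes into the dict, not into the function's fast-local slot for x
        ns = dict(locals())
        exec("x = 2", globals(), ns)
        print(x)         # still 1
        print(ns["x"])   # 2, stranded in the dict

    demo()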

If you designed a full "degenerate function" mechanism that solved all of these problems, I think that would be more useful than this proposal; people have tried to come up with ways of doing that -- to get continuations for various custom-control-flow-without-macros purposes -- and it doesn't seem like an easy problem.

But that still doesn't get you anywhere near what you need for this proposal, because your motivating example is trying to run the code in parallel. What exactly happens when one iteration does a break and 7 others are running at the same time? Do you change the semantics of break so it breaks "within a few iterations", or add a way to cancel existing iterations and roll back any changes they'd made to the scope, or...? And return and yield seem even more problematic here.
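
As an analogy with today's tools (not the proposed syntax), here's what "break" degenerates to with a real process pool -- at best it means "stop scheduling more work", while anything already running finishes anyway:

    from concurrent.futures import ProcessPoolExecutor, as_completed
    import time

    def work(x):
        time.sleep(0.1)
        return x

    if __name__ == '__main__':
        with ProcessPoolExecutor(max_workers=4) as ex:
            futures = [ex.submit(work, x) for x in range(20)]
            for f in as_completed(futures):
                if f.result() == 3:
                    # the "break": cancel() only stops tasks that haven't
                    # started yet; running iterations run to completion
                    for g in futures:
                        g.cancel()
                    break
        finished = [f for f in futures if f.done() and not f.cancelled()]
        print(len(finished), "iterations completed despite the break")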

And, beyond the problems with concurrency, you have cross-process problems. For example, how do you pickle a live scope in one interpreter, pass it to another interpreter, and have assignments made there show up back in the first interpreter's scope?
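
The closest thing CPython has to a "live scope" is a frame object, and frames don't pickle at all:

    import pickle
    import sys

    def demo():
        x = 1
        frame = sys._getframe()   # the nearest thing to a live scope
        try:
            pickle.dumps(frame)
        except TypeError as e:
            print(e)              # frames aren't picklable

    demo()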

And that may not be all the problems you'd need to solve to turn this into a real proposal.

> However, there are other uses as well.  For example, Python has no "do...while" structure, because nobody has come up with a clean way to do it (and probably nobody ever will).  However, under this proposal it would be possible for a third-party package to implement a "while" loop manager that can provide this functionality:

If someone can come up with a clean way to write this do object (even ignoring the fact that it appears to be a weird singleton global object--unless, contrary to other protocols, this one allows you to define the magic methods as @classmethods and then knows how to call them appropriately), why hasn't anyone come up with a clean way of writing a do...while structure? How would it be easier this way?
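
For reference, the conventional spelling today is the loop-and-a-half, which is exactly what the proposal is trying to prettify:

    # the usual do...while workaround: run the body, then test at the bottom
    x = 10
    while True:
        x += 1
        if not x < 20:
            break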

>  >>> from blah import do
>  >>>
>  >>> x = 10
>  >>> do while x < 20:
>  ...    x += 1
>  ...
>  >>>
> 
> The "do" class would just defer running the conditional until after executing the body of the "while" loop once.
> 
> Another possible use-case would be to alter how the loop interacts with the surrounding namespace.  It would be possible to limit the loop so only particular variables become part of the local namespace after the loop is finished, or just prevent the index from being preserved after a "for" loop is finished.

Just designing the scope object that would give you a way to do this sounds like a big enough proposal on its own.

Maybe you could do this in CPython by exposing the LocalsToFast and FastToLocals methods on frame objects, adding a frame constructor, and then wrapping that up in something (in pure Python) that has a nicer API for the purpose and disguises the fact that you're actually passing around interpreter frames. You might even be able to pull off a test implementation without hacking the interpreter by using ctypes.pythonapi?
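
The ctypes half of that is at least a known trick; here's a rough sketch of just the "drop the loop index afterward" case (CPython-only, and it leans on frame internals that could change in any release):

    import ctypes
    import sys

    def drop_index_after_loop():
        for i in range(3):
            pass
        frame = sys._getframe()
        # edit the f_locals snapshot, then ask the interpreter to copy it
        # back into the frame's fast locals (the 1 means "clear removed
        # names")
        del frame.f_locals['i']
        ctypes.pythonapi.PyFrame_LocalsToFast(ctypes.py_object(frame),
                                              ctypes.c_int(1))
        print('i' in locals())   # False: the loop index no longer leaks

    drop_index_after_loop()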

> I think, like context managers, this would provide a great deal of flexibility to the language and allow a lot of useful behaviors.  Of course the syntax and details are just strawmen examples at this point, there may be much better syntaxes.  But I think the basic idea of being able to control a loop in a manner like this is important.

The major difference between this proposal and context managers is that you want the loop manager to be able to drive the execution of its suite, while a context manager can't do that; it just has __enter__ and __exit__ methods that get called before and after the suite is executed normally. That's how it avoids all of the problems here. Of course it still required a change to the interpreter so that the __exit__ method can get called as part of exception handling, but it's easy to see how you could have implemented it as a source transformation into a try/finally, in which case it wouldn't have needed any new interpreter functionality at all.
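
For reference, that transformation is essentially how PEP 343 specifies the with statement; here's a condensed, runnable version of it, using a file as the context manager:

    import sys

    # roughly the expansion of:
    #     with open('example.txt', 'w') as f:
    #         f.write('hello')
    # the suite still runs inline, in the caller's own scope; __enter__ and
    # __exit__ merely bracket it, so the with statement never needed a way
    # to hand the suite over to the context manager
    mgr = open('example.txt', 'w')
    exit_ = type(mgr).__exit__
    value = type(mgr).__enter__(mgr)
    exc = True
    try:
        try:
            f = value
            f.write('hello')
        except:
            exc = False
            if not exit_(mgr, *sys.exc_info()):
                raise
    finally:
        if exc:
            exit_(mgr, None, None, None)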

Maybe there's some way to rework your proposal into something that gets called to set up the loop, before and after the __next__ or expression test (with the after being passed the value and returning an optionally different value), and before and after each execution of the suite (the last two being very similar to what a context manager does). I don't see how any such thing could cause the suite to get executed in a process pool or in an isolated scope or any of your other motivating examples except the do...while simulator, but just because I'm not clever enough to see it doesn't mean you might not be.
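
To make that concrete, the hook version could be sketched today as nothing more than a generator wrapper -- every name below (managed_for, before_loop, after_next, and so on) is invented for illustration:

    def managed_for(manager, iterable):
        manager.before_loop()
        try:
            for value in iterable:
                value = manager.after_next(value)   # may replace the value
                manager.before_suite(value)
                try:
                    yield value                     # the suite runs in the caller
                finally:
                    manager.after_suite(value)
        finally:
            manager.after_loop()

    # used as:  for x in managed_for(mgr, range(20)): ...
    # which also shows why this can't push the suite into a process pool: the
    # suite still executes in the calling frame, one value at a time.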


