[Python-ideas] Decorators on loops
Enric Tejedor
enric.tejedor at bsc.es
Wed Jan 8 18:47:45 CET 2014
Hi,
> The biggest problem with that kind of magic is scoping. Look at this:
>
> def func():
>     best = 0
>     for x in range(10):
>         val = long_computation(x)
>         if val > best: best = val
>     return best
>
> (Granted, this can be done with builtins, but let's keep the example simple.)
>
> If the body of the loop becomes a new function, there needs to be a
> nonlocal directive to make sure 'best' references the outer one:
>
> def func():
>     best = 0
>     @parallelize(range(10))
>     def body(x):
>         nonlocal best
>         val = long_computation(x)
>         if val > best: best = val
>     return best
>
> This syntax would work, but it'll raise UnboundLocalError without the
> nonlocal declaration. Since Python tags non-local variables (as
> opposed to C-like languages, which tag local variables), there's no
> easy way to just add another scope and have it function invisibly. Any
> bit of magic that creates a local scope is going to cause problems in
> any but the simplest cases. Far better to force people to be explicit
> about it, and then the rules are clearer.
>
> Note that the parallelize decorator I use here would be a little
> unusual, in that it has to actually call the function (and in fact
> call it multiple times), and its return value is ignored. This would
> work, but it might confuse people, so you'd want to name it something
> that explains what's happening. It wouldn't be hard to write, in this
> form, though - it'd basically just pass the iterable to
> multiprocessing.Pool().map().
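
A minimal sketch of what that could look like, if I read you correctly (an assumption on my part: the loop body would have to be defined at module level so multiprocessing can pickle it):

from multiprocessing import Pool

def parallelize(iterable):
    def decorator(body):
        # Call body once per element, in worker processes; the
        # decorated "loop" runs immediately and its return value
        # is discarded.
        with Pool() as pool:
            pool.map(body, iterable)
        return body
    return decorator
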
>
> However, the example I give here wouldn't work (at least, I don't
> think it would) with multiprocessing, because external variable scopes
> would be duplicated, not shared, between processes. So once again,
> you'd have to write your code with parallelization in mind, rather
> than simply stick a decorator on a loop and have it fork out across
> processes.
>
Correct, this is indeed a problem. It would be tricky to make this work
in the general case.
In a simpler scenario, we could assume that iterations won't update the
same data.
As for the UnboundLocalError, the variables needed inside the loop could
be passed to the decorator and appear in the loop function's signature:
results = [0] * 10

@parallel(range(10), results)
def loop(i, results):
    results[i] = some_computation(i)
Then the decorator would be:
def parallel(*args):
    iterable = args[0]
    params = args[1:]
    def call(func):
        # create parallel invocations of func with iterable and params
        ...
    return call
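
For instance, the body of call could be filled in roughly like this, just as a sketch, using threads so that the in-place updates to the shared list stay visible to the caller (a process pool would operate on copies of results):

from concurrent.futures import ThreadPoolExecutor

def parallel(*args):
    iterable = args[0]
    params = args[1:]
    def call(func):
        # Invoke func once per element of the iterable, forwarding
        # the extra params to every call.
        with ThreadPoolExecutor(max_workers=4) as pool:
            futures = [pool.submit(func, i, *params) for i in iterable]
            for f in futures:
                f.result()   # re-raise any exception from the loop body
        return func
    return call

results = [0] * 10

@parallel(range(10), results)
def loop(i, results):
    results[i] = i * i   # stands in for some_computation(i)

# results is now [0, 1, 4, ..., 81]

Of course, with threads the speedup is limited by the GIL for CPU-bound pure-Python work; true process-based parallelism would need something like a multiprocessing.Manager().list() to share results.
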
I think this solution would work for things like performing independent
updates on a list.
Anyway, I now see the implications of such a construct for loops more
clearly. Thanks again for your feedback,
Enric
> ChrisA