[Python-ideas] adding an __exec__ method to context managers?
Sturla Molden
sturla at molden.no
Tue Oct 13 03:46:22 CEST 2009
I have been trying to implement an OpenMP-like threading API for Python.
For those who don't know OpenMP, it is an alternative threading API for
C, C++ and Fortran, that unlike pthreads and Win32 threads has an
intuitive syntax. For example, a parallel loop becomes:
    #pragma omp parallel for private(i)
    for (i=0; i<n; i++)
        /* whatever */
A simple pragma tells the compiler that the loop is parallel, and that the
counter is private to each thread. Calling multiple tasks in separate
threads is equally simple:
    #pragma omp parallel sections
    {
        #pragma omp section
        task1();
        #pragma omp section
        task2();
        #pragma omp section
        task3();
    }
Synchronization is taken care of using pragmas as well, for example:
    #pragma omp critical
    {
        /* synchronized with a mutex */
    }
The virtue is that one can take sequential code, add in a few pragmas
here and there, and end up having a multi-threaded program a human can
understand. All the mess that makes multi-threaded programs error-prone
and difficult to write is taken care of by the compiler.
Ok, so what has this to do with Python?
First, Python already has the main machinery for OpenMP-like syntax using
closures and decorators. For example, if we have a sequential function:
    def foobar():
        for i in range(100):
            dostuff1(i) # thread safe function
            dostuff2(i) # not thread safe function
We could imagine rewriting this using a magical module "pymp" (which
actually exists on my computer) as
    import pymp

    def foobar():
        @pymp.parallel_for
        def _(mt):
            for i in mt.iter:
                dostuff1(i)
                with mt.critical:
                    dostuff2(i)
        _(range(100))
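For readers who want to try the decorator form, here is a minimal sketch of how a parallel_for decorator of this shape could be built on the standard threading module. It is not the actual pymp implementation, which is not shown in this post. The helper class _Team, the num_threads parameter, and the work-sharing strategy (all threads pull items from one shared iterator) are assumptions made for illustration only.

```python
import threading


class _Team:
    """Per-call state shared by the worker threads (hypothetical helper)."""

    def __init__(self, iterable):
        self._source = iter(iterable)
        self._source_lock = threading.Lock()
        # A plain lock is enough to support "with mt.critical:".
        self.critical = threading.Lock()

    @property
    def iter(self):
        # Each thread gets its own generator, but all of them draw from
        # one shared source, so every item goes to exactly one thread.
        return self._shared_items()

    def _shared_items(self):
        while True:
            with self._source_lock:
                try:
                    item = next(self._source)
                except StopIteration:
                    return
            yield item


def parallel_for(body, num_threads=4):
    """Decorator (assumed design): run body(mt) in num_threads threads."""

    def run(iterable):
        mt = _Team(iterable)
        threads = [threading.Thread(target=body, args=(mt,))
                   for _ in range(num_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        # Note: exceptions raised inside worker threads are not re-raised
        # here; a real implementation would have to collect them.

    return run


results = []


@parallel_for
def _(mt):
    for i in mt.iter:
        square = i * i              # stands in for the thread-safe dostuff1
        with mt.critical:
            results.append(square)  # stands in for the unsafe dostuff2


_(range(100))
```

The order in which items are appended depends on thread scheduling, so only the set of results is deterministic, not their order.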
However, the closure is awkward, and it must be called with the iterable
at the end. It looks messy, which is unpythonic. Another case would be
parallel sections:
    def foobar():
        task1()
        task2()
        task3()

    import pymp

    def foobar():
        @pymp.parallel_sections
        def _(mt):
            mt.section(task1)
            mt.section(task2)
            mt.section(task3)
        _()
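The sections variant can be sketched the same way. Again, this is only a guess at one possible design, not the real pymp code: the helper class _Sections and the choice of one thread per registered task are assumptions. The decorated closure merely records the tasks, and the threads are started once it returns.

```python
import threading


class _Sections:
    """Collects tasks registered via mt.section(...) (hypothetical helper)."""

    def __init__(self):
        self.tasks = []

    def section(self, task, *args):
        self.tasks.append((task, args))


def parallel_sections(body):
    """Decorator (assumed design): run each registered task in a thread."""

    def run():
        mt = _Sections()
        body(mt)  # runs sequentially and only records the tasks
        threads = [threading.Thread(target=task, args=args)
                   for task, args in mt.tasks]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

    return run


done = set()
done_lock = threading.Lock()


def make_task(name):
    # Builds a stand-in for task1, task2, task3 that records its own name.
    def task():
        with done_lock:
            done.add(name)
    return task


@parallel_sections
def _(mt):
    mt.section(make_task("task1"))
    mt.section(make_task("task2"))
    mt.section(make_task("task3"))


_()
```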
It occurs to me that all this would be a lot cleaner if context managers
had an optional __exec__ method. It would receive the body as a code
object, together with the local and global dicts for controlled
execution. If it does not exist, something like this would be assumed:
    class ctxmgr(object):
        def __enter__(self): pass
        def __exit__(self, exc_type, exc_val, exc_tb): pass
        def __exec__(self, body, _globals, _locals):
            eval(body, _globals, _locals)
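Since no Python version calls __exec__, the proposed protocol can only be emulated by hand. The sketch below does that with a hypothetical helper, run_with, that takes the body as source text, compiles it, and hands the code object to __exec__ when the manager defines one. The names run_with, InThread and Plain are invented for this illustration. It also simplifies what the real with statement does: special methods are looked up on the instance rather than on the type, and the value returned by __enter__ is ignored.

```python
import sys
import threading


def run_with(manager, source, _globals, _locals):
    """Emulate 'with manager: <source>' under the proposed protocol."""
    body = compile(source, "<with-body>", "exec")
    manager.__enter__()
    try:
        custom_exec = getattr(manager, "__exec__", None)
        if custom_exec is None:
            exec(body, _globals, _locals)  # the proposed default behaviour
        else:
            custom_exec(body, _globals, _locals)
    except BaseException:
        # As with a real with statement, a true return value from
        # __exit__ suppresses the exception.
        if not manager.__exit__(*sys.exc_info()):
            raise
    else:
        manager.__exit__(None, None, None)


class InThread:
    """Hypothetical manager whose __exec__ runs the body in a new thread."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        return False

    def __exec__(self, body, _globals, _locals):
        worker = threading.Thread(target=exec,
                                  args=(body, _globals, _locals))
        worker.start()
        worker.join()


class Plain:
    """Manager without __exec__: the body runs in the calling thread."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        return False


caller_id = threading.get_ident()

ns_thread = {"threading": threading}
run_with(InThread(), "ident = threading.get_ident()", ns_thread, ns_thread)

ns_plain = {"threading": threading}
run_with(Plain(), "ident = threading.get_ident()", ns_plain, ns_plain)
```

After this runs, ns_plain["ident"] equals caller_id, while ns_thread["ident"] does not, because that body was executed by the worker thread.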
Now we could e.g. imagine using this syntax instead, using __exec__ to
control execution in threads:
    def foobar():
        with pymp.parallel_for( range(100) ) as mt:
            for i in mt.iter:
                dostuff1(i)
                with mt.critical:
                    dostuff2(i)

    def foobar():
        with pymp.parallel_sections as mt:
            with mt.section:
                task1()
            with mt.section:
                task2()
            with mt.section:
                task3()
Now it even looks cleaner than OpenMP pragmas in C.
This would be one use case for an __exec__ method; I am sure there are
others.
Is this idea worthy of a PEP? What do you think?
Regards,
Sturla Molden
P.S. Yes, I know about the GIL. It can be released by C extensions (incl.
Cython/Pyrex, f2py, ctypes). Python threads are perfectly fine for
coarse-grained parallelism in numerical code, with scalability and
performance close to GCC's OpenMP (I have tried). And AFAIK, IronPython
and Jython do not even have a GIL. Keep this out of the discussion please.