Adding an __exec__ method to context managers?

I have been trying to implement an OpenMP-like threading API for Python. For those who don't know OpenMP, it is an alternative threading API for C, C++ and Fortran that, unlike pthreads and Win32 threads, has an intuitive syntax. For example, a parallel loop becomes:

    #pragma omp parallel for private(i)
    for (i=0; i<n; i++)
        /* whatever */

A simple pragma tells the compiler that the loop is parallel, and that the counter is private to each thread. Running multiple tasks in separate threads is equally simple:

    #pragma omp parallel sections
    {
        #pragma omp section
        task1();
        #pragma omp section
        task2();
        #pragma omp section
        task3();
    }

Synchronization is taken care of using pragmas as well, for example:

    #pragma omp critical
    {
        /* synchronized with a mutex */
    }

The virtue is that one can take sequential code, add a few pragmas here and there, and end up with a multi-threaded program a human can understand. All the mess that makes multi-threaded programs error-prone and difficult to write is taken care of by the compiler.

Ok, so what does this have to do with Python? First, Python already has the main machinery for OpenMP-like syntax using closures and decorators. For example, if we have a sequential function:

    def foobar():
        for i in range(100):
            dostuff1(i) # thread safe function
            dostuff2(i) # not thread safe function

we could imagine rewriting it using a magical module "pymp" (which actually exists on my computer) as:

    import pymp

    def foobar():
        @pymp.parallel_for
        def _(mt):
            for i in mt.iter:
                dostuff1(i)
                with mt.critical:
                    dostuff2(i)
        _(range(100))

However, the closure is awkward, and it must be called with the iterable at the end. It looks messy, which is unpythonic. Another case would be parallel sections:

    def foobar():
        task1()
        task2()
        task3()

    import pymp

    def foobar():
        @pymp.parallel_sections
        def _(mt):
            mt.section(task1)
            mt.section(task2)
            mt.section(task3)
        _()

It occurs to me that all this would be a lot cleaner if context managers had an optional __exec__ method.
It would receive the body as a code object, together with the local and global dicts for controlled execution. If it does not exist, something like this would be assumed:

    class ctxmgr(object):
        def __enter__(self):
            pass
        def __exit__(self, exc_type, exc_val, exc_tb):
            pass
        def __exec__(self, body, _globals, _locals):
            eval(body, _globals, _locals)

Now we could e.g. imagine using this syntax instead, using __exec__ to control execution in threads:

    def foobar():
        with pymp.parallel_for(range(100)) as mt:
            for i in mt.iter:
                dostuff1(i)
                with mt.critical:
                    dostuff2(i)

    def foobar():
        with pymp.parallel_sections as mt:
            with mt.section:
                task1()
            with mt.section:
                task2()
            with mt.section:
                task3()

Now it even looks cleaner than OpenMP pragmas in C. This would be one use case for an __exec__ method; I am sure there are others. Is this an idea worthy of a PEP? What do you think?

Regards,
Sturla Molden

P.S. Yes, I know about the GIL. It can be released by C extensions (incl. Cython/Pyrex, f2py, ctypes). Python threads are perfectly fine for coarse-grained parallelism in numerical code, with scalability and performance close to GCC's OpenMP (I have tried). And AFAIK, IronPython and Jython do not even have a GIL. Please keep this out of the discussion.
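Since no interpreter support for this exists, one can get a rough feel for the proposed protocol by passing the body in by hand. Everything below (the RunTwice manager and the run_with driver) is hypothetical scaffolding standing in for the interpreter support the proposal would add; it is a sketch, not an actual implementation.

```python
# Hypothetical sketch of the proposed protocol: run_with() stands in for
# the interpreter support a real "with" statement would gain, and
# RunTwice is a toy manager whose __exec__ runs the body twice.
class RunTwice(object):
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        return False

    def __exec__(self, body, _globals, _locals):
        # Per the proposal, the body arrives as a compiled code object.
        exec(body, _globals, _locals)
        exec(body, _globals, _locals)

def run_with(mgr, body_source, namespace):
    # Emulate: with mgr: <body>  -- but routed through __exec__.
    code = compile(body_source, "<with-body>", "exec")
    mgr.__enter__()
    try:
        mgr.__exec__(code, namespace, None)
    finally:
        mgr.__exit__(None, None, None)

ns = {"hits": 0}
run_with(RunTwice(), "hits += 1", ns)
# ns["hits"] is now 2: the manager ran the block body twice.
```

The same shape, with __exec__ handing the code object to worker threads instead of running it twice, is what the pymp examples above would need.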

Sturla Molden skrev:
This would actually be even messier if we were to emulate OpenMP completely:

    def foobar():
        @pymp.parallel_sections
        def _(mt):
            @mt.section
            def _1(): task1()
            @mt.section
            def _2(): task2()
            @mt.section
            def _3(): task3()
            _1(); _2(); _3()
        _()
Indentation got messed up by Thunderbird :-( Another attempt:

    def foobar():
        with pymp.parallel_sections as mt:
            with mt.section:
                task1()
            with mt.section:
                task2()
            with mt.section:
                task3()

S.M.

This is definitely an idea that python-ideas has seen before. Just a couple of months ago, when the syntax of "with" was being changed to allow for "with a, b, c", this was kicked around as a possible improvement to the with statement.

Taking a step back, it seems like what you really want is some easy way to create callbacks, just like Ruby blocks or the new Objective-C blocks. There are a number of ways this could be done:

1. Some kind of multiline lambda. (This is generally considered to be unpythonic.)

2. Relaxing the restrictions on decorators so that, e.g., this is legal:

    @pymp.parallel_for(range(100))
    def _(mt):
        for i in mt.iter:
            dostuff1(i)
            with mt.critical:
                dostuff2(i)

Then you can have it auto-call itself and ignore the fact that _ will be the result (presumably None) and not a callable.

3. Some sort of "out of order operation" signal, as was batted around on the list a while back:

    result = pymp.parallel_for( ~~DEFINE_ME_NEXT~~, range(100))
    def DEFINE_ME_NEXT(mt):
        ...

There are many potential ways to spell that.

4. Some modification to the "with" statement, as you are proposing. The resistance that you will face with this idea is that it is significantly different from how "with" works now, since it does not create a block at all.

Frankly, I think this list is going to face proposals for some block substitute or another every couple of months between now and whenever Python finally allows for some more readable way of passing functions to other functions.

— Carl Johnson

Carl Johnson <cmjohnson.mailinglist@gmail.com> writes:
[…] Unless I misunderstand one or more of the options you present, you've omitted the most obvious way under current Python: create a function using ‘def’ and use the name to refer to that function object.
What is insufficiently readable about::

    def foo(spam):
        process_wibble(spam)
        process_wobble(spam)
        return process_warble(spam)

    bar(foo)

--
“I went to a garage sale. ‘How much for the garage?’ ‘It's not for sale.’” —Steven Wright
Ben Finney

Ben Finney:
The problem with that is in 3 or 4 months someone will come back to Python-ideas with another theory of how to replace it.

More seriously, it puts things out of order. If you had to write for-loops as

    def loop(item):
        process(item)
        # etc.
    for(loop, iterator)

it would be patently obvious that the for(loop, iterator) belongs at the top, not the bottom, so that you know *what* is being iterated before you find out *how* it's being iterated. For that matter, Python has changed the perfectly sensible:

    def method(cls):
        stuff()
    method = classmethod(method)

to:

    @classmethod
    def method(cls):
        stuff()

Why? Because it's more readable to have the decorator up top, so you know what kind of function/method to expect. So, in this particular case, I think it's more readable to have the conditions at the top instead of the bottom. It's very natural when something that starts as

    for i in range(100):
        dostuff1(i) # thread safe function
        dostuff2(i) # not thread safe function

becomes

    with parallelize(range(100)) as i:
        dostuff1(i) # thread safe function
        dostuff2(i) # not thread safe function

(or some other way of writing the condition at the top, such as a decorator, etc.) instead of

    def f(i):
        dostuff1(i) # thread safe function
        dostuff2(i) # not thread safe function
    parallelize(f, range(100))

with the loop condition at the bottom. For whatever reason, people find putting the conditions out of order onerous, and Python-ideas won't be free of periodic interruptions until there is some way to put conditions into their mental order.

— Carl

Carl Johnson <cmjohnson.mailinglist@gmail.com> writes:
So, in this particular case, I think it's more readable to have the conditions at the top instead of the bottom.
I'm not understanding “conditions” here. When thinking about programming languages, a “condition” is an expression evaluated in a boolean context. You seem to mean something different.
I'm not seeing how you got from the “start as” case to the latter cases. Where is the loop condition?

--
“Whatever a man prays for, he prays for a miracle. Every prayer reduces itself to this: ‘Great God, grant that twice two be not four.’” —Ivan Turgenev
Ben Finney

Carl Johnson wrote:
Ben Finney:
Part of the motivation for @ that you are missing was the desire not to write the function name three times, especially when the name is long, as is required in some contexts where functions must be wrapped to interface with external systems. The classmethod use case alone would not have pushed the addition.

tjr

Terry Reedy:
Yes, but if we go from using def to make a function and then giving that as a callback to using a multiline lambda, that's a drop from using the function name 2 times to using it 0 times: the same magnitude of a drop: -2 ! ;-D (Of course, I'm only mentioning this in jest, since multiline lambdas are unpythonic. I just wanted to point out that it's the same reduction in typing.) -- Carl

Carl Johnson writes:
I suspect all the blockheads <wink> will switch to Ruby before Python gets such a facility, given that lambda itself is considered an unfortunate un-Pythonic legacy by many, and lambda-with-suite quite beyond the pale. This use of with is the most plausible I've seen, though, I have to admit.

On Mon, Oct 12, 2009 at 11:09 PM, Carl Johnson <cmjohnson.mailinglist@gmail.com> wrote:
Hmm... I saw this as bigger than a block; it was creating a full execution context, not just a suite executed in a tweak of the current context. But I may have been reading too much into it. Would the __exec__ have its own globals and locals (which might default to a reference to, or a copy of, the current ones)?

To me, "with" says: "Take the current execution context, tweak it, run the following suite, then untweak the context". An __exec__ just strengthens the possible separation between the inner and outer contexts -- it may be slightly less efficient, but in return, it will be a better sandbox. OpenMP would be a special type of __exec__ that also happens to handle parallelization for you.

-jJ
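The sandbox angle can be made concrete with a toy manager whose __exec__ ignores the caller's namespaces entirely. The Sandbox class below is purely hypothetical and is driven by hand, since no interpreter support exists; a real design would still have to settle the globals/locals questions raised above.

```python
# Hypothetical sandboxing manager under the proposed protocol: __exec__
# runs the body against a fresh namespace with no builtins, exposing only
# names whitelisted at construction time.
class Sandbox(object):
    def __init__(self, **allowed):
        self.allowed = allowed
        self.result = {}

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        return False

    def __exec__(self, body, _globals, _locals):
        ns = {"__builtins__": {}}   # no open(), no __import__, ...
        ns.update(self.allowed)
        exec(body, ns)
        # Keep whatever names the body defined, for inspection afterwards.
        self.result = dict((k, v) for k, v in ns.items()
                           if k != "__builtins__")

# Driven by hand, since nothing desugars "with" into __exec__ yet:
sb = Sandbox(x=2)
sb.__exec__(compile("y = x * 21", "<body>", "exec"), None, None)

# The body cannot reach the file system: open() is simply not a name.
try:
    sb.__exec__(compile("open('/etc/passwd')", "<body>", "exec"), None, None)
    blocked = False
except NameError:
    blocked = True
```

Emptying __builtins__ is of course not a complete sandbox on its own, but it shows how __exec__ puts the manager, not the caller, in charge of the namespace the block sees.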

Hi Sturla! On Tue, 13 Oct 2009 03:46:22 +0200, Sturla Molden wrote:
[...] I had this *precise* conversation at SciPy'08 at length with Alex Martelli, and more briefly at SciPy'09 with Peter Norvig, because this is something I've been tossing around for quite a while.

I actually implemented something similar in IPython, but it is horribly brittle because it digs the parent context out by source introspection (using a custom exception to stop the execution flow at __enter__ time). It actually worked perfectly, but I know such hacks aren't robust enough to be trusted in real production code.

I am somewhat skeptical that this idea could really fly in the long haul, because of all the issues regarding exactly what gets passed in (for lots of neat things you might want the real sources and not just the code object), etc. Alex raised a number of detailed points in our conversation that I don't have at the top of my head right now, but I could try to think a bit harder about the details if need be.

However, recently I've found 'peace' with this topic by using a different approach that takes care of the problem you highlight in your post with having to call the _() function afterwards. In Python 3, with the new nonlocal keyword, this approach is actually very functional, and it solves lots of problems. It's simply a matter of having a decorator consume the called function directly. Rather than repeat all this, I'll point you to a page where I summarized the whole topic at a recent talk at our Berkeley Scientific Python users group (it also contains links to a detailed discussion on this topic we had on the ipython list):

https://cirl.berkeley.edu/fperez/py4science/decorators.html

My current thinking on this matter is that I'll use this approach for a while, to get a better feel for the possibilities. The syntax isn't ideal, but it's not horrible either, and it is quite flexible.
I think some real-world experience with this approach can teach us a lot, in order to later revisit the question with a really solid proposal for either a with extension like __exec__ or something else. I hope this is useful, thanks a lot for bringing this up here (I've discussed this *exact* idea multiple times with colleagues, but never had the energy to carry it further on-list due to being too swamped with other things). Cheers, f

Sturla Molden wrote:
I have been trying to implement an OpenMP-like threading API for Python.
This need has (obviously) also been discussed on the Cython list, and led to this write-up:

http://wiki.cython.org/enhancements/parallel

and this ticket:

http://trac.cython.org/cython_trac/ticket/211

The fact that this isn't an easy thing to decide (nor a major need, it seems) is reflected by the age of the wiki page (June 2008, still undecided) and the amount of similar discussion on c.l.py and cython-dev/cython-ideas. I actually think that OpenMP support makes a lot more sense for Cython code (which can happily free the GIL at any granularity) than for Python code.

Stefan

Stefan Behnel skrev:
I actually think that OpenMP support makes a lot more sense for Cython code (which can happily free the GIL at any granularity) than for Python code.
I am not talking about OpenMP support, I am talking about how we use Python threads. Another use case, BTW, which someone mailed me, is an improved sandbox for restricted execution. S.M.

Sturla Molden skrev:
I am not talking about OpenMP support, I am talking about how we use Python threads.
Which is to say that Java threads, which Python's threading module mimics, are a bad concurrency abstraction. They are error prone and difficult to use (they do not fit the programmer's mind). Another thing is that some of this is, to some extent, a compensation for the lack of os.fork on Windows. On Linux, one could call os.fork in __enter__ and achieve much the same effect as an __exec__ method. That is, the parent forks, raises an exception (jumps to __exit__), and calls os.waitpid there. The child executes the "with ctxmgr:" block and calls sys.exit (or os._exit) in __exit__. That kind of scheme would not work on Windows (well, there is Cygwin and SUA...). It would also be somewhat limited by the child and parent not sharing memory space. So I still think there is justification for an __exec__ method in context managers.

By the way: I tried to implement this using a bytecode hack a while ago. That is, spawn a thread and execute the code object from the calling stack frame in __enter__. It failed whenever the with block contained a loop, as there was some variable (I think it was called _1) that the interpreter could not find. And bytecode hacks are not particularly reliable or portable either.

S.M.
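The fork trick is worth pinning down, because the control flow is subtle: an exception raised in __enter__ would actually *skip* __exit__, not jump to it, so a portable variant lets both processes enter the block and guard on the pid, with the child leaving via os._exit in __exit__. The ForkChild class below is a hypothetical POSIX-only sketch of that variant, not anyone's actual code.

```python
import os
import tempfile

# POSIX-only sketch of a fork-based block. Both processes run the body
# and guard on the pid returned by __enter__; the child never returns
# from __exit__, and the parent waits for it there.
class ForkChild(object):
    def __enter__(self):
        self.pid = os.fork()
        return self.pid          # 0 in the child, child's pid in the parent

    def __exit__(self, exc_type, exc_val, exc_tb):
        if self.pid == 0:
            os._exit(0)          # child: leave without running parent code
        os.waitpid(self.pid, 0)  # parent: block until the child is done
        return False

# Usage: the child writes a file; the parent sees it after waitpid.
path = os.path.join(tempfile.mkdtemp(), "proof.txt")
with ForkChild() as pid:
    if pid == 0:
        with open(path, "w") as f:
            f.write("child ran")
with open(path) as f:
    message = f.read()
```

The pid guard in the body is exactly the noise that an __exec__ method would remove: with __exec__, the manager alone would decide which process (or thread) runs the code object, and the block could stay clean.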

Antoine Pitrou skrev:
I could show you a test I did on my laptop (dual core) a while ago:

http://folk.uio.no/sturlamo/kdtree/benchmark-27022009.png

The black line is scipy.spatial.cKDTree (Cython). The green line is scipy.spatial.cKDTree modified to use Python threads (GIL released whenever possible). The red line is scipy.spatial.cKDTree with some parts re-written in C, using OpenMP. This is hardly surprising, as Python threads are just native OS threads. The slightly reduced performance of Python threads probably comes from contention for the GIL in parts of the Cython code.

At least for numerical code, the heavy lifting is done in specialized C and Fortran libraries such as ATLAS/LAPACK, Intel MKL, FFTW, and MINPACK. Even for code we write completely ourselves, there will always be some performance-critical parts in Cython, C/C++ or Fortran. We can thus release the GIL around the worst bottlenecks, and use multi-threading in Python. The GIL only becomes an issue if you never release it.

S.M.
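The "release the GIL around the heavy lifting" pattern can be seen without any Fortran: CPython's own hashlib releases the GIL while hashing large buffers, so plain Python threads can overlap that C-level work on multiple cores. This is an illustrative sketch of the pattern, not the benchmark above.

```python
import hashlib
import threading

# Orchestrate in Python threads; the sha256 C code drops the GIL while
# hashing each large buffer, so the hash computations can run in parallel
# even though the surrounding Python code is serialized by the GIL.
def digest_into(buf, out, idx):
    out[idx] = hashlib.sha256(buf).hexdigest()

blocks = [bytes([n]) * 1000000 for n in range(4)]
results = [None] * 4
threads = [threading.Thread(target=digest_into, args=(b, results, i))
           for i, b in enumerate(blocks)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The list-index writes need no lock here because each thread owns a distinct slot; the expensive part (hashing a megabyte) is exactly the kind of GIL-free bottleneck the cKDTree experiment exploits.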

On Mon, Oct 12, 2009 at 7:46 PM, Sturla Molden <sturla@molden.no> wrote:
<snip>

Have you looked at the API for python-safethread [http://code.google.com/p/python-safethread/wiki/Branching]? I think an API combining that and your semantics would be very cool. If you use the same technique of passing functions into the context manager, your examples become:

    def foobar():
        with pymp.parallel_for(range(100)) as mt:
            for i in mt.iter:
                mt.add(lambda: dostuff1(i))
                mt.critical(lambda: dostuff2(i))

    def foobar():
        with pymp.parallel_sections as mt:
            mt.section(task1)
            mt.section(task2)
            mt.section(task3)

Another option would be to execute the closure under the covers, e.g.:

    # The function dostuff_in_parallel is called when necessary
    # by the pymp object.
    @pymp.parallel_for(range(100))
    def dostuff_in_parallel(mt):
        for i in mt.iter:
            dostuff1(i)
            with mt.critical:
                dostuff2(i)

    # The function sections is called when necessary by the pymp
    # object, e.g. by the parallel_sections decorator.
    @pymp.parallel_sections
    def sections(mt):
        mt.section(task1)
        mt.section(task2)
        mt.section(task3)

There are a couple of PEPs about adding blocks to Python that were rejected in favor of the more constrained with statement, so you may want to look at those as well.

Cheers,
--Ryan E. Freckleton

Ryan Freckleton skrev:
I am beginning to feel a little stupid. :-( Right... a decorator would always be called for a closure, not just once on module import. So I could make the function "def parallel_for(iterable)" return a decorator and auto-execute the closure. That would indeed give cleaner decorator syntax than calling the decorated closure manually.

S.M.
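The auto-executing decorator described here can be sketched in a few lines of ordinary threading code. The pymp-style names (parallel_for, mt.iter, mt.critical) follow the examples earlier in the thread, but this minimal implementation is a guess at the semantics, not the actual pymp module.

```python
import threading

# Per-thread handle mimicking the "mt" object from the earlier examples:
# mt.iter is this thread's slice of the work, mt.critical a shared lock.
class _MT(object):
    def __init__(self, chunk, lock):
        self.iter = chunk
        self.critical = lock

def parallel_for(iterable, num_threads=4):
    def decorator(body):
        items = list(iterable)
        lock = threading.Lock()
        # Deal the items round-robin across the worker threads.
        chunks = [items[n::num_threads] for n in range(num_threads)]
        threads = [threading.Thread(target=body, args=(_MT(c, lock),))
                   for c in chunks]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return None   # the closure has already been executed
    return decorator

results = []

@parallel_for(range(100))
def _(mt):
    for i in mt.iter:
        x = i * i                # "thread safe" work
        with mt.critical:        # synchronized section
            results.append(x)
```

By the time the decorator returns, all worker threads have run and joined, so the code after the def continues sequentially: exactly the OpenMP-like "parallel region" feel, minus the trailing _() call.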

It seems to me that the right way to support this kind of thing is decorators on suites:

    def sample_for():
        @parallel.for:
            for i in range(n):
                do_something(i)

    def sample_sections():
        @parallel.sections:
            @parallel.section:
                do_one()
            @parallel.section:
                do_two()
            @parallel.section:
                do_three()

Now there's a lot I don't know about Python internals that may make this more/less practical. The suite decorator would of course be able to do the same kinds of things that a function decorator can, so this is not just limited to the parallel example.

--- Bruce

2009/10/13 Bruce Leban <bruce@leapyear.org>:
You don't need decorators on suites - at a minimum you can use a defined function (possibly with a throwaway name like _):

    def sample_for():
        @parallel.for
        def _():
            for i in range(n):
                do_something(i)

You just need to modify parallel.for to call the newly defined function, as follows (pseudocode):

    def parallel.for(fn):
        ... old body of parallel.for ...
        ... assume it returns a function inner ...
        # instead of "return inner", do:
        inner()
        return None

Summary: you don't need decorators on bare blocks, just name the block. (OK, scope issues may impact this, but you didn't define how scope is handled in a "decorated block" in any case, so I choose to assume it introduces a new scope just like def does :-))

Paul.

participants (14): alex23, Antoine Pitrou, Ben Finney, Bruce Leban, Carl Johnson, Fernando Perez, geremy condra, Jim Jewett, Paul Moore, Ryan Freckleton, Stefan Behnel, Stephen J. Turnbull, Sturla Molden, Terry Reedy