Hey,<div><br></div><div>So far people have been enthusiastic about the cython.parallel features, so I think we should introduce some new ones. I propose the following, assuming parallel has been imported from cython:</div><div>
<br></div><div>with parallel.master():</div><div> this is executed in the master thread in a parallel (non-prange) section</div><div><br></div><div>with parallel.single():</div><div> same as master, except any thread may do the execution</div>
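<div><br></div><div>To make the intended semantics of master and single concrete, here is a rough pure-Python analogy using the threading module. It is only a sketch of the behaviour, not the proposed cython.parallel API: run_team, worker and the tid numbering are hypothetical names, and thread 0 is assumed to play the role of the master thread.</div>

```python
import threading


def run_team(num_threads=4):
    """Emulate 'master' and 'single' blocks for a team of threads (illustration only)."""
    barrier = threading.Barrier(num_threads)
    single_lock = threading.Lock()
    state = {"single_claimed": False}
    master_ran_on = []
    single_ran_on = []

    def worker(tid):
        # "master": only thread 0 executes the block.
        if tid == 0:
            master_ran_on.append(tid)
        barrier.wait()  # everyone waits at the end of the block

        # "single": whichever thread claims the block first executes it.
        with single_lock:
            run_it = not state["single_claimed"]
            state["single_claimed"] = True
        if run_it:
            single_ran_on.append(tid)
        barrier.wait()  # everyone waits at the end of the block

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return master_ran_on, single_ran_on
```

<div>Calling run_team(4) always reports thread 0 for the master block, while the single block is reported by exactly one thread, which may differ from run to run.</div>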
<div><br></div><div>An optional keyword argument 'nowait' specifies whether there will be a barrier at the end. The default is to wait.</div><div><br></div><div>with parallel.task():</div><div> create a task to be executed by some thread in the team</div>
<div> once a thread takes up the task it shall only be executed by that thread and no other thread (so the task will be tied to the thread)</div><div><br></div><div> C variables will be firstprivate</div><div> Python objects will be shared</div>
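<div><br></div><div>The data-sharing rules for tasks can be illustrated with a small pure-Python analogy. This is a sketch under assumptions, not the proposed API: a ThreadPoolExecutor stands in for the thread team, the argument passed to submit plays the role of a firstprivate C variable (copied when the task is created), and a list plays the role of a shared Python object. The names run_tasks and task are hypothetical.</div>

```python
from concurrent.futures import ThreadPoolExecutor, wait


def run_tasks(n_tasks=5, n_threads=3):
    """Create tasks that capture a loop counter by value and share one list."""
    shared_results = []  # shared between all tasks, like a Python object would be
    futures = []

    def task(i_snapshot):
        # i_snapshot is the value the counter had when the task was created.
        # Each task runs start to finish on the one thread that picked it up.
        shared_results.append(i_snapshot * i_snapshot)

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        i = 0
        while i < n_tasks:
            futures.append(pool.submit(task, i))  # value of i is captured here
            i += 1  # later changes to i do not affect tasks already created
        wait(futures)  # block until every task created above has finished
    return sorted(shared_results)
```

<div>The results are sorted before being returned because the order in which tasks finish is not deterministic.</div>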
<div><br></div><div>parallel.taskwait() # wait on any direct descendent tasks to finish</div><div><br></div><div>with parallel.critical():</div><div> this section of code is mutually exclusive with other critical sections</div>
<div> </div><div> optional keyword argument 'name' specifies a name for the critical section, </div><div> which means all sections with that name will exclude each other, but not</div><div> critical sections with different names</div>
<div><br></div><div> Note: all threads that encounter the section will execute it, just not at the same time</div><div><br></div><div>with parallel.barrier():</div><div> all threads wait until everyone has reached the barrier</div>
<div> either no one or everyone should encounter the barrier</div><div> shared variables are flushed</div><div><br></div><div>Unfortunately, gcc again manages to horribly break master and single constructs in loops (versions 4.2 through 4.6), so I suppose I'll first file a bug report. Other (better) compilers like Portland (and I'm sure Intel) work fine. I suppose a warning in the documentation will suffice there.</div>
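<div><br></div><div>Going back to the critical and barrier constructs described above, the following pure-Python analogy shows the behaviour I have in mind. It is a sketch, not the proposed implementation: critical, run and worker are hypothetical helpers, one threading.Lock per name stands in for a named critical section, and threading.Barrier stands in for the barrier.</div>

```python
import threading
from collections import defaultdict

_registry_guard = threading.Lock()
_named_locks = defaultdict(threading.Lock)


def critical(name="<unnamed>"):
    """Return the lock for a named critical section (hypothetical helper)."""
    with _registry_guard:
        return _named_locks[name]


def run(n_threads=4, n_iter=1000):
    totals = {"a": 0, "b": 0}
    barrier = threading.Barrier(n_threads)
    seen_after_barrier = []

    def worker():
        for _ in range(n_iter):
            # Every thread executes both sections, just never two threads
            # inside a section with the same name at the same time.
            with critical("a"):
                totals["a"] += 1
            with critical("b"):
                totals["b"] += 1
        # All threads wait here until everyone has arrived.
        barrier.wait()
        seen_after_barrier.append(totals["a"])

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals, seen_after_barrier
```

<div>Sections "a" and "b" use different locks, so a thread inside "a" does not block another thread inside "b". After the barrier every thread observes the final total, since no thread passes it before all updates are done.</div>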
<div><br></div><div>If we at some point implement vector/SIMD operations, we could also try out the Fortran OpenMP workshare construct.</div><div><br></div><div>What do you guys think?</div><div><br></div><div>Mark</div>