[Cython] CEP: prange for parallel loops
Pauli Virtanen
pav at iki.fi
Tue Apr 5 14:55:36 CEST 2011
Mon, 04 Apr 2011 21:26:34 +0200, mark florisson wrote:
[clip]
> For clarity, I'll add an example:
[clip]
How about making all the special declarations explicit? Automatic
inference of a variable's status has a problem: a small change in one
part of the code can have unintuitive non-local effects, because the
private/shared/reduction status of the variable then changes throughout
the whole function scope (if Python scoping is retained).
For example, with explicit declarations:
def f(np.ndarray[double] x, double alpha):
    cdef double alpha = 6.6
    cdef char *ptr = something()

    # Parallel variables are declared beforehand;
    # the exact syntax could also be something else
    cdef cython.parallel.private[int] tmp = 2, tmp2
    cdef cython.parallel.reduction[int] s = 0

    # Act like ordinary cdef outside prange(); in the prange they are
    # firstprivate if initialized or written to outside the loop anywhere
    # in the scope. Or, they could be firstprivate always, if this
    # has a negligible performance impact.
    tmp = 3

    with nogil:
        s = 9

        for i in prange(x.shape[0]):
            if cython.parallel.first_iteration(i):
                # whatever initialization; Cython is in principle allowed
                # to move this outside the loop, at least if it is
                # the first thing here
                pass

            # tmp2 is not firstprivate, as it's not written to outside
            # the loop body; it's also not lastprivate, as it's not
            # read outside the loop
            tmp2 = 99

            # Increment a private variable
            tmp += 2*tmp

            # Add stuff to reduction
            s += alpha*i

            # The following raise compilation errors -- the reduction
            # variable cannot be assigned to, and can only be operated on
            # with a single reduction operation inside prange
            s *= 9
            s = 8

            # It can be read, however, provided OpenMP supports this
            tmp = s

            # Assignment to non-private variables causes a compile-time
            # error; this avoids common mistakes, such as forgetting to
            # declare the reduction variable.
            alpha += 42
            alpha123 = 9
            ptr = 94

            # These, however, need to be allowed:
            # the users are on their own to make sure they don't clobber
            # non-local variables
            x[i] = 123
            (ptr + i)[0] = 123
            some_routine(x, ptr, i)
        else:
            # private variables are lastprivate if read outside the loop
            foo = tmp

        # The else: block can be added, but actually has no effect
        # as it is always executed --- the code here could as well
        # be written after the for loop
        foo = tmp  # <- same result

    with nogil:
        # Suppose Cython allowed cdef inside blocks with usual scoping
        # rules
        cdef cython.parallel.reduction[double] r = 0

        # the same variables can be used again in a second parallel loop
        for i in prange(x.shape[0]):
            r += 1.5
            s -= i
            tmp = 9

        # also the iteration variable is available after the loop
        count = i

    # As per usual Cython scoping rules
    return r, s
What did I miss here? As far as I can see, the above would have the same
semantics and scoping as a single-threaded Python implementation.
The only change required to make things parallel is replacing range() by
prange() and adding the variable declarations.
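To make the firstprivate/lastprivate/reduction wording above concrete, here
is a rough pure-Python simulation. It is only a sketch, not how Cython or
OpenMP would implement anything: no real threads are used, the names
sequential() and simulated_prange() are made up for illustration, and a
static schedule with contiguous chunks is assumed. It also assumes the
private variable is written before it is read in each iteration; a private
variable that carries a value from one iteration to the next (like the
tmp += 2*tmp line above) would not in general give the same result as the
sequential loop.

```python
def sequential(n, alpha):
    """Plain single-threaded loop, used as the reference result."""
    s = 9      # reduction variable, initialized outside the loop
    tmp = 3    # private variable, initialized outside the loop
    for i in range(n):
        tmp = 2 * i       # written before it is read in every iteration
        s += alpha * i    # the single reduction operation
    return s, tmp


def simulated_prange(n, alpha, num_threads=4):
    """Same loop, with each contiguous chunk standing in for one thread."""
    s = 9
    tmp = 3
    # Assumed static schedule: split range(n) into contiguous chunks.
    chunk = (n + num_threads - 1) // num_threads
    partials = []
    last_tmp = tmp
    for t in range(num_threads):
        start, stop = t * chunk, min((t + 1) * chunk, n)
        local_s = 0        # reduction copy starts at the identity of "+"
        local_tmp = tmp    # firstprivate: copy of the value set outside
        for i in range(start, stop):
            local_tmp = 2 * i
            local_s += alpha * i
            if i == n - 1:
                # lastprivate: keep the value from the final iteration
                last_tmp = local_tmp
        partials.append(local_s)
    # Combine the per-"thread" partial sums with the original value of s.
    s += sum(partials)
    return s, last_tmp


print(sequential(10, 0.5))        # (31.5, 18)
print(simulated_prange(10, 0.5))  # (31.5, 18)
```

With these restrictions both functions return the same pair for any chunk
count, which is the sense in which the parallel loop keeps the sequential
semantics.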
--
Pauli Virtanen