[SciPy-user] threading and scipy
Damian Eads
eads at soe.ucsc.edu
Sun May 18 18:02:32 EDT 2008
Hi there,
I am running some of my code through a very large data set on quad-core
cluster nodes. A simple grep confirms that most parts of Scipy (e.g.
linalg) do not use the Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS
macros (or the numpy equivalents).
[eads at pumpkin scipy]$ grep ALLOW_THREADS `find ~/work/repo/scipy -name
"*.[ch]*"` | grep -v "build" | sed 's/:.*//g' | sort | uniq
/home/eads/work/repo/scipy/scipy/sandbox/netcdf/_netcdf.c
/home/eads/work/repo/scipy/scipy/sandbox/netcdf/.svn/text-base/_netcdf.c.svn-base
/home/eads/work/repo/scipy/scipy/stsci/convolve/src/_lineshapemodule.c
/home/eads/work/repo/scipy/scipy/stsci/convolve/src/.svn/text-base/_lineshapemodule.c.svn-base
Numpy seems to have a lot more coverage, though.
[eads at pumpkin scipy]$ grep ALLOW_THREADS `find ~/work/numpy-1.0.4/numpy
-name "*.c"` | sed 's/:.*//g' | sort | uniq
/home/eads/work/numpy-1.0.4/numpy/core/blasdot/_dotblas.c
/home/eads/work/numpy-1.0.4/numpy/core/src/arrayobject.c
/home/eads/work/numpy-1.0.4/numpy/core/src/multiarraymodule.c
[eads at pumpkin scipy]$
Is it true that if my code is heavily dependent on Scipy (I do image
processing on large images with ndimage) and I use the %bg command in
IPython, most of the time only one thread will be running while the
others are blocked?
I anticipate others will insist that I read up on the caveats of
multi-threaded programming (mutual exclusion, locking, critical regions,
etc.), so I should mention that I am pretty seasoned with it, having
done quite a bit of work with pthreads. However, I am new to threading
in Python, and I have heard there are issues, specifically that only one
thread can hold the global interpreter lock at a time.
I would like to run some filters on 300 images. These filters change
from one iteration to the next of the program. When all the filtering is
finished, a single thread needs to see the result of all the computation
(all the result images) to compute some inter-image statistics. Then, I
start the process over. I'd really like to spawn 4+ threads, each
working on a different image.
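
To make the workflow concrete, here is a rough sketch of one iteration,
not working code from my project. The names apply_filter and
run_iteration are hypothetical, the pure-Python apply_filter only stands
in for an ndimage filter, and the tiny nested lists stand in for real
images. Whether the workers actually overlap depends on whether the
filter releases the global interpreter lock.

```python
from concurrent.futures import ThreadPoolExecutor


def apply_filter(image, scale):
    # Hypothetical stand-in for an ndimage filter: scales every pixel.
    return [[pixel * scale for pixel in row] for row in image]


def run_iteration(images, scale, n_workers=4):
    # Fan out: each worker thread filters one image at a time.
    # map() returns the results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(lambda img: apply_filter(img, scale), images))

    # Join: a single thread sees all result images and computes an
    # inter-image statistic (here simply the mean pixel value).
    total = sum(sum(sum(row) for row in img) for img in results)
    count = sum(len(row) for img in results for row in img)
    return results, total / count


# 300 tiny 2x2 "images" as placeholders for the real data.
images = [[[i, i + 1], [i + 2, i + 3]] for i in range(300)]
results, mean = run_iteration(images, scale=2)
```

In the real program the filter parameters (scale here) would change from
one iteration to the next, and run_iteration would be called in a loop.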
Since I don't see any code in ndimage that releases the global
interpreter lock, is it true that if I wrote code to spawn separate
filter threads, only one would execute at a time?
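
One way to check this empirically would be a timing experiment along
the following lines. This is only a sketch: time_threads and busy are
hypothetical names, and busy is a pure-Python CPU-bound loop standing in
for a call to the filter of interest. If four threads take roughly four
times as long as one, the work is being serialized; if they take about
the same time as one, the threads are running in parallel.

```python
import threading
import time


def time_threads(work, n_threads):
    # Start n_threads threads that each call work() once, and return
    # the wall-clock time until all of them have finished.
    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


def busy():
    # Hypothetical CPU-bound stand-in; replace the body with a call to
    # the filter being tested (for example, an ndimage filter).
    total = 0
    for i in range(200_000):
        total += i * i


t1 = time_threads(busy, 1)
t4 = time_threads(busy, 4)
ratio = t4 / t1
print("1 thread: %.3fs, 4 threads: %.3fs, ratio: %.2f" % (t1, t4, ratio))
```

The ratio is noisy on a loaded machine, so the workload should be large
enough that thread start-up time is negligible.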
Please advise.
Thank you!
Damian