[SciPy-user] threading and scipy

Damian Eads eads at soe.ucsc.edu
Sun May 18 18:02:32 EDT 2008


Hi there,

I am running some of my code over a very large data set on quad-core 
cluster nodes. A simple grep confirms that most parts of SciPy (e.g. 
linalg) do not use the Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS 
macros (or the NumPy equivalents).

[eads at pumpkin scipy]$ grep ALLOW_THREADS `find ~/work/repo/scipy -name "*.[ch]*"` | grep -v "build" | sed 's/:.*//g' | sort | uniq
/home/eads/work/repo/scipy/scipy/sandbox/netcdf/_netcdf.c
/home/eads/work/repo/scipy/scipy/sandbox/netcdf/.svn/text-base/_netcdf.c.svn-base
/home/eads/work/repo/scipy/scipy/stsci/convolve/src/_lineshapemodule.c
/home/eads/work/repo/scipy/scipy/stsci/convolve/src/.svn/text-base/_lineshapemodule.c.svn-base

NumPy seems to have a lot more coverage, though.

[eads at pumpkin scipy]$ grep ALLOW_THREADS `find ~/work/numpy-1.0.4/numpy -name "*.c"` | sed 's/:.*//g' | sort | uniq
/home/eads/work/numpy-1.0.4/numpy/core/blasdot/_dotblas.c
/home/eads/work/numpy-1.0.4/numpy/core/src/arrayobject.c
/home/eads/work/numpy-1.0.4/numpy/core/src/multiarraymodule.c

[eads at pumpkin scipy]$

Is it true that if my code is heavily dependent on SciPy (I do image 
processing on large images with ndimage) and I use the %bg command in 
IPython, then most of the time only one thread will be running, with 
the others blocked?

I anticipate others will insist that I read up on the caveats of 
multi-threaded programming (mutual exclusion, locking, critical regions, 
etc.), so I should mention that I am pretty seasoned with it, having 
done quite a bit of work with pthreads. However, I am new to threading 
in Python, and I have heard there are issues, specifically that only one 
thread can hold the global interpreter lock at a time.

I would like to run some filters on 300 images. These filters change 
from one iteration of the program to the next. When all the filtering is 
finished, a single thread needs to see the result of all the computation 
(all the result images) to compute some inter-image statistics. Then I 
start the process over. I'd really like to spawn 4+ threads, each 
working on a different image.
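
For concreteness, the fan-out/fan-in structure described above might be 
sketched roughly as follows. This is only an illustration under stated 
assumptions: apply_filter and run_iteration are hypothetical names, the 
images are tiny nested lists, and the placeholder filter stands in for a 
real ndimage call. It uses concurrent.futures from the modern Python 
standard library.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def apply_filter(image, gain):
    # Hypothetical placeholder filter: scale every pixel by `gain`.
    # In real use this would be replaced by an actual image filter.
    return [[gain * pixel for pixel in row] for row in image]


def run_iteration(images, gain, n_workers=4):
    # Fan out: one task per image, at most n_workers running at once.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(lambda img: apply_filter(img, gain), images))
    # Fan in: the pool has shut down, so the calling thread now sees
    # every result image and can compute an inter-image statistic.
    per_image_means = [mean(p for row in r for p in row) for r in results]
    return results, mean(per_image_means)


if __name__ == "__main__":
    images = [[[i, i + 1], [i + 2, i + 3]] for i in range(300)]
    # The filter parameters change from one iteration to the next.
    for gain in (1, 2, 3):
        _, stat = run_iteration(images, gain)
        print(gain, stat)
```

Whether the worker threads actually run in parallel depends on whether 
the filter releases the global interpreter lock; the structure itself is 
the same either way.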

Given that I don't see any code in ndimage that releases the global 
interpreter lock, is it true that if I wrote code to spawn separate 
filter threads, only one would execute at a time?
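
One rough way to check this empirically is to time the same amount of 
work run sequentially and then spread across threads. The sketch below 
is an assumption-laden illustration, not a rigorous benchmark: 
cpu_bound and measure_speedup are hypothetical names, and cpu_bound is a 
pure-Python loop that holds the lock on a standard CPython build. To 
test a particular filter, one would pass a function that calls it 
instead.

```python
import threading
import time


def cpu_bound(n=200_000):
    # Pure-Python loop: on a standard CPython build this holds the
    # global interpreter lock for essentially its whole duration.
    total = 0
    for i in range(n):
        total += i * i
    return total


def measure_speedup(work, n_threads=4):
    # Baseline: call `work` n_threads times, one after another.
    start = time.perf_counter()
    for _ in range(n_threads):
        work()
    sequential = time.perf_counter() - start

    # Same total work, but one call per thread.
    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    threaded = time.perf_counter() - start

    # Ratio of sequential to threaded wall-clock time.
    return sequential / threaded


if __name__ == "__main__":
    print("speedup with 4 threads:", measure_speedup(cpu_bound, 4))
```

A ratio close to 1 suggests the threads are being serialized by the 
lock; a ratio approaching the number of threads (up to the number of 
cores) suggests the work releases it.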

Please advise.

Thank you!

Damian


