[Neuroimaging] parallel computation of bundle_distances_mam/mdf ?
Stephan Meesters
stephan.meesters at gmail.com
Wed Dec 14 16:04:34 EST 2016
Hi,
The instructions at http://nipy.org/dipy/installation.html#openmp-with-osx are
outdated, since the clang-omp formula does not exist anymore.
Since the release of Clang 3.8.0 (08 Mar 2016), OpenMP 3.1 support ships in
Clang by default; you will need the -fopenmp=libomp flag while
building.
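As a quick illustration of that flag, here is a hypothetical smoke test (not part of the official DIPY instructions) that checks whether the installed clang can build OpenMP code:

```shell
# Write a tiny OpenMP program and try to build it with clang's libomp runtime.
# If the compile step fails, the installed clang predates 3.8 or lacks libomp.
cat > omp_check.c <<'EOF'
#include <omp.h>
#include <stdio.h>
int main(void) {
    #pragma omp parallel
    printf("hello from thread %d\n", omp_get_thread_num());
    return 0;
}
EOF
clang -fopenmp=libomp omp_check.c -o omp_check && ./omp_check \
    || echo "clang with OpenMP support not available"
```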
I started a while back on a DIPY homebrew formula to allow installation via
"brew install dipy", see
https://github.com/Homebrew/homebrew-python/pull/310. However, it was based
around the deprecated clang-omp and I didn't get around to fixing it. I might
look into it soon if I find some spare time to tweak the formula.
Regards,
Stephan
2016-12-14 17:14 GMT+01:00 Samuel St-Jean <stjeansam at gmail.com>:
> That also depends on which version of clang ships by default with OSX; in
> some cases you have to play around with it to get a newer one. I think
> OpenMP support starts with clang 3.7 (I only ever tried Mac OSX 10.9, so
> anyone more experienced can chime in), but everything older has to go
> through the Homebrew gcc install and company. It might be worthwhile to
> check whether OpenMP support is now available out of the box, and starting
> from which Mac OSX version, since older ones could be problematic for
> first-time installs.
>
> I also have to admit I have no idea how old is old in the Mac world, so
> maybe 10.9 is already phased out by now, but building things around it with
> Homebrew was a hard and time-consuming experience (and again, it was the
> first time I used a Mac, so I guess the average user would also run into
> some issues).
>
> Samuel
>
> 2016-12-14 16:51 GMT+01:00 Eleftherios Garyfallidis <elef at indiana.edu>:
>
>>
>> Hi Emanuele,
>>
>> My understanding is that OpenMP was only temporarily unavailable when
>> clang replaced gcc on OSX.
>>
>> So, I would suggest going ahead with OpenMP. Any current installation
>> issues are only temporary for OSX.
>> OpenMP gives us a lot of capability to play with shared memory, and it is
>> a standard that will be around
>> for a very long time. Also, the great integration with Cython makes the
>> algorithms really easy to read.
>> So, especially for this project, my recommendation is to use OpenMP rather
>> than multiprocessing. All the way! :)
>>
>> I am CC'ing Stephan, who wrote the instructions for OSX. I am sure he can
>> help you with this. I would also suggest
>> checking whether Xcode provides any new GUIs for enabling OpenMP. I
>> remember there was something for that.
>>
>> Laterz!
>> Eleftherios
>>
>>
>>
>>
>> On Wed, Dec 14, 2016 at 6:29 AM Emanuele Olivetti <olivetti at fbk.eu>
>> wrote:
>>
>>> Hi Eleftherios,
>>>
>>> Thank you for pointing me to the MDF example. From what I see the Cython
>>> syntax is not complex, which is good.
>>>
>>> My only concern is the availability of OpenMP on the systems where DiPy
>>> is used. On a reasonably recent GNU/Linux machine it seems straightforward
>>> to have libgomp and the proper version of gcc. On other systems - say OSX -
>>> the situation is less clear to me. According to what I read here
>>> http://nipy.org/dipy/installation.html#openmp-with-osx
>>> the OSX installation steps are not meant for standard end users. Are
>>> those instructions up to date?
>>> As a test of that, we've just tried to skip the steps described above
>>> and instead install gcc with conda on OSX ("conda install gcc"). In the
>>> process, conda installed the recent gcc-4.8 with libgomp, which seems good
>>> news. Unfortunately, when we tried to compile a simple example of Cython
>>> code using parallelization (see below), the process failed (fatal error:
>>> limits.h: No such file or directory).
>>>
>>> For the reasons above, I am wondering whether the very simple solution
>>> of using the "multiprocessing" module, available from the standard Python
>>> library, may be an acceptable first step towards the more efficient
>>> multithreading of Cython/libgomp. With "multiprocessing", there is no extra
>>> dependency on libgomp, a recent gcc, or anything else. Moreover,
>>> multiprocessing does not require Cython code, because it works on plain
>>> Python too.
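To illustrate that first step, here is a minimal standard-library sketch of a parallel pairwise distance matrix. The Euclidean row distance below is a stand-in for MAM/MDF, just to show the multiprocessing pattern; it is not DiPy code.

```python
# Minimal sketch: parallel pairwise distances using only the standard library.
# The Euclidean distance is a placeholder for bundle_distances_mam/mdf.
from multiprocessing import Pool

def row_of_distances(args):
    """Distances from one item of list A to every item of list B."""
    a, B = args
    return [sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5 for b in B]

def distance_matrix(A, B, processes=4):
    # Each worker computes one full row of the distance matrix.
    with Pool(processes=processes) as pool:
        return pool.map(row_of_distances, [(a, B) for a in A])

if __name__ == "__main__":
    A = [(0.0, 0.0), (1.0, 1.0)]
    B = [(0.0, 0.0), (3.0, 4.0)]
    print(distance_matrix(A, B))
```

Because each row is independent, this is the embarrassingly parallel structure the thread describes; the cost is inter-process serialization rather than shared memory.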
>>>
>>> Best,
>>>
>>> Emanuele
>>>
>>> ---- test.pyx ----
>>> from cython import parallel
>>> from libc.stdio cimport printf
>>>
>>> def test_func():
>>>     cdef int thread_id = -1
>>>     with nogil, parallel.parallel(num_threads=10):
>>>         thread_id = parallel.threadid()
>>>         printf("Thread ID: %d\n", thread_id)
>>> -----
>>>
>>> ----- setup.py -----
>>> from distutils.core import setup, Extension
>>> from Cython.Build import cythonize
>>>
>>> extensions = [Extension(
>>>     "test",
>>>     sources=["test.pyx"],
>>>     extra_compile_args=["-fopenmp"],
>>>     extra_link_args=["-fopenmp"],
>>> )]
>>>
>>> setup(
>>>     ext_modules=cythonize(extensions)
>>> )
>>> ----
>>> python setup.py build_ext --inplace
>>>
>>> On Tue, Dec 13, 2016 at 11:17 PM, Eleftherios Garyfallidis <
>>> elef at indiana.edu> wrote:
>>>
>>> Hi Emanuele,
>>>
>>> Here is an example of how we calculated the distance matrix in parallel
>>> (for the MDF) using OpenMP
>>> https://github.com/nipy/dipy/blob/master/dipy/align/bundlemin.pyx
>>>
>>> You can just add another function that does the same using MAM. It
>>> should be really easy to implement, as we have
>>> already done it for the MDF to speed up the SLR.
>>>
>>> Then we need to update the bundle_distances* functions to use the
>>> parallel versions.
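The pattern in bundlemin.pyx can be summarized with a Cython sketch. This is an illustrative assumption of how a parallel mam-style function might look; the function and variable names are hypothetical, not DIPY's actual API:

```cython
# Hypothetical sketch of an OpenMP-parallel distance matrix in Cython,
# in the spirit of dipy/align/bundlemin.pyx; not DIPY's actual code.
cimport cython
from cython.parallel import prange
from libc.math cimport sqrt

@cython.boundscheck(False)
@cython.wraparound(False)
def parallel_distance_matrix(double[:, ::1] A, double[:, ::1] B,
                             double[:, ::1] out):
    cdef Py_ssize_t i, j, k
    cdef double d, diff
    # prange releases the GIL and distributes rows across OpenMP threads.
    for i in prange(A.shape[0], nogil=True):
        for j in range(B.shape[0]):
            d = 0.0
            for k in range(A.shape[1]):
                diff = A[i, k] - B[j, k]
                d = d + diff * diff
            out[i, j] = sqrt(d)
```

Compiling such a module requires the -fopenmp (or -fopenmp=libomp with clang) compile and link flags, as in the setup.py shown earlier in the thread.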
>>>
>>> I'll be happy to help you with this. Let's try to schedule some time to
>>> look at this together.
>>>
>>> Best regards,
>>> Eleftherios
>>>
>>>
>>> On Mon, Dec 12, 2016 at 11:16 AM Emanuele Olivetti <olivetti at fbk.eu>
>>> wrote:
>>>
>>> Hi,
>>>
>>> I usually compute the distance matrix between two lists of streamlines
>>> using bundle_distances_mam() or bundle_distances_mdf(). When the lists are
>>> large, it is convenient and easy to exploit the multiple cores of the CPU,
>>> because such computation is intrinsically (embarrassingly) parallel. At the
>>> moment I'm doing it through the multiprocessing or joblib modules,
>>> because I cannot find a way to do it directly from DiPy, at least according
>>> to what I see in dipy/tracking/distances.pyx. But consider that I am not
>>> proficient in cython.parallel.
>>>
>>> Is there a preferable way to perform such parallel computation? I plan
>>> to prepare a pull request in future and I'd like to be on the right track.
>>>
>>> Best,
>>>
>>> Emanuele
>>>
>>> _______________________________________________
>>> Neuroimaging mailing list
>>> Neuroimaging at python.org
>>> https://mail.python.org/mailman/listinfo/neuroimaging