On Sun, 26 Jan 2014 10:52:46 +1100, Juan Nunez-Iglesias wrote:
I do remember these, but it's a worthwhile conversation to bring up every so often as tools mature and more people contribute. To be honest I know next to nothing about CUDA and GPU programming, I just see the success stories fly by. =)
Those are the only stories people like to tell, but they forget to mention that GPU implementations often need to be tweaked for each platform--and tweaks can be fairly extensive, including buffer and pool sizes, algorithm configurations, etc. It is not the kind of complication I would like to deal with--but, like you say, it is good to evaluate this from time to time.
Stéfan
On Sun, 26 Jan 2014 01:01:49 +0100 Stéfan van der Walt <stefan@sun.ac.za> wrote:
On Sun, 26 Jan 2014 10:52:46 +1100, Juan Nunez-Iglesias wrote:
I do remember these, but it's a worthwhile conversation to bring up every so often as tools mature and more people contribute. To be honest I know next to nothing about CUDA and GPU programming, I just see the success stories fly by. =)
Those are the only stories people like to tell, but they forget to mention that GPU implementations often need to be tweaked for each platform--and tweaks can be fairly extensive, including buffer and pool sizes, algorithm configurations, etc. It is not the kind of complication I would like to deal with--but, like you say, it is good to evaluate this from time to time.
Hi Stefan,
While I can confirm that the best performance can only be obtained with optimized kernels, our OpenCL implementation of SIFT is able to choose the right implementation at runtime, depending on the architecture and on the compute capabilities of the selected device ... Moreover, devices are becoming more and more clever (since Fermi, there is a memory cache!).
Cheers,
-- Jerome Kieffer <google@terre-adelie.org>
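For readers unfamiliar with the idea, here is a minimal sketch of what selecting an implementation at runtime from device properties might look like. It is plain Python with no GPU dependency. `DeviceInfo`, `choose_kernel_variant`, the variant names, and the thresholds are all hypothetical illustrations and are not taken from the SIFT implementation mentioned above; a real version would query these properties from the OpenCL runtime instead of constructing them by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceInfo:
    """Hypothetical summary of a compute device (not a real pyopencl type)."""
    device_type: str           # "CPU" or "GPU"
    max_work_group_size: int   # largest work-group the device accepts
    local_mem_bytes: int       # on-chip local memory available per work-group


def choose_kernel_variant(dev: DeviceInfo) -> dict:
    """Pick a kernel variant and work-group size for the given device.

    Variant names and thresholds are illustrative assumptions only.
    """
    if dev.device_type == "CPU":
        # Assumption: on CPUs a trivial work-group size is a safe default.
        return {"variant": "cpu", "work_group_size": 1}
    if dev.max_work_group_size >= 256 and dev.local_mem_bytes >= 32 * 1024:
        # Assumption: enough local memory to stage data on-chip.
        return {"variant": "gpu_local_mem", "work_group_size": 256}
    # Conservative fallback for small or unknown devices.
    return {"variant": "gpu_generic",
            "work_group_size": min(64, dev.max_work_group_size)}


if __name__ == "__main__":
    for dev in (DeviceInfo("GPU", 1024, 48 * 1024),
                DeviceInfo("GPU", 128, 16 * 1024),
                DeviceInfo("CPU", 8192, 32 * 1024)):
        print(dev, "->", choose_kernel_variant(dev))
```

The point of the sketch is only that the per-platform tuning discussed in this thread can be captured in a lookup made once per device; it says nothing about how much maintenance such a table needs in practice.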
On Sun, 26 Jan 2014 08:22:11 +0100, Jerome Kieffer wrote:
While I can confirm that the best performance can only be obtained with optimized kernels, our OpenCL implementation of SIFT is able to choose the right implementation at runtime, depending on the architecture and on the compute capabilities of the selected device ...
As with any technology, you require the appropriate technicians for maintenance. We do not have particularly strong GPU abilities in our group, and I don't like the idea of "outsourcing" scikit-image maintenance until the developers can all catch up.
Stéfan