Martin Ünsal <martinunsal <at> gmail.com> writes:
I was wondering if anyone has thought about accelerating NumPy with a GPU. For example, NVIDIA's CUDA SDK provides a feasible way to offload vector math onto the very fast SIMD processors available on the GPU. Currently GPUs primarily support single precision floats and are not fully IEEE compliant, but they could still be useful for some applications.
If there turns out to be a significant speedup over the CPU, this could be a very accessible way to do scientific and numerical computation on GPUs, much easier than coding directly against the GPU APIs.
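To make that concrete, here is a minimal sketch of the kind of offload I mean, written against the CUDA runtime API. The kernel name, block size, and constants are just for illustration:

// saxpy.cu - sketch: offload y = a*x + y onto the GPU.
// All names here are illustrative, not from any existing project.
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // one element per thread
    if (i < n)
        y[i] = a * x[i] + y[i];
}

int main(void)
{
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);
    float *hx = (float*)malloc(bytes), *hy = (float*)malloc(bytes);
    for (int i = 0; i < n; ++i) { hx[i] = 1.0f; hy[i] = 2.0f; }

    float *dx, *dy;
    cudaMalloc((void**)&dx, bytes);
    cudaMalloc((void**)&dy, bytes);
    cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy, bytes, cudaMemcpyHostToDevice);

    // 256 threads per block; enough blocks to cover all n elements
    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, dx, dy);

    cudaMemcpy(hy, dy, bytes, cudaMemcpyDeviceToHost);
    printf("y[0] = %f\n", hy[0]);  // expect 4.0

    cudaFree(dx); cudaFree(dy); free(hx); free(hy);
    return 0;
}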
Martin
I've thought about this too and think that it's a great idea. The existing library Brook, which has a similar programming model to NumPy, proves that it's feasible. And Brook was originally implemented with OpenGL and DirectX as backends to access the hardware. Needless to say, that's a lot harder than using CUDA.

In case it hasn't already been pointed out, CUDA ships with the cuBLAS and cuFFT libraries (see the first sketch below for what calling cuBLAS looks like). I don't know what the status of a LAPACK built on top of cuBLAS is, but I'd be surprised if someone isn't already working on it. Also, NVIDIA has stated that double-precision hardware will be available later this year, in case that's an issue for anyone.

I agree very much that this would make GPUs more accessible, although CUDA has already done an amazing job at that. I think the most helpful thing would be if it allowed us to code against the existing NumPy array interface in a way that automatically runs on the GPU in an optimized fashion - using shared memory and avoiding bank conflicts (see the second sketch below). I'd happily contribute to such a project if someone else got it started.
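For the cuBLAS point, host code calling it looks roughly like this. This is a minimal sketch against the CUDA 1.x cublas interface (compile with -lcublas); error checking is omitted:

// cublas_saxpy.c - sketch of using cuBLAS for y = a*x + y.
#include <stdio.h>
#include <stdlib.h>
#include "cublas.h"

int main(void)
{
    const int n = 1024;
    float *hx = (float*)malloc(n * sizeof(float));
    float *hy = (float*)malloc(n * sizeof(float));
    for (int i = 0; i < n; ++i) { hx[i] = 1.0f; hy[i] = 2.0f; }

    cublasInit();                                  // start the cuBLAS runtime

    float *dx, *dy;
    cublasAlloc(n, sizeof(float), (void**)&dx);    // device-side buffers
    cublasAlloc(n, sizeof(float), (void**)&dy);
    cublasSetVector(n, sizeof(float), hx, 1, dx, 1);
    cublasSetVector(n, sizeof(float), hy, 1, dy, 1);

    cublasSaxpy(n, 2.0f, dx, 1, dy, 1);            // y = 2*x + y on the GPU

    cublasGetVector(n, sizeof(float), dy, 1, hy, 1);
    printf("y[0] = %f\n", hy[0]);                  // expect 4.0

    cublasFree(dx); cublasFree(dy);
    cublasShutdown();
    free(hx); free(hy);
    return 0;
}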
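And for what I mean by "optimized": here's a sketch of the shared-memory trick, using a matrix transpose as the example (host-side launch boilerplate would look like the earlier sketch). The +1 padding on the tile is the standard way to keep the threads of a half-warp from hitting the same bank; the names and tile size are just for illustration:

// transpose.cu - sketch: staging through shared memory with
// padding to avoid bank conflicts. Assumes width and height
// are multiples of TILE to keep the sketch short.
#define TILE 16

__global__ void transpose(const float *in, float *out, int width, int height)
{
    // The +1 pads each row so that column accesses fall in
    // different banks instead of serializing on one.
    __shared__ float tile[TILE][TILE + 1];

    int x = blockIdx.x * TILE + threadIdx.x;
    int y = blockIdx.y * TILE + threadIdx.y;
    tile[threadIdx.y][threadIdx.x] = in[y * width + x];   // coalesced read

    __syncthreads();

    // Swap block indices so the global write is coalesced too.
    x = blockIdx.y * TILE + threadIdx.x;
    y = blockIdx.x * TILE + threadIdx.y;
    out[y * height + x] = tile[threadIdx.x][threadIdx.y]; // coalesced write
}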