Good luck making something faster than ndimage without considerable effort (I've already tried in Cython and it was 10x slower). I read the ndimage source, and it goes to great lengths to make optimal use of the CPU cache by allocating line buffers, etc. What we need is to find a fast open source routine, wrap it with Cython, and package it with the scikit. I wouldn't suggest using the one from OpenCV unless we are desperate, because it's implemented as a C++ filtering engine. However, they are using SSE2 intrinsics and it's fast! More than 10x faster than ndimage.
2011/4/19 Stéfan van der Walt <stefan@sun.ac.za>:
Apparently, neither of these routines likes integer images as inputs (this should be mentioned in the docs). Here's the output for float images:
To be more specific, scipy.ndimage.convolve does not upcast appropriately when convolving integer arrays with floating point arrays. This alone seems like a good enough reason to have our own version, although I agree that we should use the 1D separated filters.
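A rough sketch of what I mean, not a proposal for the final API: the helper name `convolve_separable` and the choice to always compute in float64 are my own assumptions for illustration. It shows that `scipy.ndimage.convolve` hands back an array of the input's dtype when no `output` is given, and how two `scipy.ndimage.convolve1d` passes on an upcast image give the same result as the full 2D convolution on a float image.

```python
import numpy as np
from scipy import ndimage


def convolve_separable(image, kernel_rows, kernel_cols, mode="reflect"):
    """Hypothetical helper: 2D convolution as two 1D passes, done in float64."""
    # Upcast first; ndimage otherwise returns an array of the input's dtype.
    image = np.asarray(image, dtype=np.float64)
    kernel_rows = np.asarray(kernel_rows, dtype=np.float64)
    kernel_cols = np.asarray(kernel_cols, dtype=np.float64)
    # First pass filters along axis 0, second pass along axis 1.
    tmp = ndimage.convolve1d(image, kernel_rows, axis=0, mode=mode)
    return ndimage.convolve1d(tmp, kernel_cols, axis=1, mode=mode)


int_image = np.arange(25, dtype=np.uint8).reshape(5, 5)
smooth = np.array([0.25, 0.5, 0.25])
kernel_2d = np.outer(smooth, smooth)

# Integer image, float kernel, no explicit output: the result stays uint8.
direct = ndimage.convolve(int_image, kernel_2d)

# Upcast and separated version, checked against the full 2D float convolution.
separable = convolve_separable(int_image, smooth, smooth)
reference = ndimage.convolve(int_image.astype(np.float64), kernel_2d)

print(direct.dtype, separable.dtype)
print(np.allclose(separable, reference))
```

Passing `output=np.float64` to `ndimage.convolve` is another way to get a float result; the sketch upcasts the input instead so that both 1D passes work on floats.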
It doesn't look as if there is an easy way to coerce numpy.convolve to do the job, so I guess I should write something in Cython?
Regards,
Stéfan