Dear Pythonistas,

We are porting the SIFT keypoint extraction algorithm (available from IPOL) to GPU using PyOpenCL. For the moment, keypoint location works and shows a speed-up of 5 to 10x over the C++ version, without any tuning so far.

A lot of work remains, especially:
* limiting the memory footprint (currently 700 MB for a 10 Mpix image)
* calculating the descriptor for each keypoint
* keypoint matching and image alignment
* finding the best interleaving of IO/CPU/GPU

but we have managed to port the trickiest part to OpenCL (without using textures, so it also runs on multi-core CPUs).

I would like to thank the people who published their algorithm on IPOL, which made unit testing possible.

Last but not least, the code is open source and should have a BSD licence (even if there is a patent on the algorithm in the USA).
https://github.com/pierrepaleo/sift_pyocl

Cheers,
-- 
Jérôme Kieffer <google@terre-adelie.org>
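As a rough reading of the memory figure quoted above: assuming the image data is held as 32-bit floats (an assumption, the message does not say so) and taking 1 MB as 10^6 bytes, 700 MB for a 10 Mpix image corresponds to about 17.5 full-resolution buffers. A minimal sketch of that arithmetic, using a hypothetical helper name:

```python
def full_resolution_buffers(footprint_bytes, n_pixels, bytes_per_pixel=4):
    """Express a memory footprint as a number of full-size image buffers.

    bytes_per_pixel=4 assumes float32 storage; this is an assumption made
    for illustration, not something stated in the thread.
    """
    return footprint_bytes / float(n_pixels * bytes_per_pixel)


# 700 MB for a 10 Mpix image, with 1 MB taken as 10**6 bytes
n_buffers = full_resolution_buffers(700e6, 10e6)
print("about %.1f full-resolution float32 buffers" % n_buffers)
```

The count says nothing about how the buffers are actually laid out (octaves are smaller than the input image), so it is only an order-of-magnitude indication.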
Hi Jérôme,

I cloned the repo and tried running test_all.py. It seems there are a couple of bugs in test_image_functions.py that prevent it from executing properly.

Is there an example somewhere that I can play with?

Cheers,
Marc

(In reply to Jerome Kieffer's message of Wednesday, June 12, 2013 10:51:02 PM UTC+2.)
Dear Marc,

On Fri, 14 Jun 2013 02:36:39 -0700 (PDT) Marc de Klerk <deklerkmc@gmail.com> wrote:
I cloned the repo and tried running test_all.py. It seems there are a couple of bugs in test_image_functions.py that prevent it from executing properly.
This is quite possible: we still have small differences in the number of keypoints compared with the C++ implementation. Moreover, the keypoint localization can vary by up to 1 pixel (to be multiplied by the octave number). This looks like a rounding error, but we have not tracked it down yet.
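The tolerance described above can be sketched as a comparison between two keypoint lists. This is only an illustration, not code from sift_pyocl: the function name, the (x, y, octave) tuple layout, and the reading of "multiplied by the octave number" as a tolerance of `base_tol * octave` with octaves counted from 1 are all assumptions.

```python
import math


def match_keypoints(ref, test, base_tol=1.0):
    """Pair reference keypoints with test keypoints within a tolerance.

    ref, test: lists of (x, y, octave) tuples, octave counted from 1
    (hypothetical layout). The allowed distance is base_tol * octave pixels.
    Returns (matched pairs, unmatched reference points, unmatched test points).
    """
    unused = list(test)
    matched, missed = [], []
    for kp in ref:
        x, y, octave = kp
        tol = base_tol * octave
        best, best_dist = None, None
        # Look for the closest still-unused test keypoint within tolerance
        for cand in unused:
            dist = math.hypot(cand[0] - x, cand[1] - y)
            if dist <= tol and (best_dist is None or dist < best_dist):
                best, best_dist = cand, dist
        if best is None:
            missed.append(kp)
        else:
            unused.remove(best)
            matched.append((kp, best))
    return matched, missed, unused


reference = [(10.0, 10.0, 1), (50.0, 50.0, 2), (100.0, 100.0, 1)]
candidate = [(10.5, 10.0, 1), (51.5, 50.0, 2), (103.0, 100.0, 1)]
matched, missed, extra = match_keypoints(reference, candidate)
print(len(matched), "matched,", len(missed), "missed,", len(extra), "extra")
```

A greedy nearest-neighbour pairing like this is quadratic in the number of keypoints, which is acceptable for a unit test but not for large images.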
Is there an example somewhere that I can play with?

Get the reference implementation:

    git clone --branch numpy git://github.com/kif/imageAlignment.git
    cd imageAlignment
    python setup.py build
    sudo python setup.py install  # or modify your PYTHONPATH
    cd ..
    git clone git://github.com/kif/sift_pyocl.git
    cd sift_pyocl/test
    python test_all.py  # I got (failures=2, errors=2), mainly because the API changed faster than the tests
    python crash.py

This should show you the keypoints (red and blue arrows represent the orientation and the scale; our errors are shown in green).

Tell me whether you are making progress (or not).

Cheers,
-- 
Jérôme Kieffer <google@terre-adelie.org>
Hi Marc,

We have fixed most of the tests under Linux + PyOpenCL with NVIDIA GPUs (Fermi and Kepler):
* With the AMD and Intel CPU drivers, some tests do not pass (but the library is functional and working).
* With NVIDIA GT200, a few kernels crash (but the library is functional and working).
* With the NVIDIA 9600, many kernels crash, but the library is able to use CPU kernels.
* With older NVIDIA cards, where atomic operations do not exist at all, there is no way to get it working.

The problem we encounter is that kernels designed for GPU do not behave properly on CPU, and vice versa.

This is completely untested on other platforms such as Windows or Mac OS X, and with ATI graphics cards, but feedback would be welcome.

To run the tests: run test/test_all.py
To have a small demo: run test/demo_match.py

There is comprehensive Sphinx documentation.

The repository should now be: https://github.com/kif/sift_pyocl

Cheers,
-- 
Jerome Kieffer <google@terre-adelie.org>
Hi Jerome.

Have you benched against vl_feat? They have pretty optimized C code, and I think it would be very interesting to see whether you are faster. Their code is also BSD, btw.

Cheers,
Andy
On Tue, 10 Sep 2013 09:43:24 +0200 Andreas Mueller <amueller@ais.uni-bonn.de> wrote:
Hi Jerome. Have you benched against vl_feat? They have pretty optimized C code, and I think it would be very interesting to see whether you are faster. Their code is also BSD, btw.
Hello Andreas,

Thanks for the link. This is the first time I have used MATLAB, so the comparison is likely to be unfair:

    I = imread(fullfile(vl_root,'data','roofs1.jpg')) ;
    t=cputime;[f,d] = vl_sift(single(rgb2gray(I))) ;e=cputime-t
    e=0.9600

Under Python:

    In [1]: import scipy.misc,sift
    In [2]: img = scipy.misc.imread("roofs1.jpg")
    In [3]: sift_gpu = sift.SiftPlan(template=img,devicetype="GPU")
    In [4]: %timeit kp = sift_gpu.keypoints(img)
    10 loops, best of 3: 87 ms per loop
    In [5]: sift_cpu = sift.SiftPlan(template=img,devicetype="CPU") #selects Intel driver
    In [6]: %timeit kp = sift_cpu.keypoints(img)
    1 loops, best of 3: 216 ms per loop
    In [7]: sift_cpu_amd = sift.SiftPlan(template=img,device=(1,0)) #selects AMD driver, computer specific
    In [8]: %timeit kp = sift_cpu_amd.keypoints(img)
    1 loops, best of 3: 225 ms per loop

The computer is a dual Intel(R) Xeon(R) CPU E5-2643 0 @ 3.30GHz (fast), but with a moderate graphics card (Quadro 2000). On my GeForce Titan (GK110) I got:

    10 loops, best of 3: 38.7 ms per loop

A rough and unfair comparison would say our code is 25x faster, but the test was made on a rather small image (640x478), which does not allow the GPU to show its full speed. On the other hand, if the code had run on my computer, it would have been worse (my CPU is only 2.2GHz).

You say vl_feat is optimized, but it looks slower than the IPOL implementation wrapped in Python (https://github.com/kif/imageAlignment):

    In [9]: import feature
    In [10]: %timeit feature.sift_keypoints(img.max(axis=-1))
    1 loops, best of 3: 687 ms per loop

Comments are welcome.

Cheers,
-- 
Jerome Kieffer <google@terre-adelie.org>
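The "25x" figure can be checked from the timings quoted in the message above. The short script below only redoes that arithmetic; the dictionary labels are made up for readability, and the ratios remain indicative at best, since the runs used different timers (cputime versus %timeit) and, for the Titan, a different machine.

```python
# Timings quoted in the message above, converted to seconds.
# The labels are illustrative names, not identifiers from any library.
TIMINGS = {
    "vl_sift (MATLAB, cputime)": 0.960,
    "IPOL C++ wrapped in Python": 0.687,
    "sift_pyocl, AMD CPU driver": 0.225,
    "sift_pyocl, Intel CPU driver": 0.216,
    "sift_pyocl, Quadro 2000": 0.087,
    "sift_pyocl, GeForce Titan": 0.0387,
}
BASELINE = "vl_sift (MATLAB, cputime)"


def speedup(name, baseline=BASELINE):
    """Return how many times faster `name` is than `baseline`."""
    return TIMINGS[baseline] / TIMINGS[name]


# Print slowest first, with the speed-up relative to vl_sift
for name in sorted(TIMINGS, key=TIMINGS.get, reverse=True):
    print("%-30s %6.1f ms  %5.1fx" % (name, TIMINGS[name] * 1e3, speedup(name)))
```

Passing `baseline="IPOL C++ wrapped in Python"` to `speedup` gives the ratio against the IPOL reference implementation instead.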
participants (4)
- Andreas Mueller
- Jerome Kieffer
- Jérôme Kieffer
- Marc de Klerk