
I had another look at the definition of "ones" and of another routine I
frequently use: arange. It appears that even without rewriting them in C,
some speedup can be achieved:

- in ones(), the + 1 should be done "in place", saving about 15%, more if
  you run out of processor cache:

  amigo[167]~%3% /usr/local/bin/python test_ones.py
  Numeric.ones       10 ->    0.098ms
  Numeric.ones      100 ->    0.103ms
  Numeric.ones     1000 ->    0.147ms
  Numeric.ones    10000 ->    0.830ms
  Numeric.ones   100000 ->   11.900ms
  Numeric.zeros      10 ->    0.021ms
  Numeric.zeros     100 ->    0.022ms
  Numeric.zeros    1000 ->    0.026ms
  Numeric.zeros   10000 ->    0.290ms
  Numeric.zeros  100000 ->    4.000ms
  Add inplace        10 ->    0.091ms
  Add inplace       100 ->    0.094ms
  Add inplace      1000 ->    0.127ms
  Add inplace     10000 ->    0.690ms
  Add inplace    100000 ->    8.100ms
  Reshape 1          10 ->    0.320ms
  Reshape 1         100 ->    0.436ms
  Reshape 1        1000 ->    1.553ms
  Reshape 1       10000 ->   12.910ms
  Reshape 1      100000 ->  141.200ms

  Also notice that zeros() is 4-5 times faster than ones(), so it may pay
  to reimplement ones in C as well (it is used in indices() and arange()).
  The "resize 1" alternative (the "Reshape 1" rows above) is much slower.

- in arange, an additional 10% can be saved by adding brackets around
  (start+(stop-stop)) (in addition to the gain from the faster "ones"):

  amigo[168]~%3% /usr/local/bin/python test_arange.py
  Numeric.arange     10 ->    0.390ms
  Numeric.arange    100 ->    0.410ms
  Numeric.arange   1000 ->    0.670ms
  Numeric.arange  10000 ->    4.100ms
  Numeric.arange 100000 ->   59.000ms
  Optimized          10 ->    0.340ms
  Optimized         100 ->    0.360ms
  Optimized        1000 ->    0.580ms
  Optimized       10000 ->    3.500ms
  Optimized      100000 ->   48.000ms

Regards,

Rob Hooft

-- 
=====   rob@hooft.net          http://www.hooft.net/people/rob/  =====
=====   R&D, Nonius BV, Delft  http://www.nonius.nl/             =====
===== PGPid 0xFA19277D ========================== Use Linux! =========
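
The two changes described in this message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Numeric source: NumPy stands in for the older Numeric module, the function names (ones_copy, ones_inplace, arange_unbracketed, arange_bracketed) are made up for this example, and the arange body assumes the result is built by accumulating over ones().

```python
import math

import numpy as np  # assumption: NumPy as a stand-in for the older Numeric module


def ones_copy(n):
    """zeros() + 1: the addition allocates a second array for the result."""
    return np.zeros(n, dtype=int) + 1


def ones_inplace(n):
    """Same values, but the addition writes back into the zeros() buffer."""
    a = np.zeros(n, dtype=int)
    np.add(a, 1, out=a)  # in-place add, no temporary array
    return a


def arange_unbracketed(start, stop, step=1):
    """Hypothetical arange built on ones(), without the extra brackets."""
    n = int(math.ceil((stop - start) / float(step)))
    base = np.add.accumulate(ones_inplace(n)) - 1  # 0, 1, ..., n-1
    # Evaluated left to right: two additions over the whole array.
    return base * step + start + (stop - stop)


def arange_bracketed(start, stop, step=1):
    """Same computation with brackets around (start+(stop-stop))."""
    n = int(math.ceil((stop - start) / float(step)))
    base = np.add.accumulate(ones_inplace(n)) - 1  # 0, 1, ..., n-1
    # The scalars are combined first, leaving one addition over the array.
    return base * step + (start + (stop - stop))
```

Both variants of each function return the same values; they differ only in how many temporary arrays and array-wide operations are needed. Whether the savings match the figures above on a given machine would have to be measured, for instance with the timeit module.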