
Hi,

I'm doing some timings on the array indexing feature. It's quite surprising to me that, even though empty() and arange() are faster in scipy_core than their counterparts in numarray:

In [110]: t1 = timeit.Timer('a=empty(shape=10000);a=arange(10000)','from scipy.base import empty, arange')
In [111]: t1.repeat(3,10000) Out[111]: [0.74018502235412598, 0.76141095161437988, 0.71947312355041504]
In [112]: t2 = timeit.Timer('a=array(None,shape=10000);a=arange(10000)','from numarray import array, arange')
In [113]: t2.repeat(3,10000) Out[113]: [2.3724348545074463, 2.4109888076782227, 2.3820669651031494]

however, the next code seems to be slower in scipy_core:

In [114]: t3 = timeit.Timer('a=empty(shape=10000);a[arange(10000)]','from scipy.base import empty, arange')
In [115]: t3.repeat(3,1000) Out[115]: [3.5126161575317383, 3.5309510231018066, 3.5558919906616211]
In [116]: t4 = timeit.Timer('a=array(None,shape=10000);a[arange(10000)]','from numarray import array, arange')
In [117]: t4.repeat(3,1000) Out[117]: [2.0824751853942871, 2.1258058547973633, 2.0946059226989746]

It seems as if the index array feature could be further optimized in scipy_core.

--
Francesc Altet  http://www.carabos.com/
Cárabos Coop. V.  Enjoy Data
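
For reference, the same comparison can be collected into a plain script and run outside IPython; every Timer call below is taken verbatim from the session above, only the print calls are added.

import timeit

# Creation: empty()/arange() in scipy_core vs. array(None)/arange() in numarray
t1 = timeit.Timer('a=empty(shape=10000);a=arange(10000)',
                  'from scipy.base import empty, arange')
t2 = timeit.Timer('a=array(None,shape=10000);a=arange(10000)',
                  'from numarray import array, arange')

# 1-d fancy indexing with an integer index array
t3 = timeit.Timer('a=empty(shape=10000);a[arange(10000)]',
                  'from scipy.base import empty, arange')
t4 = timeit.Timer('a=array(None,shape=10000);a[arange(10000)]',
                  'from numarray import array, arange')

print(t1.repeat(3, 10000))   # scipy_core creation
print(t2.repeat(3, 10000))   # numarray creation
print(t3.repeat(3, 1000))    # scipy_core 1-d indexing
print(t4.repeat(3, 1000))    # numarray 1-d indexing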

Francesc Altet wrote:
Hi,
In [114]: t3 = timeit.Timer('a=empty(shape=10000);a[arange(10000)]','from scipy.base import empty, arange')
In [115]: t3.repeat(3,1000) Out[115]: [3.5126161575317383, 3.5309510231018066, 3.5558919906616211]
In [116]: t4 = timeit.Timer('a=array(None,shape=10000);a[arange(10000)]','from numarray import array, arange')
In [117]: t4.repeat(3,1000) Out[117]: [2.0824751853942871, 2.1258058547973633, 2.0946059226989746]
It seems as if the index array feature could be further optimized in scipy_core.
Thank you very much for your timings. It is important to get things as fast as we can, especially with all the good code in numarray to borrow from. I am really committed to making scipy_core as fast as it can be. I believe you are right about the indexing; I will look more closely at this. The indexing code is still "first-cut" and has not received any optimization attention. It would be good to look at 2- and 3-D timings, however, to see whether the speed-up here is a 1-d optimization that scipy_core is not doing.

Best regards,
-Travis
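
A minimal sketch of what such 2- and 3-D timings could look like, following the same timeit pattern used above; the shapes and repeat counts are illustrative choices, not numbers from the thread.

import timeit

# 2-D fancy indexing: two integer index arrays pick the diagonal of a 1000x1000 array
t2d = timeit.Timer('a=empty(shape=(1000,1000));a[arange(1000),arange(1000)]',
                   'from scipy.base import empty, arange')

# 3-D fancy indexing: three index arrays pick the space diagonal of a 100x100x100 array
t3d = timeit.Timer('a=empty(shape=(100,100,100));a[arange(100),arange(100),arange(100)]',
                   'from scipy.base import empty, arange')

print(t2d.repeat(3, 100))
print(t3d.repeat(3, 100))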

Francesc Altet wrote:
however, the next code seems to be slower in scipy_core:
In [114]: t3 = timeit.Timer('a=empty(shape=10000);a[arange(10000)]','from scipy.base import empty, arange')
In [115]: t3.repeat(3,1000) Out[115]: [3.5126161575317383, 3.5309510231018066, 3.5558919906616211]
In [116]: t4 = timeit.Timer('a=array(None,shape=10000);a[arange(10000)]','from numarray import array, arange')
In [117]: t4.repeat(3,1000) Out[117]: [2.0824751853942871, 2.1258058547973633, 2.0946059226989746]
It seems as if the index array feature could be further optimized in scipy_core.
I just did some simple tests. It looks like 2-d indexing is quite a bit faster in scipy_core. Try these:

t4 = timeit.Timer('a=array(None,shape=(1000,1000));a[arange(1000),arange(1000)]','from numarray import array, arange')
t3 = timeit.Timer('a=empty(shape=(1000,1000));a[arange(1000),arange(1000)]','from scipy.base import empty, arange')

My results:
t3.repeat(3,100) [0.18409419059753418, 0.19265508651733398, 0.18711185455322266]
t4.repeat(3,100) [4.0139532089233398, 3.9884538650512695, 4.405332088470459]
However, as you noticed, 1-d indexing is a bit slower. If you use flat indexing (which is special-cased), it is faster. Thus:

t4 = timeit.Timer('a=array(None,shape=10000);a[arange(10000)]','from numarray import array, arange')
t3 = timeit.Timer('a=empty(shape=10000);a.flat[arange(10000)]','from scipy.base import empty, arange')

Gives:
t3.repeat(3,100) [0.18227100372314453, 0.16614699363708496, 0.16269397735595703]
t4.repeat(3,100) [0.40496301651000977, 0.34369301795959473, 0.3347930908203125]
Thus, I think it might be wise to use the flattened indexing code when the array is already 1-d. I could add this in automatically.

-Travis
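
Until such an automatic special case is in place, a user-level workaround along these lines seems possible; take1d is a hypothetical helper, not part of scipy_core or numarray, that simply routes 1-d integer-array indexing through the special-cased .flat path.

# Hypothetical helper (not part of scipy_core): use the special-cased
# flat indexing path whenever the array is already 1-d.
def take1d(a, idx):
    if len(a.shape) == 1:
        return a.flat[idx]   # special-cased path, faster in the timings above
    return a[idx]            # general fancy-indexing code for higher dimensions

For a 1-d array the two branches return the same values, so this only changes which indexing path does the work.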

Francesc Altet wrote:
Hi,
I'm doing some timings on the array indexing feature. It's quite surprising to me that, even though empty() and arange() are faster in scipy_core than their counterparts in numarray:
In [110]: t1 = timeit.Timer('a=empty(shape=10000);a=arange(10000)','from scipy.base import empty, arange')
In [111]: t1.repeat(3,10000) Out[111]: [0.74018502235412598, 0.76141095161437988, 0.71947312355041504]
In [112]: t2 = timeit.Timer('a=array(None,shape=10000);a=arange(10000)','from numarray import array, arange')
In [113]: t2.repeat(3,10000) Out[113]: [2.3724348545074463, 2.4109888076782227, 2.3820669651031494]
however, the next code seems to be slower in scipy_core:
In [114]: t3 = timeit.Timer('a=empty(shape=10000);a[arange(10000)]','from scipy.base import empty, arange')
In [115]: t3.repeat(3,1000) Out[115]: [3.5126161575317383, 3.5309510231018066, 3.5558919906616211]
In [116]: t4 = timeit.Timer('a=array(None,shape=10000);a[arange(10000)]','from numarray import array, arange')
In [117]: t4.repeat(3,1000) Out[117]: [2.0824751853942871, 2.1258058547973633, 2.0946059226989746]
I added a special case for 1-d indexing that goes through the same code path as a.flat would. The result seems to show a nice speed-up for your test case. This is not to say that the indexing code could not be made faster, but that would require more study. Right now, the multidimensional indexing code is fairly clean because it uses the abstraction of an iterator (which also makes it hard to figure out how to make it faster). I was curious to see how fast that iterator-based approach really is, and the results on 2-d indexing are encouraging: they show the scipy_core code to be faster than the 2-d indexing of numarray (as far as I can tell). Of course, these things can usually be made better, so I'm hesitant to say we've arrived.

-Travis
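
A quick way to check the effect of that special case is to re-run the original 1-d comparison against both plain indexing and the explicit .flat path on the same build; the two Timer calls below are the ones already used earlier in the thread, only the repeat counts and variable names are mine.

import timeit

# Plain 1-d fancy indexing (now special-cased) vs. the explicit .flat path
t_plain = timeit.Timer('a=empty(shape=10000);a[arange(10000)]',
                       'from scipy.base import empty, arange')
t_flat = timeit.Timer('a=empty(shape=10000);a.flat[arange(10000)]',
                      'from scipy.base import empty, arange')

print(t_plain.repeat(3, 1000))
print(t_flat.repeat(3, 1000))
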
participants (2)
- Francesc Altet
- Travis Oliphant