Power optimization branch merged to Numpy trunk

Since Tim and I are now fiddling around with numexpr, I've merged what we've got with the power optimization to the Numpy trunk. This includes:

* x**2 is _as_fast_ as x*x now
* optimization of x**n for x as an array and n as a Python scalar.
* x**2, x**(-1), x**0 are optimized to use the new square, reciprocal, and ones_like ufuncs
* complex integer powers should be faster

Also, PyArray_FromAny has been speeded up when converting Python scalars. A side effect of this change is that the array protocol won't be looked for on subclasses of Python scalars (int, long, float, complex). I don't expect that to be a problem.

Some things we hadn't got around to:

- integer arrays to a power (currently, these use pow())
- more optimization of integer powers

enjoy the speed :-)

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
|cookedm@physics.mcmaster.ca
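The special cases listed above can be checked for numerical agreement with a short script. This is only a sketch: the helper name `check_power_shortcuts` and the sample array are made up for illustration, and the script shows that the results match, not which code path a given NumPy build takes internally.

```python
import numpy as np


def check_power_shortcuts(x):
    """Compare each special-cased power with its explicit equivalent.

    Hypothetical helper for illustration; returns a dict of booleans.
    """
    return {
        "x**2 == x*x": bool(np.allclose(x**2, x * x)),
        "x**2 == square(x)": bool(np.allclose(x**2, np.square(x))),
        "x**(-1) == reciprocal(x)": bool(np.allclose(x**(-1), np.reciprocal(x))),
        "x**0 == ones_like(x)": bool(np.array_equal(x**0, np.ones_like(x))),
    }


# Assumed sample data: a float array with no zeros, so the reciprocal is finite.
x = np.linspace(0.5, 4.0, 8)
results = check_power_shortcuts(x)
for name, ok in results.items():
    print("%-26s %s" % (name, ok))
```

A float array is used on purpose: integer arrays raised to a power are listed above as not yet optimized, and they follow different rules for negative exponents.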

Since you are mentioning some improvements (I have confirmed them!), I would like to ask what code you are using for benchmarking numpy.

I wanted to know which was faster to compute: sum(|x|)/N or sqrt(sum(x**2))/sqrt(N)

I wrote a timeit test and discovered that using dot(x,x) to compute sum(x**2) gives the best result and is one order of magnitude faster than sum(|x|). I found that sum and absolute are each approx. 5 times slower than dot. I also noticed that the std function is slightly slower than an equivalent implementation written in Python.

These are the results for some operations on a randn vector of size 1000:

      48.9+-22.6us: np.dot(v,v)
     628.8+-87.3us: np.sum(np.absolute(v))
     518.3+-65.9us: np.sum(v*v)
     118.0+-1.4us: v*v
     234.4+-56.5us: np.absolute(v)
     234.3+-56.2us: v.sum()
    1175.2+-58.6us: v.std()
     969.6+-77.4us: vv=np.mean(v)-v;(np.sqrt(np.dot(vv,vv)/(len(v)-1)))

The code for timing the numpy operations:

#################
import timeit
import numpy as np

# Setup code, run once per timer: build the test vector.
p = """
import numpy as np
vsize=1000
v=np.random.randn(vsize)
"""

# Statements to time.
vs = [
    'np.dot(v,v)',
    'np.sum(np.absolute(v))',
    'np.sum(v*v)',
    'v*v',
    'np.absolute(v)',
    'v.sum()',
    'v.std()',
    'vv=np.mean(v)-v;(np.sqrt(np.dot(vv,vv)/(len(v)-1)))']

ntests = 1000  # calls per timing run
tries = 5      # timing runs per statement

for s in vs:
    t = timeit.Timer(s, p)
    # Seconds per call, converted to microseconds (1 s = 1e6 us).
    r = (np.array(t.repeat(tries, ntests)) / ntests) * 1.0e6
    print("%10.1f+-%3.1fus: %s " % (np.mean(r), np.std(r), s))
#################

If someone has a complete benchmarking set of functions, I would like to use them.

Thanks.

Hugo Gamboa
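Before comparing timings, it may help to confirm what the timed expressions actually compute. The sketch below assumes a fixed seed and the variable names `mean_abs`, `rms` and `manual_std`, none of which appear in the original post. It checks that dot(v,v) matches sum(v*v), that sum(|x|)/N and sqrt(sum(x**2))/sqrt(N) are different quantities (mean absolute value versus RMS), and that the hand-written standard deviation divides by N-1 while v.std() divides by N by default.

```python
import numpy as np

# Fixed seed so the run is repeatable (an assumption, not in the original post).
rng = np.random.RandomState(0)
v = rng.randn(1000)
n = len(v)

# The two candidate measures from the question.
mean_abs = np.sum(np.absolute(v)) / n      # sum(|x|)/N
rms = np.sqrt(np.dot(v, v)) / np.sqrt(n)   # sqrt(sum(x**2))/sqrt(N)

# dot(v, v) and sum(v*v) give the same sum of squares up to rounding.
assert np.isclose(np.dot(v, v), np.sum(v * v))

# The hand-written expression uses N-1 in the denominator, which
# corresponds to ddof=1; the default v.std() uses ddof=0.
vv = np.mean(v) - v
manual_std = np.sqrt(np.dot(vv, vv) / (n - 1))
assert np.isclose(manual_std, v.std(ddof=1))

print("mean |x|      :", mean_abs)
print("RMS           :", rms)
print("std (ddof=1)  :", manual_std)
print("std (default) :", v.std())
```

Since the last two timed statements use different denominators, the timing comparison between v.std() and the hand-written version is between slightly different estimators; passing ddof=1 to std would make the results match.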