[Numpy-discussion] Speed bottlenecks on simple tasks - suggested improvement

Raul Cota raul at virtualmaterials.com
Mon Dec 3 17:12:57 EST 2012


Chris,

thanks for the feedback.

FYI, the minor changes I talked about yield different speedups 
depending on the scenario, e.g.:

1) Array * Array

from numpy import array

point = array([2.0, 3.0])
scale = array([2.4, 0.9])

retVal = point * scale
# The line above runs 1.1 times faster with my new code (but it runs 3
# times faster in Numeric in Python 2.2), i.e. pretty meaningless,
# and still far from old Numeric.

2) Array * Tuple (item by item)

point = array([2.0, 3.0])
scale = (2.4, 0.9)

retVal = point[0] * scale[0], point[1] * scale[1]
# The line above runs 1.8 times faster with my new code (but it runs 6.8
# times faster in Numeric in Python 2.2), i.e. a pretty decent speedup,
# but still quite far from old Numeric.
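
For anyone who wants to reproduce these ratios, here is a minimal 
timing harness along these lines (a sketch only; the loop count is 
arbitrary and the absolute times depend on your machine and build):

import timeit

setup = """
from numpy import array
point = array([2.0, 3.0])
scale = (2.4, 0.9)
scale_arr = array(scale)
"""

# Scenario (1): array * array
print(timeit.timeit("point * scale_arr", setup=setup, number=100000))
# Scenario (2): element by element against a plain tuple
print(timeit.timeit("point[0] * scale[0], point[1] * scale[1]",
                    setup=setup, number=100000))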


I am not saying that I would ever write something exactly like (2) in 
my code, nor am I saying that the changes in NumPy vs. Numeric are not 
beneficial. My point is that for small problems, particularly anything 
involving scalars, performance is fairly far from what it used to be 
in Numeric, and that is problematic, at least for me.
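
To make the scalar case concrete, here is a minimal sketch comparing a 
plain Python float multiply against a NumPy float64 scalar multiply 
(the exact numbers are machine-dependent, but the ratio between the 
two shows the per-operation overhead I mean):

import timeit

setup = "import numpy as np; a = 2.0; b = 3.0; na = np.float64(a); nb = np.float64(b)"

# Plain Python floats vs. NumPy float64 scalars; the gap between the
# two times is roughly the per-operation overhead for the scalar case.
print(timeit.timeit("a * b", setup=setup, number=1000000))
print(timeit.timeit("na * nb", setup=setup, number=1000000))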


I am currently looking around to see if there are practical ways to 
speed things up without slowing anything else down. Will keep you posted.


regards,

Raul


On 03/12/2012 12:49 PM, Chris Barker - NOAA Federal wrote:
> Raul,
>
> Thanks for doing this work -- both the profiling and actual
> suggestions for how to improve the code -- whoo hoo!
>
> In general, it seems that numpy performance for scalars and very small
> arrays (i.e. (2,), (3,), maybe (3,3), the kind of thing that you'd use
> to hold a coordinate point or the like, not small as in "fits in
> cache") is pretty slow. In principle, a basic array scalar operation
> could be as fast as a Python native numeric type, and it would be great
> if small array operations were, too.
>
> It may be that the route to those performance improvements is
> special-case code, which is ugly, but I think it could really be worth
> it for the common types and operations.
>
> I'm really out of my depth for suggesting (or contributing) actual
> solutions, but +1 for the idea!
>
> -Chris
>
> NOTE: Here's an example of what I'm talking about -- say you are
> scaling an (x, y) point by an (s_x, s_y) scale factor:
>
> def numpy_version(point, scale):
>      return point * scale
>
>
> def tuple_version(point, scale):
>      return (point[0] * scale[0], point[1] * scale[1])
>
>
> In [36]: point_arr, scale_arr
> Out[36]: (array([ 3.,  5.]), array([ 2.,  3.]))
>
> In [37]: timeit tuple_version(point, scale)
> 1000000 loops, best of 3: 397 ns per loop
>
> In [38]: timeit numpy_version(point_arr, scale_arr)
> 100000 loops, best of 3: 2.32 us per loop
>
> It would be great if numpy could get closer to tuple performance for
> this sort of thing...
>
>
> -Chris
>
>



