[Numpy-discussion] speeding up operations on small vectors

Tue Oct 11 07:12:34 EDT 2011

Here's a version that uses less Python loops and thus is faster. What still
takes time is the array creation (np.array(...)), I'm not sure exactly why.
It may be possible to speed it up.

def points_numpy(radius):
   rr = radius**2
   M = np.identity(2, dtype=int)
   x_y = np.array(list(itertools.product(xrange(-radius, radius + 1),
                                         xrange(-radius, radius + 1))))
   x_y_M = np.dot(x_y, M)
   is_inside = (x_y_M ** 2).sum(axis=1) < rr
   for point in x_y_M[is_inside]:
       yield point

-=- Olivier

2011/10/11 Christoph Groth <cwg at falma.de>

> Dear numpy experts,
>
> I could not find a satisfying solution to the following problem, so I
> thought I would ask:
>
> In one part of a large program I have to deal a lot with small (2d or
> 3d) vectors and matrices, performing simple linear algebra operations
> with them (dot products and matrix multiplications).  For several
> reasons I want the code to be in python.
>
> The most natural choice seemed to be to use numpy.  However, the
> constant time cost when dealing with numpy arrays seems to be immense,
> as demonstrated by this toy program:
>
> ****************************************************************
> import numpy as np
> from time import time
>
> def points_tuple(radius):
>    rr = radius**2
>    def inside(point):
>        return point[0]**2 + point[1]**2 < rr
>
>    M = ((1, 0), (0, 1))
>    for x in xrange(-radius, radius + 1):
>        for y in xrange(-radius, radius + 1):
>            point = (M[0][0] * x + M[0][1] * y,
>                     M[1][0] * x + M[1][1] * y)
>            if inside(point):
>                yield point
>
> def points_numpy(radius):
>    rr = radius**2
>    def inside(point):
>        return np.dot(point, point) < rr
>
>    M = np.identity(2, dtype=int)
>    for x in xrange(-radius, radius + 1):
>        for y in xrange(-radius, radius + 1):
>            point = np.dot(M, (x, y))
>            if inside(point):
>                yield point
>
> def main():
>    r = 200
>    for func in [points_tuple, points_numpy]:
>        t = time()
>        for point in func(r):
>            pass
>        print func.__name__, time() - t, 'seconds'
>
> if __name__ == '__main__':
>    main()
> ****************************************************************
>
> On my trusty old machine the output (python 2.6, numpy 1.5.1) is:
>
> points_tuple 0.36815404892 seconds
> points_numpy 6.20338392258 seconds
>
> I do not need C performance here, but the latter is definitely too slow.
>
> In the real program it's not so easy (but possible) to use tuples
> because the code is dimension-independent.  Before considering writing
> an own "small vector" module, I'd like about other possible solutions.
> Other people must have stumbled across this before!
>
> Thanks,
> Christoph
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20111011/40f298ff/attachment.html>