[Numpy-discussion] speed of numpy.ndarray compared to Numeric.array
Thomas.EMMEL at 3ds.com
Mon Jan 10 08:54:50 EST 2011
> > def bruteForceSearch(points, point):
> > minpt = min([(vec2Norm(pt, point), pt, i)
> > for i, pt in enumerate(points)], key=itemgetter(0))
> > return sqrt(minpt), minpt, minpt
> > def vec2Norm(pt1,pt2):
> > xDis = pt1-pt2
> > yDis = pt1-pt2
> > zDis = pt1-pt2
> > return xDis*xDis+yDis*yDis+zDis*zDis
> > I have a more clever method but it still takes a lot of time in the
> > If you like I can attach a running example.
> if you use the vec2Norm function as you wrote it there, this code is
> not vectorized at all, and as such of course numpy would be slowest as
> it has the most overhead and no advantages for non vectorized code,
> you simply can't write python code like that and expect it to be fast
> for these kind of calculations.
> Your function should look more like this:
> import numpy as np
> def bruteForceSearch(points, point):
> dists = points - point
> # that may need point[None,:] or such for broadcasting to work
> dists *= dists
> dists = dists.sum(1)
> I = np.argmin(dists)
> return sqrt(dists[I]), points[I], I
> If points is small, this may not help much (though compared to this
> exact code my guess is it probably would), if points is larger it
> should speed up things tremendously (unless you run into RAM
> problems). It may be that you need to fiddle around with axes, I did
> not check the code.
> If this is not good enough for you (you will need to port it (and
> maybe the next outer loop as well) to Cython or write it in C/C++ and
> make sure it can optimize things right. Also I think somewhere in
> scipy there were some distance tools that may be already in C and nice
> fast, but not sure.
> I hope I got this right and it helps,
I see the point and it was very helpful to understand the behavior of the arrays a bit better. And your attempt improved the bruteForceSearch which is up to 6 times faster.
But in case of a leaf in a kd-tree you end up with 50, 20, 10 or less points where the speed-up is reversed. In this particular case 34000 runs take 90s with your method and 50s with mine (not the bruteForce).
I see now the limits of the arrays but of course I see the chances and - coming back to my original question - it seems that Numeric arrays were faster for my kind of application but they might be slower for larger amounts of data.
This email and any attachments are intended solely for the use of the individual or entity to whom it is addressed and may be confidential and/or privileged. If you are not one of the named recipients or have received this email in error, (i) you should not read, disclose, or copy it, (ii) please notify sender of your receipt by reply email and delete this email and all attachments, (iii) Dassault Systemes does not accept or assume any liability or responsibility for any use of or reliance on this email.For other languages, go to http://www.3ds.com/terms/email-disclaimer.
More information about the NumPy-Discussion