Performance issues with numarray.searchsorted()
![](https://secure.gravatar.com/avatar/81b3970c8247b2521d2f814de5b24475.jpg?s=120&d=mm&r=g)
Hi, I'm willing to use a lot the searchsorted function in numarray, but I'm a bit surprised about the poor performance of it. Look at that:
It may seem that Numeric is better optimised, but the standard python module bisect is even faster than numarray.searchsorted:
So, bisect performance is similar to that of Numeric searchsorted and both are almost an order of magnitude better than numarray searchsorted. This a little bit surprising as the bisect_left module is written in plain python. From the python 2.3.3 sources: def bisect_left(a, x, lo=0, hi=None): if hi is None: hi = len(a) while lo < hi: mid = (lo+hi)//2 if a[mid] < x: lo = mid+1 else: hi = mid return lo I'm using python2.3.3, numarray 0.9.1 (latest CVS) and Debian Linux (sid). Cheers, -- Francesc Alted
![](https://secure.gravatar.com/avatar/ec366db3649cf13f4061b519193849d6.jpg?s=120&d=mm&r=g)
Francesc Alted wrote:
A better timing (IMHO), but with similar conclusions: Python 2.3 (#1, Sep 13 2003, 00:49:11) [GCC 3.3 20030304 (Apple Computer, Inc. build 1495)] on darwin Type "help", "copyright", "credits" or "license" for more information.
[Apologies for the linewraps.] I also get similar results with double arrays. Weird. Python 2.3 on Mac OS X 10.3.sumthin', latest CVS checkout of numarray, Numeric 23.1. -- Robert Kern rkern@ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter
![](https://secure.gravatar.com/avatar/faf9400121dca9940496a7473b1d8179.jpg?s=120&d=mm&r=g)
On Thu, 2004-05-20 at 05:26, Robert Kern wrote:
benchmark i numarray (usec) Numeric (usec) numarray:Numeric searchsorted(a,p/5) 0.0 446 12 36.7 1.0 438 11 36.8 2.0 450 12 35.0 3.0 459 14 32.6 4.0 511 25 19.7 5.0 636 56 11.2 6.0 653 64 10.1 searchsorted(a,a) 0.0 285 5 48.0 1.0 283 7 39.6 2.0 291 29 9.8 3.0 368 308 1.2 4.0 1771 4120 0.4 5.0 17335 52127 0.3 6.0 201277 605787 0.3 p = 10**i a = arange(p) The first case agrees with your results of ~10x difference in favor of Numeric at 10**6 elements. The last case shows a ~3x numarray advantage given large numbers of values. My analysis is this: since searchsorted runs in O(log2(N)) time, even with 10**6 elements there are only 20 iterations or so. This is just not enough to overcome numarray's Python ufunc overhead. I'll see what I can do, but I think we're up against the standard numarray performance wall for small arrays. Regards, Todd
![](https://secure.gravatar.com/avatar/81b3970c8247b2521d2f814de5b24475.jpg?s=120&d=mm&r=g)
A Dijous 20 Maig 2004 19:13, vareu escriure:
I see. What about creating a special case for when the item to be searched is an scalar (and not an array)? In such a case, it would be theoretically possible to get at least the bisect.bisect() performance. Besides, if this special case would be implemented, users would be able to call repeatedly this scalar version of searchsorted() in order to deal with very small "values" arrays (len()<5) and still having a gain. Just a thought, -- Francesc Alted
![](https://secure.gravatar.com/avatar/ec366db3649cf13f4061b519193849d6.jpg?s=120&d=mm&r=g)
Francesc Alted wrote:
A better timing (IMHO), but with similar conclusions: Python 2.3 (#1, Sep 13 2003, 00:49:11) [GCC 3.3 20030304 (Apple Computer, Inc. build 1495)] on darwin Type "help", "copyright", "credits" or "license" for more information.
[Apologies for the linewraps.] I also get similar results with double arrays. Weird. Python 2.3 on Mac OS X 10.3.sumthin', latest CVS checkout of numarray, Numeric 23.1. -- Robert Kern rkern@ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter
![](https://secure.gravatar.com/avatar/faf9400121dca9940496a7473b1d8179.jpg?s=120&d=mm&r=g)
On Thu, 2004-05-20 at 05:26, Robert Kern wrote:
benchmark i numarray (usec) Numeric (usec) numarray:Numeric searchsorted(a,p/5) 0.0 446 12 36.7 1.0 438 11 36.8 2.0 450 12 35.0 3.0 459 14 32.6 4.0 511 25 19.7 5.0 636 56 11.2 6.0 653 64 10.1 searchsorted(a,a) 0.0 285 5 48.0 1.0 283 7 39.6 2.0 291 29 9.8 3.0 368 308 1.2 4.0 1771 4120 0.4 5.0 17335 52127 0.3 6.0 201277 605787 0.3 p = 10**i a = arange(p) The first case agrees with your results of ~10x difference in favor of Numeric at 10**6 elements. The last case shows a ~3x numarray advantage given large numbers of values. My analysis is this: since searchsorted runs in O(log2(N)) time, even with 10**6 elements there are only 20 iterations or so. This is just not enough to overcome numarray's Python ufunc overhead. I'll see what I can do, but I think we're up against the standard numarray performance wall for small arrays. Regards, Todd
![](https://secure.gravatar.com/avatar/81b3970c8247b2521d2f814de5b24475.jpg?s=120&d=mm&r=g)
A Dijous 20 Maig 2004 19:13, vareu escriure:
I see. What about creating a special case for when the item to be searched is an scalar (and not an array)? In such a case, it would be theoretically possible to get at least the bisect.bisect() performance. Besides, if this special case would be implemented, users would be able to call repeatedly this scalar version of searchsorted() in order to deal with very small "values" arrays (len()<5) and still having a gain. Just a thought, -- Francesc Alted
participants (3)
-
Francesc Alted
-
Robert Kern
-
Todd Miller