[Numpy-discussion] Keyword added to searchsorted.
Charles R Harris
charlesr.harris at gmail.com
Sat Sep 2 22:26:42 EDT 2006
Hi all,
I added the keyword side to the searchsorted method and functions.
Documentation:
"""a.searchsorted(values=v, side='left') -> array of indices.
Required Arguments:
v -- keys to be searched for in a.
Keyword arguments
side -- {'left', 'right'}, default('left').
If a is a 1-D array in ascending order, then
a.searchsorted(v, side='left')
returns an array of indices i such that for each element of values the
following holds:
a[j] < key <= a[i] for all j < i,
If such an index does not exist, a.size() is used. The result is such
that
if the key were to be inserted in the slot before the index i, then the
order of a would be preserved and i would be the smallest index with
that
property.
If a is a 1-D array in ascending order, then
a.searchsorted(v, side='right')
returns an array of indices i such that for each element of values the
following holds:
a[j] <= key < a[i] for all j < i,
If such an index does not exist, a.size() is used. The result is that if
the
key were to be inserted in the slot before the index i, then the order
of a
would be preserved and i would be the largest index with that property.
"""
I also replaced the old search routine as it was linear time worst case:
In [27]: t1 = t.Timer('N.searchsorted(a,1,side="left")','import numpy as N;
a = N.ones(1000000)')
In [28]: t2 = t.Timer('N.searchsorted(a,1,side="right")','import numpy as N;
a = N.ones(1000000)')
In [29]: t1.repeat(3,100)
Out[29]: [0.5301368236541748, 0.4924161434173584, 0.46317386627197266]
In [30]: t2.repeat(3,100)
Out[30]: [0.0011379718780517578, 0.00081586837768554688,
0.00083994865417480469]
where left was the original routine. It is now noticeably faster in some
situations:
In [2]: t1 = t.Timer('N.searchsorted(a,1,side="left")','import numpy as N; a
= N.ones(1000000)')
In [3]: t1.repeat(3,100)
Out[3]: [0.00082802772521972656, 0.00077795982360839844,
0.00076913833618164062]
I am open to suggestions as to better names for the keyword kind. It also
might be worth it to make type-specific routines if anyone commonly uses
large lists of keys and large sorted arrays.
Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20060902/19673625/attachment.html>
More information about the NumPy-Discussion
mailing list