[Numpy-discussion] Get the index of a comparison of two lists
Angus McMorland
amcmorl at gmail.com
Fri Feb 11 09:48:06 EST 2011
On 11 February 2011 09:01, FRENK Andreas <Andreas.FRENK at 3ds.com> wrote:
> Hi,
>
> I need to create a construct that returns the index of entries of the first
> list, if values in the first and second list are equal.
>
> Take
>
> valA = [1,2,3,4,20,21,22,23,24]
> valB = [1,2,3,4, 5,21,22,23]
> The correct solution is: [0,1,2,3,5,6,7]
>
> A potential loop can be:
> takeList = []
> for j, a in enumerate(valA):
>     if a in valB:
>         takeList.append(j)
>
> Please note, valA can have entries like [1,10000000,1000000001,…..], i.e. it
> can be very sparse.
> I also thought about using bincount, but due to the sparse nature the return
> values from bincount would allocate too much memory.
>
> Any idea how to do it fast using numpy?
This probably isn't optimal yet, but it performs much better than your
for loop for large arrays, although it is somewhat slower for very small ones.
In [11]: def test(a, b):
   ....:     takeList = []
   ....:     for j, A in enumerate(a):
   ....:         if A in b:
   ....:             takeList.append(j)
   ....:     return takeList
In [24]: a = np.random.randint(10, size=10)
In [25]: b = np.random.randint(10, size=10)
In [26]: %timeit test(a,b)
10000 loops, best of 3: 55.4 µs per loop
In [27]: %timeit np.arange(a.size)[np.lib.setmember1d(a,b)]
10000 loops, best of 3: 92.9 µs per loop
In [19]: a = np.random.randint(10000, size=10000)
In [20]: b = np.random.randint(10000, size=10000)
In [21]: %timeit np.arange(a.size)[np.lib.setmember1d(a,b)]
100 loops, best of 3: 7.99 ms per loop
In [22]: %timeit test(a,b)
10 loops, best of 3: 787 ms per loop
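For anyone reading this later: as far as I recall, np.lib.setmember1d assumed that both inputs contain only unique values, and it is no longer available in current NumPy releases. A minimal sketch of the same idea, assuming a NumPy version that provides np.isin (1.13 or later), is below. The helper name matching_indices and the sparse_a/sparse_b arrays are made up for illustration. Because the membership test works on the values themselves rather than on a table indexed by value, sparse inputs should not need memory proportional to the largest entry.

```python
import numpy as np

def matching_indices(a, b):
    """Return the indices i of a for which a[i] also occurs in b."""
    a = np.asarray(a)
    b = np.asarray(b)
    # np.isin builds a boolean mask over a (duplicates in a or b are fine);
    # np.flatnonzero turns that mask into integer indices.
    return np.flatnonzero(np.isin(a, b))

# The example from the original question.
valA = [1, 2, 3, 4, 20, 21, 22, 23, 24]
valB = [1, 2, 3, 4, 5, 21, 22, 23]
print(matching_indices(valA, valB).tolist())  # [0, 1, 2, 3, 5, 6, 7]

# Hypothetical sparse inputs with very large values.
sparse_a = np.array([1, 10000000, 1000000001])
sparse_b = np.array([1000000001, 7])
print(matching_indices(sparse_a, sparse_b).tolist())  # [2]
```

The timings above were measured with setmember1d, so they should not be read as timings for this variant.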
Hope that's useful,
Angus
--
AJC McMorland
Post-doctoral research fellow
Neurobiology, University of Pittsburgh