looping and searching in numpy array
Heli
hemla21 at gmail.com
Thu Mar 10 11:48:48 EST 2016
On Thursday, March 10, 2016 at 2:02:57 PM UTC+1, Peter Otten wrote:
> Heli wrote:
>
> > Dear all,
> >
> > I need to loop over a numpy array and do the following search. It
> > takes almost 60 s for arrays (npArray1 and npArray2 in the example
> > below) with around 300K values.
> >
> >
> > for id in np.nditer(npArray1):
> >     newId = (np.where(npArray2 == id))[0][0]
> >
> >
> > Is there any way I can make the above faster? I need to run the script
> > above on much bigger arrays (50M). Please note that my two numpy arrays in
> > the lines above, npArray1 and npArray2 are not necessarily the same size,
> > but they are both 1d.
>
> You mean you are looking for the index of the first occurrence in npArray2
> for every value of npArray1?
>
> I don't know how to do this in numpy (I'm not an expert), but even basic
> Python might be acceptable:
>
> lookup = {}
> for i, v in enumerate(npArray2):
>     if v not in lookup:
>         lookup[v] = i
>
> for v in npArray1:
>     print(lookup.get(v, "<not found>"))
>
> That way you iterate once (in Python) instead of 2*len(npArray1) times (in
> C) over npArray2.
Dear Peter,
Thanks for your reply. This really helped. It reduced the script's run time from 61 s to 2 s.
I am still very interested in knowing the correct numpy way to do this, but till then your fix works great.
Thanks a lot,
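
One possible vectorized approach to the question above, offered as a sketch rather than the canonical numpy answer: np.unique with return_index=True gives the sorted distinct values of npArray2 together with the index of each value's first occurrence, and np.searchsorted then locates every value of npArray1 in that sorted array by binary search. The function name first_index_lookup and the -1 sentinel for values that are not found are assumptions made for illustration; neither comes from the thread.

```python
import numpy as np


def first_index_lookup(values, haystack, missing=-1):
    """For each element of `values`, return the index of its first
    occurrence in `haystack`, or `missing` if it does not occur.

    Hypothetical helper; the name and the `missing` sentinel are
    illustrative assumptions.
    """
    values = np.asarray(values)
    haystack = np.asarray(haystack)
    if haystack.size == 0:
        # Nothing can be found in an empty array.
        return np.full(values.shape, missing, dtype=np.intp)

    # Sorted distinct values, plus the index of the first occurrence
    # of each distinct value in the original `haystack`.
    uniq, first_idx = np.unique(haystack, return_index=True)

    # Binary search: position where each queried value would be inserted.
    pos = np.searchsorted(uniq, values)

    # A position equal to len(uniq) means "larger than everything";
    # clip it so it can still be used as an index.
    pos = np.minimum(pos, len(uniq) - 1)

    # A value is present only if the element at its position matches.
    found = uniq[pos] == values
    return np.where(found, first_idx[pos], missing)


npArray2 = np.array([5, 3, 5, 7, 3])
npArray1 = np.array([3, 7, 5, 9])
print(first_index_lookup(npArray1, npArray2).tolist())  # [1, 3, 0, -1]
```

The cost is dominated by the sort inside np.unique plus one binary search per element of npArray1, so no Python-level loop runs over either array. Whether this beats the dictionary version for a given input size would need to be measured.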