numpy where function on different sized arrays
On Sat, Nov 24, 2012 at 3:36 PM, Siegfried Gonzi wrote:

Hi all

This must have been answered in the past but my google search capabilities are not the best.

Given an array A, say of dimension 40x60, and another array/vector B of dimension 20 (the values in B occur only once).

What I would like to do is the following, which of course does not work (by the way, it doesn't work in IDL either):

    indx=where(A == B)

I understand A and B are of different dimensions. So my question: what would be the fastest or proper way to accomplish this? (I found a solution but think it is rather awkward and not very scipy/numpy-tonic, though.)

Thanks

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
On Sat, Nov 24, 2012 at 10:19 PM, David Warde-Farley wrote:

    M = A[..., np.newaxis] == B

will give you a 40x60x20 boolean 3d-array where M[..., i] gives you a
boolean mask for all the occurrences of B[i] in A.
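As a small illustration of the broadcasting step, here is a sketch on made-up 2x3 inputs (not the OP's data). It also shows np.isin, which was not mentioned in the thread but should give the same "is this element anywhere in B" mask without building the 3d intermediate.

```python
import numpy as np

# Tiny made-up inputs, standing in for the 40x60 array and length-20 vector.
A = np.array([[1, 2, 3],
              [3, 1, 5]])
B = np.array([1, 3, 7])

# Broadcasting a (2, 3, 1) array against a (3,) array gives shape (2, 3, 3).
M = A[..., np.newaxis] == B

# Collapse the B axis: True wherever A[i, j] equals any value in B.
in_B = M.any(axis=-1)

# np.isin builds the same membership mask directly.
assert np.array_equal(in_B, np.isin(A, B))

# Row and column indices of the matching elements of A.
rows, cols = np.where(in_B)
```

Here rows and cols say where A holds some value from B, but not which one; M keeps that information along its last axis.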
If you wanted all the (i, j) pairs for each value in B, you could do
something like
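    import numpy as np
    from itertools import groupby
    from operator import itemgetter

    # Indices (i, j, k) such that A[i, j] == B[k]
    id1, id2, id3 = np.where(A[..., np.newaxis] == B)
    # Sort by k so that groupby sees each k as one contiguous run
    order = np.argsort(id3)
    # Python 3: the built-in zip is lazy and replaces itertools.izip
    triples_iter = zip(id3[order], id1[order], id2[order])
    grouped = groupby(triples_iter, itemgetter(0))
    # Key the result by the value B[k] rather than by the index k
    d = dict((B[k], [idx[1:] for idx in indices]) for k, indices in grouped)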
Then d[value] is a list of all the (i, j) pairs where A[i, j] == value, and
the keys of d are the values in B that occur in A.
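A more direct way to build this kind of value-to-positions mapping, sketched here on made-up inputs, is to loop over B and call np.argwhere once per value. This is not what was posted in the thread; it makes one pass over A per value in B, so it may well be slower for a long B, and the name pairs is only illustrative.

```python
import numpy as np

# Tiny made-up inputs for illustration (not the OP's data).
A = np.array([[1, 2, 3],
              [3, 1, 5]])
B = np.array([1, 3, 7])

# np.argwhere returns an (n, 2) array whose rows are the (i, j) positions
# where the condition holds; tolist() converts them to plain Python ints.
pairs = {
    int(b): [tuple(ij) for ij in np.argwhere(A == b).tolist()]
    for b in B
}
```

Unlike the groupby version, this keeps a key for every value in B, with an empty list for values that never appear in A.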
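On Sat, Nov 24, 2012 at 5:23 PM, Daπid wrote:

A pure Python approach could be:

    idx = []  # collected (i, j) positions; a and b are the arrays A and B
    for i, x in enumerate(a):
        for j, y in enumerate(x):
            if y in b:
                idx.append((i,j))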
Of course, it is slow if the arrays are large, but it is very
readable, and probably very fast if cythonised.
David.
On Sat, Nov 24, 2012 at 7:08 PM, David Warde-Farley wrote:

I think that would lose information as to which value in B was at each
position. I think you want:
(premature send, stupid Gmail...)

    idx = {}
    for i, x in enumerate(A):
        for j, y in enumerate(x):
            if y in B:
                idx.setdefault(y, []).append((i,j))

On the problem size the OP specified, this is about 4x slower than the NumPy version I posted above. However, with a small modification:

    idx = {}
    set_b = set(B)  # makes 'if y in B' lookups much faster
    for i, x in enumerate(A):
        for j, y in enumerate(x):
            if y in set_b:
                idx.setdefault(y, []).append((i,j))

it actually beats my solution. With inputs:

    np.random.seed(0); A = np.random.random_integers(40, 59, size=(40, 60)); B = np.arange(40, 60)

    In [115]: timeit foo_py_orig(A, B)
    100 loops, best of 3: 16.5 ms per loop

    In [116]: timeit foo_py(A, B)
    100 loops, best of 3: 2.5 ms per loop

    In [117]: timeit foo_numpy(A, B)
    100 loops, best of 3: 4.15 ms per loop

Depending on the specifics of the inputs, a collections.defaultdict could also help things.
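For reference, the collections.defaultdict variant suggested above might look roughly like the following. This is only a sketch: the function name positions_by_value and the tiny inputs are made up for illustration, converting to Python lists with tolist() is an extra assumption of mine, and no timing claim is made for it.

```python
from collections import defaultdict

import numpy as np


def positions_by_value(A, B):
    """Map each value of B that occurs in A to its list of (i, j) positions."""
    wanted = set(B.tolist())      # set of plain Python ints for fast membership tests
    idx = defaultdict(list)       # missing keys start out as empty lists
    for i, row in enumerate(A.tolist()):
        for j, y in enumerate(row):
            if y in wanted:
                idx[y].append((i, j))
    return dict(idx)              # hand back an ordinary dict


# Tiny made-up inputs (not the benchmark data from the thread).
A = np.array([[1, 2, 3],
              [3, 1, 5]])
B = np.array([1, 3, 7])
result = positions_by_value(A, B)
```

Values of B that never occur in A (7 here) simply get no key in the result.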
Participants (3): David Warde-Farley, Daπid, Siegfried Gonzi