Hello,

Could you please give me some hints about how to mask an array using another array, as in the following example?

In [14]: a = arange(5)

In [15]: a
Out[15]: array([0, 1, 2, 3, 4])

and my secondary array is "b"

In [16]: b = array([2,3])

What I want to do is to mask a with b values and get an array of:

array([False, False, True, True, False], dtype=bool)

That is just a manually created array. I still don't know how to do this programmatically in a Pythonic fashion or with numpy's masked array functions.

Thank you.

Gökhan
Yes Pierre,

I like this kind of one-line elegance in Python a lot. I was thinking that the answer lay somewhere in masked array operations, but I was proved wrong.

Thanks for your input on this small riddle.

Here is another way of doing that. (That's what I thought of initially and what Matthias Michler responded on the matplotlib mailing list.)

    mask = zeros(len(a), dtype=bool)
    for index in range(len(a)):    # run through array a
        if a[index] in b:
            mask[index] = True

Ending with a quote about Pythonicness :)

"...that something is Pythonic when it has a sense of quality, simplicity, clarity and elegance about it."

Gökhan

On Wed, Apr 22, 2009 at 4:49 PM, Pierre GM <pgmdevlist@gmail.com> wrote:
On Apr 22, 2009, at 5:21 PM, Gökhan SEVER wrote:
Hello,
Could you please give me some hints about how to mask an array using another array, as in the following example?
What about that?

numpy.logical_or.reduce([a==i for i in b])
_______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
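Pierre's one-liner can be unpacked step by step. The following is a minimal sketch, assuming a current NumPy under Python 3; the names per_value and mask are illustrative and do not come from the thread.

```python
import numpy as np

a = np.arange(5)
b = np.array([2, 3])

# One boolean array per value in b, each the same length as a.
per_value = [a == value for value in b]

# OR the per-value arrays together, element by element.
mask = np.logical_or.reduce(per_value)

print(mask.tolist())  # [False, False, True, True, False]
```

The list comprehension creates len(b) temporary arrays, which is worth keeping in mind when b is long.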
On Wed, Apr 22, 2009 at 8:18 PM, Gökhan SEVER <gokhansever@gmail.com> wrote:
On Wed, Apr 22, 2009 at 4:49 PM, Pierre GM <pgmdevlist@gmail.com> wrote:
What about that?

numpy.logical_or.reduce([a==i for i in b])
I prefer broadcasting to list comprehension in numpy:

>>> a = np.arange(5)
>>> b = np.array([2,3])

>>> (a[:,np.newaxis]==b).any(1)
array([False, False, True, True, False], dtype=bool)
Josef
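To make the broadcasting step visible, here is a small sketch that keeps the intermediate comparison table in a variable. The name table is illustrative; nothing else is assumed beyond NumPy itself.

```python
import numpy as np

a = np.arange(5)
b = np.array([2, 3])

# a[:, np.newaxis] has shape (5, 1); comparing it with b (shape (2,))
# broadcasts to a (5, 2) table whose entry [i, j] is a[i] == b[j].
table = a[:, np.newaxis] == b
print(table.shape)  # (5, 2)

# A row contains a True exactly when a[i] equals some value in b.
mask = table.any(axis=1)
print(mask.tolist())  # [False, False, True, True, False]
```

The table holds len(a) * len(b) booleans, so its size grows with both arrays.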
On Wed, Apr 22, 2009 at 10:45 PM, Pierre GM <pgmdevlist@gmail.com> wrote:
On Apr 22, 2009, at 9:03 PM, josef.pktd@gmail.com wrote:
I prefer broadcasting to list comprehension in numpy:
Pretty neat! I still don't have the broadcasting reflex. Now, any idea which one is more efficient in terms of speed? In terms of temporaries?
I used similar broadcasting for working with categorical data series and for creating dummy variables for regression, so I have already played with this for some time.

In this case, I would expect that the memory consumption is essentially the same: you have a list of arrays and I have a 2d array, unless numpy needs an additional conversion to array in np.logical_or.reduce, which seems plausible but I don't know.

The main point that Sturla convinced me of in the discussion on kendalltau is that if b is large, 500 or 1000, then building the full intermediate boolean array kills both memory and speed performance compared to a Python for loop, and is very bad compared to a Cython loop.

In this example my version is at least twice as fast for len(b) = 4. Your version does not scale very well at all to larger b: yours takes 7 times as long as mine for len(b) = 400, which, I guess, would mean that you have an extra copying step.

I added the for loop and it is always the fastest, even more so for short b. I hope it's correct; I have never used an inplace logical operator.

Josef

    import numpy as np
    from time import time as time_

    a = np.array(list(range(10))*1000)
    blen = 10  # or 100
    b = np.array([2,3,5,8]*blen)
    print("shape b", b.shape)

    # broadcasting
    t = time_()
    for _ in range(100):
        (a[:,np.newaxis]==b).any(1)
    print(time_() - t)

    # list comprehension with logical_or.reduce
    t = time_()
    for _ in range(100):
        np.logical_or.reduce([a==i for i in b])
    print(time_() - t)

    # for loop with inplace or
    t = time_()
    for _ in range(100):
        z = a == b[0]
        for ii in range(1,len(b)):
            z |= (a == b[ii])
    print(time_() - t)

    # recorded output, in seconds, in the order of the loops above
    #shape b (80,)
    #0.110000133514
    #0.266000032425
    #shape b (80,)
    #0.827999830246
    #5.2650001049
    #shape b (400,)
    #4.60899996758
    #28.4370000362
    #shape b (400,)
    #3.89100003242
    #27.5
    #shape b (400,)
    #3.89099979401
    #27.3289999962
    #3.51599979401 #for loop
    #shape b (40,)
    #0.453999996185
    #2.54600000381
    #0.359999895096 #for loop
    #shape b (4,)
    #0.108999967575
    #0.28200006485
    #0.0309998989105 #for loop
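The comparison above can be rerun with the standard timeit module. This is a rough sketch rather than a benchmark: the three function names are made up for illustration, the array sizes mirror the post, and the numbers will vary with machine and NumPy version, so no timings are quoted here.

```python
import timeit

import numpy as np


def mask_broadcast(a, b):
    # Builds a (len(a), len(b)) boolean table, then collapses each row.
    return (a[:, np.newaxis] == b).any(axis=1)


def mask_reduce(a, b):
    # Builds len(b) boolean arrays in a list, then ORs them together.
    return np.logical_or.reduce([a == value for value in b])


def mask_loop(a, b):
    # Keeps a single boolean result and updates it in place.
    # Assumes b has at least one element.
    z = a == b[0]
    for value in b[1:]:
        z |= (a == value)
    return z


a = np.tile(np.arange(10), 1000)         # 10000 elements
b = np.tile(np.array([2, 3, 5, 8]), 10)  # 40 elements, with repeats

# All three approaches should agree before timing them.
expected = mask_loop(a, b)
assert np.array_equal(mask_broadcast(a, b), expected)
assert np.array_equal(mask_reduce(a, b), expected)

for func in (mask_broadcast, mask_reduce, mask_loop):
    seconds = timeit.timeit(lambda: func(a, b), number=10)
    print(func.__name__, seconds)

# Size in bytes of the intermediate table used by the broadcasting version.
print("broadcast table bytes:", a.size * b.size)
```

Changing the repeat count in np.tile for b is the simplest way to see how each version scales with len(b).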
On Wed, Apr 22, 2009 at 04:21:05PM -0500, Gökhan SEVER wrote:
Could you please give me some hints about how to mask an array using another array, as in the following example?
In [14]: a = arange(5)
In [15]: a Out[15]: array([0, 1, 2, 3, 4])
and my secondary array is "b"
In [16]: b = array([2,3])
What I want to do is to mask a with b values and get an array of:
array([False, False, True, True, False], dtype=bool)
This is an operation on 'sets': you are testing if members of a are 'in' b. Generally, set operations on arrays can be found in numpy.lib.arraysetops. I believe what you are interested in is setmember1d.

HTH,

Gaël
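setmember1d is not available in current NumPy releases. To my knowledge the same membership test is now provided by np.isin (NumPy 1.13 and later), which also accepts repeated values. A minimal sketch under that assumption follows; the np.ma.masked_array step is added only to show how the boolean result could feed a masked array, as the original question mentioned.

```python
import numpy as np

a = np.arange(5)
b = np.array([2, 3])

# True wherever an element of a is also present in b.
mask = np.isin(a, b)
print(mask.tolist())  # [False, False, True, True, False]

# Optional: hide the matching entries in a masked array.
masked = np.ma.masked_array(a, mask=mask)
print(masked.compressed().tolist())  # [0, 1, 4]
```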
Ahaa,

Thanks Gaël. That method is more elegant than the previous inputs, and the simplest of all.

Although one line of "import this" says:

There should be one-- and preferably only one --obvious way to do it.

I always find many different ways of implementing ideas in the Python world.

Gökhan
On Thu, Apr 23, 2009 at 1:24 AM, Gökhan SEVER <gokhansever@gmail.com> wrote:
On Thu, Apr 23, 2009 at 12:16 AM, Gael Varoquaux <gael.varoquaux@normalesup.org> wrote:
This is an operation on 'sets': you are testing if members of a are 'in' b. Generally, set operations on arrays can be found in numpy.lib.arraysetops. I believe what you are interested in is setmember1d.
HTH,
Gaël
setmember1d is very fast compared to the other solutions for large b.

However, setmember1d requires that both arrays only have unique elements.

So it doesn't work if, for example, your first array is a data vector with membership in different groups (therefore not only uniques), and the second array is the subgroup that you want to check.

Josef

>>> a = np.array([7, 5, 1, 3, 3, 0, 0, 8, 8, 2])
>>> b = np.array([0, 1])
>>> np.setmember1d(a,b)
array([False, False, True, True, False, True, True, True, False, False], dtype=bool)

>>> (a[:,np.newaxis]==b).any(1)
array([False, False, True, False, False, True, True, False, False, False], dtype=bool)
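One way around the uniqueness requirement is to reduce a to its unique values first, test membership on those, and then expand the result back to the original positions. The sketch below illustrates that idea with np.unique and return_inverse; it is an assumption about how such a workaround could look, not a description of any particular NumPy patch, and the variable names are illustrative.

```python
import numpy as np

a = np.array([7, 5, 1, 3, 3, 0, 0, 8, 8, 2])
b = np.array([0, 1])

# unique_a holds each distinct value once; inverse maps every position
# of a to the index of its value in unique_a.
unique_a, inverse = np.unique(a, return_inverse=True)

# Membership test on the (now unique) values only.
unique_mask = (unique_a[:, np.newaxis] == b).any(axis=1)

# Expand back so repeated values in a get the same answer.
mask = unique_mask[inverse]
print(mask.tolist())
# [False, False, True, False, False, True, True, False, False, False]
```

When a contains many repeats, the membership test runs on far fewer values than a has elements.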
<josef.pktd <at> gmail.com> writes:
setmember1d is very fast compared to the other solutions for large b.
However, setmember1d requires that both arrays only have unique elements.
So it doesn't work if, for example, your first array is a data vector with membership in different groups (therefore not only uniques), and the second array is the subgroup that you want to check.
Note there's a patch waiting to be reviewed that adds another version of setmember_1d for non-unique inputs.

http://projects.scipy.org/numpy/ticket/1036

Neil
I suspect I am trying to do something similar... I would like to create a mask where I have data. In essence, I need to return True where x,y is equal to lon,lat...

I suppose a setmember solution may somehow be more elegant, but this is what I've worked up for now... suggestions?

    def genDataMask(x, y, xbounds=(-180,180), ybounds=(-90,90), res=(0.5,0.5)):
        """ generate a data mask
        no data = False
        data = True
        """
        xy = np.column_stack((x,y))
        newx = np.arange(xbounds[0],xbounds[1],res[0])
        newy = np.arange(ybounds[0],ybounds[1],res[1])
        # create datamask
        dm = np.empty((len(newx),len(newy)))  # shape must be passed as a tuple
        dm.fill(np.nan)
        for _xy in xy:
            # exact comparison: only points lying exactly on a grid node match
            dm[np.where(_xy[0]==newx),np.where(_xy[1]==newy)] = True
        return dm
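A possible vectorized variant computes grid indices arithmetically instead of comparing floats for equality. This is only a sketch under stated assumptions: the function name gen_data_mask is hypothetical, points are assumed to lie on or near grid nodes so that rounding to the nearest node is the intended matching rule, and the result is a boolean array rather than one filled with NaN.

```python
import numpy as np


def gen_data_mask(x, y, xbounds=(-180, 180), ybounds=(-90, 90), res=(0.5, 0.5)):
    """Return a boolean grid that is True at nodes that have a data point."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Grid dimensions, matching np.arange(lower, upper, step).
    nx = int(round((xbounds[1] - xbounds[0]) / res[0]))
    ny = int(round((ybounds[1] - ybounds[0]) / res[1]))

    # Index of the nearest grid node for every point.
    ix = np.rint((x - xbounds[0]) / res[0]).astype(int)
    iy = np.rint((y - ybounds[0]) / res[1]).astype(int)

    # Ignore points that fall outside the grid.
    inside = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)

    mask = np.zeros((nx, ny), dtype=bool)
    mask[ix[inside], iy[inside]] = True
    return mask


lon = [-180.0, 0.0, 179.5, 180.0]   # the last point lies outside the grid
lat = [-90.0, 0.0, 89.5, 0.0]
dm = gen_data_mask(lon, lat)
print(dm.shape, int(dm.sum()))  # (720, 360) 3
```

Rounding tolerates small floating-point differences between the data coordinates and the grid, which an exact == comparison does not.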
participants (6)
- Gael Varoquaux
- Gökhan SEVER
- John [H2O]
- josef.pktd@gmail.com
- Neil Crighton
- Pierre GM