Indexing in Numpy vs. IDL?
Hi all,

I'm fairly new to NumPy and I've been trying to port over some IDL code to become more familiar with it. I've been moderately successful using numpy.where and numpy.compress to do some of the things that were pretty easy to do in IDL. I'm a bit confused about how the indexing of arrays works, though.

This is pretty straightforward:

in IDL
=============================
data = [50.00, 100.00, 150.00, 200.00, 250.00, 300.00, 350.00]
index = WHERE((data GT 100.00) AND (data LT 300.00))
new_data = data[index]
print, new_data
150.000 200.000 250.000

in Python
==============================
import numpy
from numpy import *
data = [50.00, 100.00, 150.00, 200.00, 250.00, 300.00, 350.00]
data = array(data, dtype=float32)  # Convert list to array
index_mask = numpy.where((data > 100.00) & (data < 300.00), 1, 0)  # Test for the condition.
index_one = numpy.compress(index_mask, data)
print(index_one)
[ 150. 200. 250.]

But I'm having a bit of trouble with the Python equivalent of this:

in IDL:
=============================
index_two = WHERE((data[index_one] GT bottom) AND (data[index_one] LE top))

and also this:

result = MAX(data[index_one[index_two]])

From what I've read, it looks like numpy.take() might work to do the indexing. I've tried to test this, but I'm not getting the answers I'd expect. Am I overlooking something obvious here?

Thanks in advance for any responses.
On Sun, Nov 16, 2008 at 16:15, Jason Woolard wrote:
> hi all,
>
> I'm fairly new to Numpy and I've been trying to port over some IDL code to become more familiar. I've been moderately successful with numpy.where and numpy.compress to do some of the things that were pretty easy to do in IDL. I'm a bit confused about how the indexing of arrays works though.

You may want to look at this section of the reference manual:
http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html

> This is pretty straightforward:
>
> in IDL
> =============================
> data = [50.00, 100.00, 150.00, 200.00, 250.00, 300.00, 350.00]
> index = WHERE((data GT 100.00) AND (data LT 300.00))
> new_data = data[index]
> print, new_data
> 150.000 200.000 250.000
>
> in Python
> ==============================
> import numpy
> from numpy import *
> data = [50.00, 100.00, 150.00, 200.00, 250.00, 300.00, 350.00]
> data = array(data, dtype=float32)  # Convert list to array
> index_mask = numpy.where((data > 100.00) & (data < 300.00), 1, 0)

index_mask = (data > 100.) & (data < 300.0)

> # Test for the condition.
> index_one = numpy.compress(index_mask, data)

index_one = data[index_mask]

> print(index_one)
> [ 150. 200. 250.]

Note that this is not an index; it is equivalent to new_data in your IDL example. Are you sure you wanted to call this index_one?

> But I'm having a bit of trouble with the Python equivalent of this:
>
> in IDL:
> =============================
> index_two = WHERE((data[index_one] GT bottom) AND (data[index_one] LE top))

Well, since index_one holds values rather than indices, data[index_one] doesn't mean anything. Can you show us the equivalent IDL that generates index_one?

> and also this:
>
> result = MAX(data[index_one[index_two]])

Ditto.

> From what I've read it looks like numpy.take() might work to do the indexing. I've tried to test this but I'm not getting the answers I'd expect. Am I overlooking something obvious here?

So-called "advanced indexing", where the indices are boolean or integer arrays, will probably solve your problem, but we need more information on what you mean by index_one.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco
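For readers following along, here is a minimal sketch of how the nested IDL idiom data[index_one[index_two]] could be written with integer-array indexing. It assumes index_one was meant to hold the indices returned by the first WHERE (the original post does not show this), and the values of bottom and top are hypothetical, chosen only for illustration.

```python
import numpy as np

data = np.array([50.0, 100.0, 150.0, 200.0, 250.0, 300.0, 350.0], dtype=np.float32)

# Integer indices of the matching elements, analogous to IDL's WHERE(...).
# Assumption: this is what index_one was intended to contain.
index_one = np.flatnonzero((data > 100.0) & (data < 300.0))  # indices 2, 3, 4

# Hypothetical bounds; the original post does not give values for these.
bottom, top = 150.0, 250.0

# Second-level indices are positions within the first selection.
subset = data[index_one]
index_two = np.flatnonzero((subset > bottom) & (subset <= top))  # positions 1, 2

# Chained integer indexing, mirroring MAX(data[index_one[index_two]]).
result = data[index_one[index_two]].max()

# The same answer with a single combined boolean mask and no index arrays.
mask = (data > 100.0) & (data < 300.0) & (data > bottom) & (data <= top)
result_mask = data[mask].max()

print(result, result_mask)
```

If the intermediate indices are not needed for anything else, the combined mask is usually the shorter route; the index arrays are only worth keeping when the positions themselves matter.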
On Sun, Nov 16, 2008 at 17:30, Robert Kern wrote:
> So-called "advanced indexing" where the indices are boolean or integer arrays will probably solve your problem, but we need more information on what you mean by index_one.

Sorry, I left out a sentence or two here. I also meant to say that take() and compress() are legacy functions mostly subsumed by advanced indexing. Also, instead of doing things like where(some_boolean_array, 1, 0), just use some_boolean_array.

--
Robert Kern
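As an illustration of the point about legacy functions, the sketch below compares compress() and take() with the equivalent advanced-indexing expressions on the data from the original post. The variable names are illustrative only. It also shows one pitfall that seems worth knowing: a 0/1 integer array produced by where(mask, 1, 0) is treated as integer indices, not as a mask, when used inside square brackets.

```python
import numpy as np

data = np.array([50.0, 100.0, 150.0, 200.0, 250.0, 300.0, 350.0], dtype=np.float32)
mask = (data > 100.0) & (data < 300.0)  # boolean array, same shape as data

# Legacy route from the original post: 0/1 integers fed to compress().
int_mask = np.where(mask, 1, 0)
via_compress = np.compress(int_mask, data)

# Advanced indexing with the boolean mask gives the same values.
via_mask = data[mask]

# take() with integer indices versus plain integer-array indexing.
indices = np.flatnonzero(mask)
via_take = np.take(data, indices)
via_int_index = data[indices]

# Pitfall: int_mask is an integer array, so this picks data[0] and data[1]
# repeatedly instead of filtering.
pitfall = data[int_mask]

print(via_compress, via_mask, via_take, via_int_index)
print(pitfall)
```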
Hi Jason,

I too am a former IDLer. There is a slight paradigm shift here. In IDL you can index an array with another array of integer indices, and you can do that in numpy too. But numpy also lets you index an array with an array of booleans. The expression data > 100. creates an array of booleans with the same size and shape as data, so you can write your new array as

    data[(data > 100.) & (data < 300.)]

Note that we don't use a "where" function. In numpy, "where" is a completely different thing than in IDL. If you really want to generate a list of indices, you can use the "nonzero" method, but the numpy book says this isn't as fast as boolean indexing.

HTH,
Paul Probert
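A short sketch of the two routes described above, using the data from the original post. One nuance worth adding: as far as NumPy's documented behavior goes, calling numpy.where with a single argument returns the same tuple of index arrays as the nonzero method, so it is the three-argument form that differs most from IDL's WHERE. No timing comparison is attempted here.

```python
import numpy as np

data = np.array([50.0, 100.0, 150.0, 200.0, 250.0, 300.0, 350.0], dtype=np.float32)

# Comparison operators return a boolean array shaped like data.
mask = (data > 100.0) & (data < 300.0)

# Boolean indexing: no index list is ever built by hand.
selected = data[mask]

# nonzero() returns a tuple with one index array per dimension;
# for 1-D data that is a one-element tuple.
(idx,) = mask.nonzero()

# Single-argument where() yields the same indices.
(idx_where,) = np.where(mask)

# Indexing with the integer indices selects the same elements as the mask.
same = np.array_equal(data[idx], selected)

print(selected, idx, idx_where, same)
```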
Participants (3): Jason Woolard, Paul Probert, Robert Kern