speeding up getting a subset of a data array
Hi

No doubt asked many times before so apologies....

I'm pulling a subset array out of a data array where I have a list of the indices I want (could be an array rather than a list actually - I have it in both).

Potentially the number of points and the number of times I do this can get very large so any saving in time is good.

So, paraphrasing what I've currently got.... say I have...

    subsetpointerlist = [0, 1, 2, 5, 8, 15, 25...]
    subsetsize = len(subsetpointerlist)
    subsetarray = zeros(subsetsize, dtype=float)
    for index, pos in enumerate(subsetpointerlist):
        subsetarray[index] = dataarray[pos]

How do I speed this up in numpy, i.e. by removing the for loop?

Do I set up some sort of a subsetpointerarray as a mask and then somehow apply that to dataarray to get the values into subsetarray?

Thanks
Brennan
On Mon, Aug 10, 2009 at 8:52 PM, Brennan Williams<brennan.williams@visualreservoir.com> wrote:
> [...]
> How do I speed this up in numpy, i.e. by removing the for loop?
It looks to me like

    subsetarray = dataarray[subsetpointerlist]

or, with type conversion,

    subsetarray = dataarray[subsetpointerlist].astype(float)

Josef
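A minimal sketch of this suggestion, assuming `dataarray` is a 1-D NumPy array and every entry of the index list is a valid position in it. The sample values are made up for illustration, and `np.take` is shown only as an equivalent spelling of the same operation:

```python
import numpy as np

# Illustrative data (made up): thirty floats, 0.0, 0.5, 1.0, ...
dataarray = np.arange(30, dtype=float) * 0.5
subsetpointerlist = [0, 1, 2, 5, 8, 15, 25]

# Original approach: explicit Python loop over the index list.
loop_result = np.zeros(len(subsetpointerlist), dtype=float)
for index, pos in enumerate(subsetpointerlist):
    loop_result[index] = dataarray[pos]

# Integer-array ("fancy") indexing: one vectorized operation, no loop.
subsetarray = dataarray[subsetpointerlist]

# np.take gives the same result for a 1-D array.
taken = np.take(dataarray, subsetpointerlist)
```

If `dataarray` is already a float array, the result is float as well, so the `.astype(float)` step is only needed when the source array has a different dtype.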
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
josef.pktd@gmail.com wrote:
> looks to me like
>
>     subsetarray = dataarray[subsetpointerlist]
>
> or with type conversion
>
>     subsetarray = dataarray[subsetpointerlist].astype(float)
>
> Josef
Thanks. With a little bit of googling/RTFM I'm getting there. I think I overthought the mask idea, based on something else that Robert Kern helped me out with.
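For comparison, here is a rough sketch of how the mask idea relates to plain integer indexing. It assumes a 1-D `dataarray` and an index array whose entries are sorted and unique; the sample data and the name `mask` are made up for illustration:

```python
import numpy as np

# Illustrative data (made up).
dataarray = np.arange(30, dtype=float) * 0.5
subsetpointerarray = np.array([0, 1, 2, 5, 8, 15, 25])

# Build a boolean mask that is True at the wanted positions.
mask = np.zeros(dataarray.shape, dtype=bool)
mask[subsetpointerarray] = True

by_mask = dataarray[mask]                  # boolean indexing
by_index = dataarray[subsetpointerarray]   # integer indexing
```

A boolean mask always returns the selected elements in ascending position order, each at most once, so the two results agree only when the indices are sorted and unique. When a list of positions is already available, indexing with it directly skips the extra step of building the mask.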
Brennan Williams wrote:
> [...]
> How do I speed this up in numpy, i.e. by removing the for loop?
It's not as simple as

    subsetarray = dataarray[subsetpointerarray]

is it?
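A small check of that one-liner, as a sketch with made-up sample data: indexing with the list and with the equivalent integer array selects the same elements, and the result is a new array rather than a view of `dataarray`.

```python
import numpy as np

# Illustrative data (made up): the values equal their own positions.
dataarray = np.arange(30, dtype=float)
subsetpointerlist = [0, 1, 2, 5, 8, 15, 25]
subsetpointerarray = np.asarray(subsetpointerlist)

# A list of integers and an integer array index the same way.
from_list = dataarray[subsetpointerlist]
from_array = dataarray[subsetpointerarray]

# Integer-array indexing returns a copy, so editing the subset
# leaves dataarray untouched.
from_array[0] = -1.0
```

Because the subset is a copy, changes made to it have to be written back explicitly, for example with `dataarray[subsetpointerarray] = from_array`, if the original array should reflect them.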
participants (2)
- Brennan Williams
- josef.pktd@gmail.com