Re: [Numpy-discussion] Complex slicing and take
Hi,

thanks for the tips. Unfortunately this is not what I am after.

>> import numpy as num
>> startarray = random((1000,100))
>> take_sample = [1,2,5,6,1,2]
>> temp = num.take(startarray,take_sample,axis=1)
>
> Would it help to make temp a 1000x4 array instead of 1000x6? Could you do that by changing take_sample to [1,2,5,6] and multiplying columns 1 and 2 by a factor of 2? That would slow down the construction of temp but speed up the addition (and slicing?) in the loop below.

No, unfortunately it wouldn't help, because the second instance of "1,2" would have different shifts. So I cannot just count the number of occurrences of each line. From the initial 2D array, 1D lines could be extracted several times, each time with a different shift.

>> shift = [10,20,34,-10,22,-20]
>> result = num.zeros(900)  # shorter than initial because of the shift
>> for i in range(len(shift)) :
>>     result += temp[100+shift[i]:-100+shift[1]]
>
> This looks fast to me. The slicing doesn't make a copy nor does the addition. I've read that cython does fast indexing but I don't know if that applies to slicing as well. I assume that shift[1] is a typo and should be shift[i].

(Yes, of course the shift[1] should be shift[i].) Well, this may be fast, but not fast enough. And also, starting from my 2D startarray again, it looks odd that I cannot do something like:

    startarray = random((1000,100))
    take_sample = [1,2,5,6,1,2]
    shift = [10,20,34,-10,22,-20]
    result = num.sum(num.take(startarray,take_sample,axis=1)[100+shift:100-shift])

but of course this is nonsense because I cannot address the data this way (with "shift").

In fact I realise now that my question is simpler: how do I extract and sum 1D lines from a 2D array if I first want each line to be "shifted"? So starting again now, I want a quick way to write:

    startarray = random((1000,6))
    shift = [10,20,34,-10,22,-20]
    result = num.zeros(1000, dtype=float)
    for i in range(len(shift)) :
        result += startarray[100+shift[i]:900+shift[i]]

Can I write this directly with some numpy indexing, without the loop in Python?

Thanks for any tip.

Eric
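The quoted remark that slicing and in-place addition do not copy can be checked with a small sketch. The array sizes follow the thread; the seed, the variable names `window`, `taken` and `buffer_before`, and the use of `np.shares_memory` are assumptions made here for illustration, not part of the original messages.

```python
import numpy as np

rng = np.random.default_rng(0)            # arbitrary seed, illustration only
startarray = rng.random((1000, 100))

# Basic slicing returns a view onto the same memory, not a copy.
window = startarray[110:910, 1]
is_view = np.shares_memory(window, startarray)

# np.take with a list of indices allocates a new array.
taken = np.take(startarray, [1, 2, 5, 6, 1, 2], axis=1)
take_is_copy = not np.shares_memory(taken, startarray)

# In-place addition writes into the existing result buffer.
result = np.zeros(800)
buffer_before = result
result += window
same_buffer = result is buffer_before
```

So within the loop itself the per-iteration cost is the addition over each 800-element window; the only copy in this sketch is the one made by `np.take`.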
On Wed, Dec 30, 2009 at 12:08 AM, Eric Emsellem <eemselle@eso.org> wrote:
> [...]
Where's the bottleneck? There's the loop, constructing the indices (which could be done outside the loop), slicing, and adding. The location of the bottleneck probably depends on the relative sizes of the arrays. If the bottleneck is the loop, i.e. shift has a LOT of elements, then it might speed things up to break shift into chunks and use Python's multiprocessing module to solve this in parallel. Something like Cython would also speed up the loop.

I haven't tried running your code, but if anyone does, I think

    result += startarray[100+shift[i]:900+shift[i]]

should be

    result += startarray[100+shift[i]:900+shift[i], i]
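For anyone who wants to run the loop, here is a minimal sketch with the column index added as suggested above. Two things are assumptions rather than anything stated in the thread: the result is given length 800, because each window `100+s:900+s` spans 800 rows, and the data comes from `np.random.default_rng` with an arbitrary seed.

```python
import numpy as np

rng = np.random.default_rng(0)            # arbitrary seed, illustration only
startarray = rng.random((1000, 6))
shift = [10, 20, 34, -10, 22, -20]

# Each shifted window 100+s : 900+s covers 800 rows, so result has length 800.
result = np.zeros(800, dtype=float)
for i, s in enumerate(shift):
    # Column i is read through a window shifted by shift[i]; the slice is a view.
    result += startarray[100 + s:900 + s, i]
```

With only six shifts the Python loop overhead is small next to the six 800-element additions, which is why the location of the bottleneck depends on how many shifts there are relative to the window length.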
On Wed, Dec 30, 2009 at 12:19 PM, Keith Goodman <kwgoodman@gmail.com> wrote:
> [...]
Something like this? Just trying it out; I haven't really checked carefully whether it actually replicates your snippets. Constructing big intermediate arrays might not improve performance compared to a loop.

>>> np.arange(30).reshape(6,5)
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29]])

>>> np.arange(30).reshape(6,5)[np.array([[1,2,2,1]]).T,np.arange(0,3)+np.array([[0,1,2,1]]).T]
array([[ 5,  6,  7],
       [11, 12, 13],
       [12, 13, 14],
       [ 6,  7,  8]])

>>> np.arange(30).reshape(6,5)[np.array([[1,2,2,1]]).T,np.arange(0,3)+np.array([[0,1,2,1]]).T].sum(0)
array([34, 38, 42])

Josef
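The same broadcasting idea can be applied to the 1000x6 case from the question. This is only a sketch under assumptions: shifts run along the rows (as in the question, whereas the toy example above shifts along columns), the window is 800 rows long, and the names `rows`, `cols`, `vectorized` and `looped` are made up for illustration. The loop is kept only as a reference to compare against.

```python
import numpy as np

rng = np.random.default_rng(0)            # arbitrary seed, illustration only
startarray = rng.random((1000, 6))
shift = np.array([10, 20, 34, -10, 22, -20])

# Row indices: one column of 800 window positions per shift, shape (800, 6).
rows = np.arange(100, 900)[:, None] + shift[None, :]
# Column indices: column i pairs with shift[i]; shape (1, 6) broadcasts against rows.
cols = np.arange(shift.size)[None, :]

# Integer-array indexing gathers an (800, 6) copy; summing over axis 1
# adds the six shifted lines together.
vectorized = startarray[rows, cols].sum(axis=1)

# Reference result from the explicit Python loop.
looped = np.zeros(800)
for i, s in enumerate(shift):
    looped += startarray[100 + s:900 + s, i]
```

The indexed version builds an intermediate array as large as all the windows combined, so, as noted above, it is not guaranteed to beat the loop; timing both on realistic sizes is the only way to tell.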
participants (3)
- Eric Emsellem
- josef.pktd@gmail.com
- Keith Goodman