Hi,<br><br><div class="gmail_quote">On Mon, Jun 4, 2012 at 12:44 AM, srean <span dir="ltr"><<a href="mailto:srean.list@gmail.com" target="_blank">srean.list@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Hi Wolfgang,<br>

<br>

  I think you are looking for reduceat( ), in particular add.reduceat()<br></blockquote><div> </div><div>Indeed, the OP could use <font face="courier new, monospace">add.reduceat(...)</font>, for example:</div><p style="margin:0px">

<font face="courier new, monospace"># tst.py</font></p>

<p style="margin:0px"><font face="courier new, monospace">import numpy as np</font></p><p style="margin:0px"><font face="courier new, monospace"><br></font></p>

<p style="margin:0px"></p>

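<p style="margin:0px"><font face="courier new, monospace">def reduce(data, lengths):</font></p>

<p style="margin:0px"><font face="courier new, monospace">    # Interleave segment starts and ends: [s0, e0, s1, e1, ...]</font></p>

<p style="margin:0px"><font face="courier new, monospace">    ind, ends = np.r_[lengths, lengths], lengths.cumsum()</font></p>

<p style="margin:0px"><font face="courier new, monospace">    ind[::2], ind[1::2] = ends - lengths, ends</font></p>

<p style="margin:0px"><font face="courier new, monospace">    # Pad with 0 so the final end index is valid; keep only the start-to-end sums</font></p>

<p style="margin:0px"><font face="courier new, monospace">    return np.add.reduceat(np.r_[data, 0], ind)[::2]</font></p><p style="margin:0px"><font face="courier new, monospace"><br></font></p>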

<p style="margin:0px"></p>

<p style="margin:0px"><font face="courier new, monospace">def normalize(data, lengths):</font></p>

<p style="margin:0px"><font face="courier new, monospace">    return data / np.repeat(reduce(data, lengths), lengths)</font></p><p style="margin:0px"><font face="courier new, monospace"><br></font></p>

<p style="margin:0px"></p>

<p style="margin:0px"><font face="courier new, monospace">def gen(par):</font></p>

<p style="margin:0px"><font face="courier new, monospace">    # par is (low, high, size), passed straight to np.random.randint</font></p>

<p style="margin:0px"><font face="courier new, monospace">    lengths = np.random.randint(*par)</font></p>

<p style="margin:0px"><font face="courier new, monospace">    return np.random.randn(lengths.sum()), lengths</font></p><p style="margin:0px"><font face="courier new, monospace"><br></font></p>

<p style="margin:0px"></p>

<p style="margin:0px"><font face="courier new, monospace">if __name__ == '__main__':</font></p>

<p style="margin:0px"><font face="courier new, monospace">    data = np.array([1, 2, 1, 2, 3, 4, 1, 2, 3], dtype=float)</font></p>

<p style="margin:0px"><font face="courier new, monospace">    lengths = np.array([2, 4, 3])</font></p>

<p style="margin:0px"><font face="courier new, monospace">    print(reduce(data, lengths))</font></p>

<p style="margin:0px"><font face="courier new, monospace">    print(normalize(data, lengths).round(2))</font></p>

<p style="margin:0px"></p><div><font face="courier new, monospace"><br></font></div><div>Resulting:</div><div><div><font face="courier new, monospace">In []: %run tst</font></div><div><font face="courier new, monospace">[  3.  10.   6.]</font></div>

<div><font face="courier new, monospace">[ 0.33  0.67  0.1   0.2   0.3   0.4   0.17  0.33  0.5 ]</font></div></div><div><font face="courier new, monospace"><br></font></div><div>Fast enough:</div><div><div><font face="courier new, monospace">In []: data, lengths = gen([5, 15, 50000])</font></div>

<div><div><div><font face="courier new, monospace">In []: data.size</font></div><div><font face="courier new, monospace">Out[]: 476028</font></div></div></div><div><font face="courier new, monospace">In []: %timeit normalize(data, lengths)</font></div>

<div><font face="courier new, monospace">10 loops, best of 3: 29.4 ms per loop</font></div><div><br></div></div><div><br></div><div>My 2 cents,</div><div>-eat</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<span class="HOEnZb"><font color="#888888"><br>

-- srean<br>

</font></span><div class="im HOEnZb"><br>

On Thu, May 31, 2012 at 12:36 AM, Wolfgang Kerzendorf<br>

<<a href="mailto:wkerzendorf@gmail.com">wkerzendorf@gmail.com</a>> wrote:<br>

</div><div class="im HOEnZb">> Dear all,<br>

><br>

> I have an ndarray which consists of many arrays stacked behind each other (only conceptually, in truth it's a normal 1d float64 array).<br>

> I have a second array which tells me the start of the individual data sets in the 1d float64 array and another one which tells me the length.<br>

> Example:<br>

><br>

> data_array = (conceptually) [[1,2], [1,2,3,4], [1,2,3]] = in reality [1,2,1,2,3,4,1,2,3, dtype=float64]<br>

> start_pointer = [0, 2, 6]<br>

> length_data = [2, 4, 3]<br>

><br>

> I now want to normalize each of the individual data sets. I wrote a simple for loop over the start_pointer and length data that grabbed each data set, normalized it, and wrote it back to the big array. That's slow. Is there an elegant numpy way to do that? Do I have to go the Cython way?<br>


</div><div class="HOEnZb"><div class="h5">_______________________________________________<br>

NumPy-Discussion mailing list<br>

<a href="mailto:NumPy-Discussion@scipy.org">NumPy-Discussion@scipy.org</a><br>

<a href="http://mail.scipy.org/mailman/listinfo/numpy-discussion" target="_blank">http://mail.scipy.org/mailman/listinfo/numpy-discussion</a><br>

</div></div></blockquote></div><br>
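<div>For reference, here is a minimal sketch of a shorter variant that uses the <font face="courier new, monospace">start_pointer</font> array from the original question directly, instead of rebuilding the indices from the lengths as <font face="courier new, monospace">reduce()</font> above does. It assumes that every data set is non-empty and that the sets are stored back to back, in order. The names <font face="courier new, monospace">segment_sums</font> and <font face="courier new, monospace">normalize_segments</font> are made up for this illustration, and "normalize" is taken to mean dividing by the sum of each set, as in the script above.</div>

```python
import numpy as np


def segment_sums(data, starts):
    # With strictly increasing start offsets, reduceat sums
    # data[starts[i]:starts[i + 1]] and, for the last offset, data[starts[-1]:].
    # Assumption: no data set is empty (an empty set would repeat an offset).
    return np.add.reduceat(data, starts)


def normalize_segments(data, starts, lengths):
    # Divide every element by the sum of the data set it belongs to.
    return data / np.repeat(segment_sums(data, starts), lengths)


# Example arrays taken from the original question.
data = np.array([1, 2, 1, 2, 3, 4, 1, 2, 3], dtype=float)
start_pointer = np.array([0, 2, 6])
length_data = np.array([2, 4, 3])

sums = segment_sums(data, start_pointer)
normalized = normalize_segments(data, start_pointer, length_data)

# Cross-check against a plain loop over the data sets.
expected = data.copy()
for s, n in zip(start_pointer, length_data):
    expected[s:s + n] /= expected[s:s + n].sum()
assert np.allclose(normalized, expected)

print(sums)
print(normalized.round(2))
```

<div>One caveat: if a data set is empty, its start offset repeats, and <font face="courier new, monospace">reduceat</font> then returns the single element at that offset rather than 0, so empty sets would need separate handling.</div>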