I would like to find the sample points where the running sum of some vector exceeds some threshold -- at those points I want to collect all the data in the vector since the last time the criterion was met and compute some stats on it. For example, in python

    tot = 0.
    xs = []
    ys = []
    samples1 = []
    for thisx, thisy in zip(x, y):
        tot += thisx
        xs.append(thisx)
        ys.append(thisy)
        if tot >= threshold:
            samples1.append(func(xs, ys))
            # reset the running sum and start a new window
            tot = 0.
            xs = []
            ys = []

The following is close in numpy

    import numpy as np

    sx = np.cumsum(x)
    n = (sx/threshold).astype(int)
    # indices where the cumulative sum crosses a multiple of threshold
    ind = np.nonzero(np.diff(n) > 0)[0] + 1
    lasti = 0
    samples2 = []
    for i in ind:
        xs = x[lasti:i+1]
        ys = y[lasti:i+1]
        samples2.append(func(xs, ys))
        # start the next window after sample i so windows do not overlap
        lasti = i + 1

But the sample points in ind do not guarantee that at least threshold has accumulated between consecutive sample points, due to truncation error. What is a good numpy way to do this?

Thanks,
JDH
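One possible approach, offered as a sketch rather than a definitive answer: keep the cumulative sum, but instead of dividing by threshold, repeatedly use np.searchsorted to find the first index where the sum since the last reset reaches threshold. The loop then runs once per window rather than once per sample. The function name threshold_segments is hypothetical, and the sketch assumes x is non-negative (so the cumulative sum is sorted) and threshold is positive.

```python
import numpy as np

def threshold_segments(x, threshold):
    """Return (start, stop) pairs such that x[start:stop] sums to >= threshold.

    Hypothetical helper. Assumes x >= 0 elementwise, so that the
    cumulative sum is non-decreasing, and threshold > 0.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    sx = np.cumsum(x)
    segments = []
    start = 0      # first index of the current window
    offset = 0     # cumulative sum at the end of the previous window
    while start < len(sx):
        # first index i with sx[i] >= offset + threshold
        i = int(np.searchsorted(sx, offset + threshold, side="left"))
        if i >= len(sx):
            break  # the trailing partial window never reaches threshold
        segments.append((start, i + 1))
        offset = sx[i]
        start = i + 1
    return segments
```

With this, the stats could be collected as samples = [func(x[a:b], y[a:b]) for a, b in threshold_segments(x, threshold)], where func is the function from the question. As in the pure python loop, a trailing window that never reaches threshold is dropped. For floating-point data, subtracting an offset from the cumulative sum can differ from a freshly reset running total by rounding, so windows that land almost exactly on the threshold may not match the loop.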
participants (2)
- John Hunter
- Robert Kern