Hi,

I'm looking for efficient ways to subtotal a 1-D array onto a 2-D grid. This is more easily explained in code than in words, thus:

    for n in range(len(data)):
        totals[i[n], j[n]] += data[n]

data comes from a series of PyTables files with ~200 million rows. Each row has ~20 columns. I use the first three columns (which are 1-3 character strings) to form the indexing functions i[] and j[], and then want to calculate averages of the remaining 17 numerical columns.

I have tried various indirect ways of doing this with searchsorted and bincount, but intuitively they feel like overly complex solutions to what is essentially a very simple problem.

My work involves comparing the subtotals for various segmentation strategies (the i[] and j[] indexing functions). Efficient solutions are important because I need to make many passes through the 200 million rows of data. Memory usage is the easiest thing for me to adjust, by changing how many rows of data to read in for each pass and then reusing the same array data buffers.

Thanks in advance for any suggestions!

Stephen
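
For reference, the loop in the question is a scatter-add, and one possible vectorized form of it is sketched below. This is only a sketch under stated assumptions: the string columns are assumed to have already been mapped to integer codes i and j, the grid shape is assumed to be known up front, and the function name subtotal_grid and the sample values are hypothetical. It uses np.bincount on flattened indices, with np.add.at shown as a direct counterpart to the loop.

```python
import numpy as np

def subtotal_grid(i, j, data, shape):
    """Sum data[n] into totals[i[n], j[n]] without a Python-level loop.

    Hypothetical helper: assumes i and j are integer arrays that are
    already within the bounds given by shape.
    """
    # Collapse each (i, j) pair into a single flat bin number.
    flat = np.ravel_multi_index((i, j), shape)
    size = shape[0] * shape[1]
    # Weighted bincount sums the data values that land in each bin.
    totals = np.bincount(flat, weights=data, minlength=size)
    # Unweighted bincount gives the number of rows per bin.
    counts = np.bincount(flat, minlength=size)
    return totals.reshape(shape), counts.reshape(shape)

# Small made-up example (not data from the question).
i = np.array([0, 1, 0, 2, 1])
j = np.array([1, 0, 1, 2, 0])
data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

totals, counts = subtotal_grid(i, j, data, (3, 3))

# Averages per cell; cells with no rows are left at 0.
means = np.divide(totals, counts, out=np.zeros_like(totals), where=counts > 0)

# np.add.at is the unbuffered equivalent of the original loop and
# handles repeated (i, j) pairs correctly.
check = np.zeros((3, 3))
np.add.at(check, (i, j), data)
```

When the data is read in chunks, the totals and counts returned for each chunk can be added into running arrays, with the division for the averages done once at the end.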