[Numpy-discussion] Need a good idea: calculate the mean of many vectors

Pauli Virtanen pav at iki.fi
Tue Feb 8 11:33:42 EST 2011


Tue, 08 Feb 2011 15:24:10 +0000, EMMEL Thomas wrote:
[clip]
> n = 100 # for test otherwise ~300000
> a1 = reshape(zeros(3*n).astype(float), (n,3))
> 
> a2 = zeros(n).astype(int)
> 
> for indices, data in [...]:
>     #data = array((1.,2.,3.))
>     #indices = (1,5,60)
>     for index in indices:
>         a1[index] += data
>         a2[index] += 1

You can (mis-)use `bincount` to vectorize summations.
Bincount does not support broadcasting and takes only 1-D
inputs, so some manual shape manipulation is necessary.
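
To make the trick concrete, here is a minimal 1-D sketch of how `bincount`
sums per index when given weights. The arrays `idx` and `w` are made-up
illustration data, and `minlength` assumes NumPy 1.6 or later.

```python
import numpy as np

# Made-up illustration data: idx[k] is the bin that sample k belongs to,
# w[k] is the value that sample contributes to that bin.
idx = np.array([0, 2, 2, 1, 0, 2])
w = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Without weights, bincount counts how often each index occurs.
counts = np.bincount(idx, minlength=4)

# With weights, it sums the weights that fall into each index instead.
sums = np.bincount(idx, weights=w, minlength=4)

print(counts)  # occurrences per bin; bin 3 never occurs
print(sums)    # per-bin sums of w
```

Both inputs have to be 1-D and of equal length, which is why a (m, 3) index
array has to be flattened and the data repeated to match.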

Something probably could/should be done to optimize NumPy's
performance for small arrays.

----

import numpy as np

n = 100  # 100 nodes
m = 1000 # 1000 triangles

# synthetic data

tri_nodes = np.random.randint(0, n, size=(m, 3))
tri_data = np.random.rand(m, 3)

# vectorize sums via bincount

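a1 = np.zeros((n,3), float)
# minlength=n (NumPy >= 1.6) keeps the result at length n even if the
# highest-numbered nodes never appear in any triangle
a2 = np.bincount(tri_nodes.ravel(), minlength=n)

for j in range(3):
    # repeat(..., 3) -> the same (3,) data vector is added to each
    # node in the triangle
    a1[:,j] = np.bincount(tri_nodes.ravel(), np.repeat(tri_data[:,j], 3),
                          minlength=n)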

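The snippet above stops at the sums and counts. A possible way to finish the
job is sketched below: it divides the sums by the counts to get the per-node
mean and checks the result against the explicit double loop from the original
question. The names `sums`, `counts`, `means` and `ref` are chosen here for
illustration, the fixed seed is only for reproducibility, and leaving unused
nodes at zero is an assumption about what is wanted.

```python
import numpy as np

n, m = 100, 1000                      # nodes, triangles
rng = np.random.RandomState(0)        # fixed seed, illustration only
tri_nodes = rng.randint(0, n, size=(m, 3))
tri_data = rng.rand(m, 3)

flat = tri_nodes.ravel()              # node index for every triangle corner
counts = np.bincount(flat, minlength=n)

sums = np.empty((n, 3))
for j in range(3):
    # each triangle's j-th component is repeated once per corner
    sums[:, j] = np.bincount(flat, weights=np.repeat(tri_data[:, j], 3),
                             minlength=n)

# Per-node mean; nodes that belong to no triangle keep a mean of 0
# (assumption) because their sum is 0 and the divisor is clamped to 1.
means = sums / np.maximum(counts, 1)[:, None]

# Reference: the explicit loop from the original question.
ref = np.zeros((n, 3))
ref_counts = np.zeros(n, dtype=int)
for nodes, data in zip(tri_nodes, tri_data):
    for index in nodes:
        ref[index] += data
        ref_counts[index] += 1

print(np.allclose(sums, ref), np.array_equal(counts, ref_counts))
```

On newer NumPy versions, `np.add.at(a1, flat, np.repeat(tri_data, 3, axis=0))`
is another way to accumulate with repeated indices, though it is not
necessarily faster than the `bincount` approach.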


