Mailman 3 nonuniform scatter operations - NumPy-Discussion - python.org

nonuniform scatter operations

Geoffrey Irving

28 Sep 2008

4:34 a.m.

Hello,

Is there an efficient way to implement a nonuniform gather operation in numpy? Specifically, I want to do something like

    n,m = 100,1000
    X = random.uniform(size=n)
    K = random.randint(n, size=m)
    Y = random.uniform(size=m)

    for k,y in zip(K,Y):
        X[k] += y

but I want it to be fast. The naive attempt "X[K] += Y" does not work, since the slice assumes the indices don't repeat.

Thanks,
Geoffrey
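To make the failure mode concrete, here is a minimal sketch comparing the explicit loop with the fancy-indexed `X[K] += Y`. It also shows `np.add.at`, an unbuffered ufunc method from later NumPy releases that did not exist when this thread was written. The seed and the names `X_loop`, `X_fancy`, and `X_at` are illustrative assumptions, not part of the original post.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen arbitrarily for reproducibility
n, m = 100, 1000
X = rng.uniform(size=n)
K = rng.integers(n, size=m)      # indices into X, with many repeats
Y = rng.uniform(size=m)

# Reference result: the explicit Python loop from the question.
X_loop = X.copy()
for k, y in zip(K, Y):
    X_loop[k] += y

# Fancy-indexed in-place add: each repeated index is written once,
# so only one contribution per distinct index survives.
X_fancy = X.copy()
X_fancy[K] += Y

# Unbuffered alternative: np.add.at accumulates every occurrence of an index.
X_at = X.copy()
np.add.at(X_at, K, Y)
```

With 1000 indices drawn from 100 slots, repeats are unavoidable, so `X_fancy` differs from `X_loop` while `X_at` matches it up to floating-point rounding.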

Nathan Bell

28 Sep

5:01 a.m.

On Sun, Sep 28, 2008 at 12:34 AM, Geoffrey Irving <irving@naml.us> wrote:

> Is there an efficient way to implement a nonuniform gather operation in numpy? Specifically, I want to do something like
>
>     n,m = 100,1000
>     X = random.uniform(size=n)
>     K = random.randint(n, size=m)
>     Y = random.uniform(size=m)
>
>     for k,y in zip(K,Y):
>         X[k] += y
>
> but I want it to be fast. The naive attempt "X[K] += Y" does not work, since the slice assumes the indices don't repeat.

I don't know of a numpy solution, but in scipy you could use a sparse matrix to perform the operation. I think the following does what you want.

    from scipy.sparse import coo_matrix
    X += coo_matrix((Y, (K, zeros(m, dtype=int))), shape=(n, 1)).sum(axis=1)

This reduces to a simple C++ loop, so speed should be good: http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/sparsetools...

--
Nathan Bell wnbell@gmail.com
http://graphics.cs.uiuc.edu/~wnbell/
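A self-contained sketch of the sparse-matrix idea, checked against the explicit loop. One assumption beyond the post: in current SciPy, `.sum(axis=1)` on a `coo_matrix` returns a 2-D `(n, 1)` result, so this sketch flattens it to shape `(n,)` before adding it to the 1-D `X`. The names `S`, `row_sums`, `X_loop`, and `X_sparse` are illustrative.

```python
import numpy as np
from scipy.sparse import coo_matrix

rng = np.random.default_rng(0)   # arbitrary seed
n, m = 100, 1000
X = rng.uniform(size=n)
K = rng.integers(n, size=m)
Y = rng.uniform(size=m)

# Reference result from the explicit loop.
X_loop = X.copy()
for k, y in zip(K, Y):
    X_loop[k] += y

# Place each Y[i] at (row K[i], column 0) of an n-by-1 sparse matrix.
# Summing along axis 1 adds up all entries that share a row.
S = coo_matrix((Y, (K, np.zeros(m, dtype=int))), shape=(n, 1))
row_sums = np.asarray(S.sum(axis=1)).ravel()   # (n, 1) result flattened to (n,)

X_sparse = X + row_sums
```

COO format permits duplicate (row, column) entries, which is what lets repeated indices in `K` accumulate rather than overwrite one another.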

Geoffrey Irving

8:15 p.m.

On Sat, Sep 27, 2008 at 10:01 PM, Nathan Bell <wnbell@gmail.com> wrote:

> On Sun, Sep 28, 2008 at 12:34 AM, Geoffrey Irving <irving@naml.us> wrote:
>
> > Is there an efficient way to implement a nonuniform gather operation in numpy? Specifically, I want to do something like
> >
> >     n,m = 100,1000
> >     X = random.uniform(size=n)
> >     K = random.randint(n, size=m)
> >     Y = random.uniform(size=m)
> >
> >     for k,y in zip(K,Y):
> >         X[k] += y
> >
> > but I want it to be fast. The naive attempt "X[K] += Y" does not work, since the slice assumes the indices don't repeat.
>
> I don't know of a numpy solution, but in scipy you could use a sparse matrix to perform the operation. I think the following does what you want.
>
>     from scipy.sparse import coo_matrix
>     X += coo_matrix((Y, (K, zeros(m, dtype=int))), shape=(n, 1)).sum(axis=1)
>
> This reduces to a simple C++ loop, so speed should be good: http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/sparsetools...

Thanks. That works great. A slightly cleaner version is

    X += coo_matrix((Y, (K, zeros_like(K)))).sum(axis=1)

The next question is: is there a similar way that generalizes to the case where X is n by 3 and Y is m by 3 (besides the obvious loop over range(3), that is)?

Geoffrey
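One caveat about dropping the `shape` argument, sketched with a small hypothetical example: when no shape is given, `coo_matrix` infers it from the largest row and column index present. If the last slot `n - 1` never appears in `K`, the result has fewer than `n` rows. The arrays below are made up for illustration.

```python
import numpy as np
from scipy.sparse import coo_matrix

n = 5
K = np.array([0, 2, 2, 3])           # hypothetical indices; n - 1 = 4 never appears
Y = np.array([1.0, 2.0, 3.0, 4.0])

# Shape inferred from the data: rows = max(K) + 1, columns = 1.
inferred = coo_matrix((Y, (K, np.zeros_like(K))))

# Shape stated explicitly, so unused trailing slots are kept as zeros.
explicit = coo_matrix((Y, (K, np.zeros_like(K))), shape=(n, 1))

sums = np.asarray(explicit.sum(axis=1)).ravel()
```

With m much larger than n, as in the original example, every slot is very likely to be hit, so the shorter form usually gives the right size; passing `shape=(n, 1)` removes the dependence on the data.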

Nathan Bell

30 Sep

5:38 a.m.

On Sun, Sep 28, 2008 at 4:15 PM, Geoffrey Irving <irving@naml.us> wrote:

> Thanks. That works great. A slightly cleaner version is
>
>     X += coo_matrix((Y, (K, zeros_like(K)))).sum(axis=1)
>
> The next question is: is there a similar way that generalizes to the case where X is n by 3 and Y is m by 3 (besides the obvious loop over range(3), that is)?

You could flatten the arrays and make a single matrix that implements the operation. I'd stick with the loop over range(3), though: it's more readable and likely to be as fast as, or faster than, flattening the arrays yourself.

--
Nathan Bell wnbell@gmail.com
http://graphics.cs.uiuc.edu/~wnbell/
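Both options can be sketched side by side for the n-by-3 case. This is one possible reading of "flatten the arrays", assuming C-ordered (row-major) storage, so that element (k, j) of an n-by-d array sits at flat index k*d + j. The helper `scatter_add_1d` and the other names are hypothetical, introduced only for this illustration.

```python
import numpy as np
from scipy.sparse import coo_matrix

rng = np.random.default_rng(0)   # arbitrary seed
n, m, d = 100, 1000, 3
X = rng.uniform(size=(n, d))
K = rng.integers(n, size=m)
Y = rng.uniform(size=(m, d))

# Reference: accumulate row by row in pure Python.
X_ref = X.copy()
for k, y in zip(K, Y):
    X_ref[k] += y

def scatter_add_1d(idx, vals, size):
    """Sum vals[i] into slot idx[i] of a length-`size` vector via a sparse column."""
    S = coo_matrix((vals, (idx, np.zeros_like(idx))), shape=(size, 1))
    return np.asarray(S.sum(axis=1)).ravel()

# Option 1: one sparse reduction per column (the loop over range(3)).
X_cols = X.copy()
for j in range(d):
    X_cols[:, j] += scatter_add_1d(K, Y[:, j], n)

# Option 2: flatten. Row K[i], column j maps to flat index K[i]*d + j,
# in the same order that Y.ravel() lists the values.
flat_idx = (K[:, None] * d + np.arange(d)).ravel()
X_flat = X + scatter_add_1d(flat_idx, Y.ravel(), n * d).reshape(n, d)
```

The flattened version needs extra index arithmetic and a reshape, which illustrates the readability cost mentioned above; relative speed would have to be measured for a given problem size.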

Anne Archibald

28 Sep

6:50 a.m.

2008/9/28 Geoffrey Irving <irving@naml.us>:

> Is there an efficient way to implement a nonuniform gather operation in numpy? Specifically, I want to do something like
>
>     n,m = 100,1000
>     X = random.uniform(size=n)
>     K = random.randint(n, size=m)
>     Y = random.uniform(size=m)
>
>     for k,y in zip(K,Y):
>         X[k] += y
>
> but I want it to be fast. The naive attempt "X[K] += Y" does not work, since the slice assumes the indices don't repeat.

I believe histogram can be persuaded to do this.

Anne
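A sketch of how a weighted histogram can perform the accumulation. The post does not say which function or arguments were meant, so this is one possible interpretation: `np.histogram` with one unit-width bin per index and `weights=Y`, plus the closely related `np.bincount` with `weights`. Variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed
n, m = 100, 1000
X = rng.uniform(size=n)
K = rng.integers(n, size=m)
Y = rng.uniform(size=m)

# Reference result from the explicit loop.
X_loop = X.copy()
for k, y in zip(K, Y):
    X_loop[k] += y

# Weighted histogram: bin edges 0, 1, ..., n give one bin per index,
# and each bin collects the sum of the weights that fall into it.
hist_sums, _ = np.histogram(K, bins=np.arange(n + 1), weights=Y)
X_hist = X + hist_sums

# bincount does the same for non-negative integer indices;
# minlength=n keeps trailing slots that no index hits.
X_binc = X + np.bincount(K, weights=Y, minlength=n)
```

Both variants stay within NumPy and avoid building a sparse matrix.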

participants (3)

Anne Archibald
Geoffrey Irving
Nathan Bell