lets say i have arrays:

a = array((1,2,3,4,5))
indices = array((1,1,1,1))

and i perform operation:

a[indices] += 1

the result is

array([1, 3, 3, 4, 5])

in other words, the duplicates in indices are ignored

if I wanted the duplicates not to be ignored, resulting in:

array([1, 6, 3, 4, 5])

how would I go about this?

the example above is somewhat trivial, what follows is exactly what I am trying to do:

def inflate(self, pressure):
    faceforces = pressure * cross(self.verts[self.faces[:,1]] - self.verts[self.faces[:,0]],
                                  self.verts[self.faces[:,2]] - self.verts[self.faces[:,0]])
    self.verts[self.faces[:,0]] += faceforces
    self.verts[self.faces[:,1]] += faceforces
    self.verts[self.faces[:,2]] += faceforces

def constrain_lengths(self):
    vectors = self.verts[self.constraints[:,1]] - self.verts[self.constraints[:,0]]
    lengths = sqrt(sum(square(vectors), axis=1))
    correction = 0.5 * (vectors.T * (1 - (self.restlengths / lengths))).T
    self.verts[self.constraints[:,0]] += correction
    self.verts[self.constraints[:,1]] -= correction

def compute_normals(self):
    facenormals = cross(self.verts[self.faces[:,1]] - self.verts[self.faces[:,0]],
                        self.verts[self.faces[:,2]] - self.verts[self.faces[:,0]])
    self.normals.fill(0)
    self.normals[self.faces[:,0]] += facenormals
    self.normals[self.faces[:,1]] += facenormals
    self.normals[self.faces[:,2]] += facenormals
    lengths = sqrt(sum(square(self.normals), axis=1))
    self.normals = (self.normals.T / lengths).T

I've been getting some very buggy results as a result of duplicates being ignored in my indexed in-place add/sub operations.
On Wed, Sep 29, 2010 at 01:01, Damien Morton <dmorton@bitfurnace.com> wrote:
lets say i have arrays:
a = array((1,2,3,4,5)) indices = array((1,1,1,1))
and i perform operation:
a[indices] += 1
the result is
array([1, 3, 3, 4, 5])
in other words, the duplicates in indices are ignored
if I wanted the duplicates not to be ignored, resulting in:
array([1, 6, 3, 4, 5])
how would I go about this?
Use numpy.bincount() instead.

The reason for the current behavior is that Python compiles the "x[i] += y" construct into three separate orthogonal operations:

tmp = x.__getitem__(i)
val = tmp.__iadd__(y)
x.__setitem__(i, val)

Each of these operations has well-defined semantics in numpy arrays, primarily designed for other use cases. There is no way for each of them to know that they are in the "x[i] += y" idiom in order to do something different to achieve the semantics you want.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco
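[For readers finding this in the archive: a minimal sketch of the bincount() approach applied to the trivial example above. Padding the counts into the leading slice of `a` is just one way to line the shapes up.]

```python
import numpy as np

a = np.array((1, 2, 3, 4, 5))
indices = np.array((1, 1, 1, 1))

# bincount counts every occurrence of each index, so duplicates
# are accumulated rather than collapsed.
counts = np.bincount(indices)   # -> array([0, 4])

# Add the counts into the matching leading slice of `a`.
a[:len(counts)] += counts
# a is now array([1, 6, 3, 4, 5])
```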
Wed, 29 Sep 2010 11:15:08 -0500, Robert Kern wrote: [clip: inplace addition with duplicates]
Use numpy.bincount() instead.
It might be worthwhile to add a separate helper function for this purpose. Bincount makes a copy that could be avoided, and it is difficult to find if you don't know about this trick.

-- 
Pauli Virtanen
On Wed, Sep 29, 2010 at 12:00, Pauli Virtanen <pav@iki.fi> wrote:
Wed, 29 Sep 2010 11:15:08 -0500, Robert Kern wrote: [clip: inplace addition with duplicates]
Use numpy.bincount() instead.
It might be worthwhile to add a separate helper function for this purpose. Bincount makes a copy that could be avoided, and it is difficult to find if you don't know about this trick.
I'm fairly certain that most of the arrays used are fairly small, as such things are reckoned. I'm not sure that in-place modification would win us much. And I'm not sure what other name for the function would make it easier to find. AFAICT, using bincount() this way is not really a "trick"; it's just the right way to do exactly this job. If anything, "x.fill(0); x[i] += 1" is the "trick".

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco
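[To make the contrast concrete, a small sketch of both forms; nothing here beyond plain numpy.]

```python
import numpy as np

i = np.array([1, 1, 1, 1])

# The fancy-indexing "trick": duplicate indices collapse, so bin 1
# is incremented only once.
x = np.zeros(5, dtype=int)
x[i] += 1
# x -> array([0, 1, 0, 0, 0])

# bincount counts every occurrence of each index.
counts = np.bincount(i)
# counts -> array([0, 4])
```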
On Thu, Sep 30, 2010 at 3:28 AM, Robert Kern <robert.kern@gmail.com> wrote:
On Wed, Sep 29, 2010 at 12:00, Pauli Virtanen <pav@iki.fi> wrote:
Wed, 29 Sep 2010 11:15:08 -0500, Robert Kern wrote: [clip: inplace addition with duplicates]
Use numpy.bincount() instead.
It might be worthwhile to add a separate helper function for this purpose. Bincount makes a copy that could be avoided, and it is difficult to find if you don't know about this trick.
I'm fairly certain that most of the arrays used are fairly small, as such things are reckoned. I'm not sure that in-place modification would win us much. And I'm not sure what other name for the function would make it easier to find. AFAICT, using bincount() this way is not really a "trick"; it's just the right way to do exactly this job. If anything, "x.fill(0);x[i] += 1" is the "trick".
bincount only works for gathering/accumulating scalars. Even the 'weights' parameter is limited to scalars. I propose the name 'gather()' for the helper function that does this.
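[As a stopgap until such a helper exists, the bincount() workaround can be applied one column at a time and wrapped up; `accumulate_rows` below is a hypothetical name for illustration, not an existing numpy function.]

```python
import numpy as np

def accumulate_rows(out, indices, values):
    """Add each row of `values` into out[indices[k]], honoring duplicates.

    bincount()'s weights argument only accepts 1-d arrays, so it is
    applied column by column.
    """
    for col in range(values.shape[1]):
        sums = np.bincount(indices, weights=values[:, col])
        out[:len(sums), col] += sums
    return out

# e.g. accumulating per-face forces onto shared vertices:
verts = np.zeros((4, 2))
face_ids = np.array([0, 1, 1, 2])   # vertex 1 appears twice
forces = np.array([[1., 1.], [2., 2.], [3., 3.], [4., 4.]])
accumulate_rows(verts, face_ids, forces)
# verts[1] receives both the 2.0 and 3.0 rows: [5., 5.]
```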
On Wed, Sep 29, 2010 at 9:03 PM, Damien Morton <dmorton@bitfurnace.com> wrote:
On Thu, Sep 30, 2010 at 3:28 AM, Robert Kern <robert.kern@gmail.com> wrote:
On Wed, Sep 29, 2010 at 12:00, Pauli Virtanen <pav@iki.fi> wrote:
Wed, 29 Sep 2010 11:15:08 -0500, Robert Kern wrote: [clip: inplace addition with duplicates]
Use numpy.bincount() instead.
It might be worthwhile to add a separate helper function for this purpose. Bincount makes a copy that could be avoided, and it is difficult to find if you don't know about this trick.
I'm fairly certain that most of the arrays used are fairly small, as such things are reckoned. I'm not sure that in-place modification would win us much. And I'm not sure what other name for the function would make it easier to find. AFAICT, using bincount() this way is not really a "trick"; it's just the right way to do exactly this job. If anything, "x.fill(0);x[i] += 1" is the "trick".
bincount only works for gathering/accumulating scalars. Even the 'weights' parameter is limited to scalars.
Do you mean that bincount only works with 1d arrays? I also think that this is a major limitation of it.
I propose the name 'gather()' for the helper function that does this.
I don't think "gather" is an obvious name to search for.

Josef
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
On Wed, Sep 29, 2010 at 20:11, <josef.pktd@gmail.com> wrote:
On Wed, Sep 29, 2010 at 9:03 PM, Damien Morton <dmorton@bitfurnace.com> wrote:
On Thu, Sep 30, 2010 at 3:28 AM, Robert Kern <robert.kern@gmail.com> wrote:
On Wed, Sep 29, 2010 at 12:00, Pauli Virtanen <pav@iki.fi> wrote:
Wed, 29 Sep 2010 11:15:08 -0500, Robert Kern wrote: [clip: inplace addition with duplicates]
Use numpy.bincount() instead.
It might be worthwhile to add a separate helper function for this purpose. Bincount makes a copy that could be avoided, and it is difficult to find if you don't know about this trick.
I'm fairly certain that most of the arrays used are fairly small, as such things are reckoned. I'm not sure that in-place modification would win us much. And I'm not sure what other name for the function would make it easier to find. AFAICT, using bincount() this way is not really a "trick"; it's just the right way to do exactly this job. If anything, "x.fill(0);x[i] += 1" is the "trick".
bincount only works for gathering/accumulating scalars. Even the 'weights' parameter is limited to scalars.
Do you mean that bincount only works with 1d arrays? I also think that this is a major limitation of it.
Feel free to change it. I think that extending the weights array to allow greater dimensions is an obvious extension of the current semantics.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco
On Thu, Sep 30, 2010 at 11:11 AM, <josef.pktd@gmail.com> wrote:
bincount only works for gathering/accumulating scalars. Even the 'weights' parameter is limited to scalars.
Do you mean that bincount only works with 1d arrays? I also think that this is a major limitation of it.
>>> from numpy import *
>>> a = array((1,2,2,3,3))
>>> w = array(((1,2),(3,4),(5,6),(7,8),(9,10)))
>>> bincount(a, weights=w)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: object too deep for desired array
>>> w0 = array((1,2,3,4,5))
>>> bincount(a, weights=w0)
array([ 0.,  1.,  5.,  9.])
I propose the name 'gather()' for the helper function that does this.
I don't think "gather" is an obvious name to search for.
"gather" is the name that the GPGPU community uses to describe this kind of operation. Not just for summation but for any kind of indexed reducing operation.
gather(a,to_indices, b, from_indices, reduce_op)
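[A plain-Python sketch of what that proposed signature might mean. It is slow, but it pins down the semantics; none of these names exist in numpy.]

```python
import numpy as np

def gather(a, to_indices, b, from_indices, reduce_op):
    """For each k, fold b[from_indices[k]] into a[to_indices[k]] with
    reduce_op, visiting duplicate targets one at a time instead of
    collapsing them."""
    for t, f in zip(to_indices, from_indices):
        a[t] = reduce_op(a[t], b[f])
    return a

a = np.zeros(5)
b = np.array([1., 2., 3., 4.])
gather(a, [1, 1, 1, 1], b, [0, 1, 2, 3], np.add)
# a[1] accumulates 1 + 2 + 3 + 4 = 10
```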
On Wed, Sep 29, 2010 at 11:24 PM, Damien Morton <dmorton@bitfurnace.com> wrote:
On Thu, Sep 30, 2010 at 11:11 AM, <josef.pktd@gmail.com> wrote:
bincount only works for gathering/accumulating scalars. Even the 'weights' parameter is limited to scalars.
Do you mean that bincount only works with 1d arrays? I also think that this is a major limitation of it.
>>> from numpy import *
>>> a = array((1,2,2,3,3))
>>> w = array(((1,2),(3,4),(5,6),(7,8),(9,10)))
>>> bincount(a, weights=w)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: object too deep for desired array
>>> w0 = array((1,2,3,4,5))
>>> bincount(a, weights=w0)
array([ 0.,  1.,  5.,  9.])
Since I'm not a C person to change bincount, how about
>>> a = np.array((1,2,2,3,3))
>>> w = np.array(((1,2),(3,4),(5,6),(7,8),(9,10)))
>>> a2 = np.array((1,2,2,3,3))[:,None] - 1 + np.array([0, a.max()])
>>> a
array([1, 2, 2, 3, 3])
>>> w
array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10]])
>>> np.bincount(a2.ravel(), weights=w.ravel()).reshape(2,-1).T
array([[  1.,   2.],
       [  8.,  10.],
       [ 16.,  18.]])
I never thought of doing this before and I have been using bincount for some time.
I propose the name 'gather()' for the helper function that does this.
I don't think "gather" is an obvious name to search for.
"gather" is the name that the GPGPU community uses to describe this kind of operation. Not just for summation but for any kind of indexed reducing operation.
Some group functions that Travis is planning might go in this direction.

Josef
On Thu, Sep 30, 2010 at 12:07 AM, <josef.pktd@gmail.com> wrote:
On Wed, Sep 29, 2010 at 11:24 PM, Damien Morton <dmorton@bitfurnace.com> wrote:
On Thu, Sep 30, 2010 at 11:11 AM, <josef.pktd@gmail.com> wrote:
bincount only works for gathering/accumulating scalars. Even the 'weights' parameter is limited to scalars.
Do you mean that bincount only works with 1d arrays? I also think that this is a major limitation of it.
>>> from numpy import *
>>> a = array((1,2,2,3,3))
>>> w = array(((1,2),(3,4),(5,6),(7,8),(9,10)))
>>> bincount(a, weights=w)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: object too deep for desired array
>>> w0 = array((1,2,3,4,5))
>>> bincount(a, weights=w0)
array([ 0.,  1.,  5.,  9.])
Since I'm not a C person to change bincount, how about
>>> a = np.array((1,2,2,3,3))
>>> w = np.array(((1,2),(3,4),(5,6),(7,8),(9,10)))
>>> a2 = np.array((1,2,2,3,3))[:,None] - 1 + np.array([0, a.max()])
>>> a
array([1, 2, 2, 3, 3])
>>> w
array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10]])
>>> np.bincount(a2.ravel(), weights=w.ravel()).reshape(2,-1).T
array([[  1.,   2.],
       [  8.,  10.],
       [ 16.,  18.]])
I never thought of doing this before and I have been using bincount for some time.
for future search, this seems to work
>>> w = np.arange(5*4).reshape(5,4)
>>> a = np.random.randint(5,8, size=5)
>>> a
array([6, 5, 6, 5, 7])
>>> w
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])
>>> a2 = a[:,None] - a.min() + (a.ptp()+1) * np.arange(w.shape[1])
>>> np.bincount(a2.ravel(), weights=w.ravel()).reshape(w.shape[1],-1).T
array([[ 16.,  18.,  20.,  22.],
       [  8.,  10.,  12.,  14.],
       [ 16.,  17.,  18.,  19.]])
This will include a row of zeros for any index between a.min() and a.max() that has zero count.

Josef
I propose the name 'gather()' for the helper function that does this.
I don't think "gather" is an obvious name to search for.
"gather" is the name that the GPGPU community uses to describe this kind of operation. Not just for summation but for any kind of indexed reducing operation.
Some group functions that Travis is planning, might go in this direction.
Josef
participants (4)

- Damien Morton
- josef.pktd@gmail.com
- Pauli Virtanen
- Robert Kern