[Numpy-discussion] indexed arrays ignoring duplicates

Robert Kern robert.kern at gmail.com
Wed Sep 29 12:15:08 EDT 2010


On Wed, Sep 29, 2010 at 01:01, Damien Morton <dmorton at bitfurnace.com> wrote:
> Let's say I have arrays:
>
> a = array((1,2,3,4,5))
> indices = array((1,1,1,1))
>
> and I perform the operation:
>
> a[indices] += 1
>
> the result is
>
> array([1, 3, 3, 4, 5])
>
> in other words, the duplicates in indices are ignored
>
> if I wanted the duplicates not to be ignored, resulting in:
>
> array([1, 6, 3, 4, 5])
>
> how would I go about this?

Use numpy.bincount() instead: it counts how many times each index
occurs, and you can add those counts to the array. The reason for the
current behavior is that Python compiles the "x[i] += y" construct into
three separate orthogonal operations:

  tmp = x.__getitem__(i)
  val = tmp.__iadd__(y)
  x.__setitem__(i, val)

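Each of these operations has well-defined semantics in numpy arrays,
designed primarily for other use cases. With your index array,
__getitem__ returns a new array holding four copies of a[1], __iadd__
adds 1 to each copy, and __setitem__ writes the same value, 3, to a[1]
four times. There is no way for any one of them to know that it is part
of the "x[i] += y" idiom in order to do something different to achieve
the semantics you want.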
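
To make that concrete, here is a minimal sketch, not a definitive
recipe. It replays the three steps by hand on the arrays from the
question and then accumulates the duplicates with numpy.bincount(). The
names buffered, counts, accumulated and unbuffered are only
illustrative. It also assumes a NumPy release in which bincount()
accepts a minlength argument and in which the ufunc method np.add.at()
exists; both may be newer than the version discussed in this thread.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
indices = np.array([1, 1, 1, 1])

# Replay "a[indices] += 1" step by step on a copy of a.
buffered = a.copy()
tmp = buffered[indices]      # gather: a new array, array([2, 2, 2, 2])
tmp += 1                     # in-place add on that copy: array([3, 3, 3, 3])
buffered[indices] = tmp      # scatter: writes 3 into position 1 four times

# Count how often each index occurs, padded to the length of a,
# then add the counts so every duplicate contributes.
counts = np.bincount(indices, minlength=len(a))
accumulated = a + counts

# Assumed alternative in newer NumPy: an unbuffered in-place add that
# applies the increment once per occurrence of each index.
unbuffered = a.copy()
np.add.at(unbuffered, indices, 1)

print(buffered)      # [1 3 3 4 5]
print(accumulated)   # [1 6 3 4 5]
print(unbuffered)    # [1 6 3 4 5]
```

If the increments are not all 1, bincount() also takes a weights
argument that sums a value per index instead of counting; note that it
then returns a floating-point array.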

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


