[Numpy-discussion] Re: Histograms via indirect index arrays

Fri Mar 17 13:32:02 EST 2006

On Friday 17 March 2006 16:04, Robert Kern wrote:
> Piotr Luszczek wrote:
> > On Friday 17 March 2006 14:58, Robert Kern wrote:
> >>Piotr Luszczek wrote:
> >>>By design numpy returns views from __getitem__
> >>
> >>Only for slices.
> >>
> >>In [132]: a = arange(10)
> >>
> >>In [133]: idx = [2,2,3]
> >>
> >>In [134]: a[idx]
> >>Out[134]: array([2, 2, 3])
> >>
> >>In [135]: b = a[idx]
> >>
> >>In [136]: b[-1] = 100
> >>
> >>In [137]: a
> >>Out[137]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
> >
> > Your example uses lists as indices. This is not interesting.
> > I'm talking solely about arrays indexing other arrays.
> > To me it is a special and very important case.
>
> The result is exactly the same.
>
> In [164]: a = arange(10)
>
> In [165]: idx = array([2,2,3])
>
> In [166]: b = a[idx]
>
> In [167]: b[-1] = 100
>
> In [168]: a
> Out[168]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>
> >>>In this case, it would be view into 'self' and 'idx' so the
> >>>__iadd__ would just use the 'idx' directly rather than a copy.
> >>>Finally, __setitem__ doesn't do anything since 'self' and 'value'
> >>>will be the same.
> >>
> >>No, value is the result of __iadd__ on the temporary array.
> >>
> >>'g[idx] += 1' expands to:
> >>
> >>  tmp = g.__getitem__(idx)
> >>  val = tmp.__iadd__(1)
> >>  g.__setitem__(idx, val)
> >
> > You're missing the point. 'tmp' can be of a very specific type
> > so that 'g.__setitem__' doesn't have to do anything: the 'add 1'
> > was done by '__iadd__'.
>
> No, I got your point just fine; I was correcting a detail.
>
> You would have to reimplement __getitem__ to return a new kind of
> object that represents a non-uniformly-strided array. If you want to
> get anywhere, go implement that object and come back. When we have
> something concrete to look at instead of vague assertions, then we
> can start tackling the issues of integrating it into the core such
> that 'g[idx] += 1' works like you want it to. For example, index
> arrays are used in more places than in-place addition. Your new type
> needs to be usable in all of those places since __getitem__, __iadd__
> and __setitem__ don't know that they are being called in that order
> and in that fashion.

This is a tough requirement but a perfectly reasonable one. So when my
day job lets me off the hook I'll give it a try.
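
For readers following the thread, here is a minimal sketch of the kind of
object being discussed, written for Python 3 and a present-day NumPy. The
names IndexedView and Accumulating are hypothetical and are not part of
NumPy. np.add.at is a real ufunc method, but it was added to NumPy well
after this thread. The sketch covers only in-place addition, which is
exactly the limitation Robert raises above: a real implementation would
have to behave sensibly everywhere index arrays are used.

```python
import numpy as np


class IndexedView:
    """Hypothetical stand-in for a 'view' of an array through an index array."""

    def __init__(self, base, idx):
        self.base = base   # the array being indexed (not copied)
        self.idx = idx     # the index array, possibly with repeats

    def __iadd__(self, value):
        # Unbuffered addition: an index that occurs k times is incremented k times.
        np.add.at(self.base, self.idx, value)
        return self


class Accumulating:
    """Hypothetical wrapper whose __getitem__ returns an IndexedView."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __getitem__(self, idx):
        return IndexedView(self.data, np.asarray(idx))

    def __setitem__(self, idx, value):
        # 'g[idx] += 1' ends by calling this with the object that __iadd__
        # returned. The addition has already happened, so there is nothing to do.
        if isinstance(value, IndexedView) and value.base is self.data:
            return
        self.data[idx] = value


g = Accumulating(np.zeros(4, dtype=int))
g[np.array([0, 2, 2, 1])] += 1
print(g.data)  # [1 1 2 0]
```

The wrapper is used instead of an ndarray subclass only to keep the sketch
short; it says nothing about how such a type would fit into the core.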

> >>Given these class definitions:
> >>
> >>  class A(object):
> >>      def __getitem__(self, idx):
> >>          print 'A.__getitem__(%r)' % idx
> >>          return B()
> >>      def __setitem__(self, idx, value):
> >>          print 'A.__setitem__(%r, %r)' % (idx, value)
> >>
> >>
> >>  class B(object):
> >>      def __iadd__(self, x):
> >>          print 'B.__iadd__(%r)' % x
> >>          return self
> >>      def __repr__(self):
> >>          return 'B()'
> >>
> >>In [153]: a = A()
> >>
> >>In [154]: a[[0, 2, 2, 1]] += 1
> >>A.__getitem__([0, 2, 2, 1])
> >>B.__iadd__(1)
> >>A.__setitem__([0, 2, 2, 1], B())
> >>
> >>>Of course, this is just a quick draft. I don't know how it would
> >>>work in practice and in other cases.
> >>
> >>Aye, there's the rub.
> >
> > Show me code that breaks.
>
> <shrug> Show us some code that works. I'm not interested in
> implementing your feature request. You are. There's plenty of work
> that you can do that doesn't depend on anyone else agreeing with you,
> so you can stop arguing and start coding.

Arguing is good up to a point, and I think you're right that it's time to
stop.
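
For reference, a short sketch of the behaviour this thread is about and two
ways to get a histogram from an index array with a present-day NumPy. This
is an illustration, not something proposed in the thread: np.add.at
postdates this discussion, and the array size of 4 is an arbitrary choice
for the example.

```python
import numpy as np

idx = np.array([0, 2, 2, 1])

# Fancy-index in-place add: index 2 occurs twice but is incremented once,
# because the addition is done on a temporary copy and then written back.
g = np.zeros(4, dtype=int)
g[idx] += 1

# Unbuffered in-place add: every occurrence of an index counts.
h = np.zeros(4, dtype=int)
np.add.at(h, idx, 1)

# For plain counting of non-negative integers, bincount gives the same result.
counts = np.bincount(idx, minlength=4)

print(g)       # [1 1 1 0]
print(h)       # [1 1 2 0]
print(counts)  # [1 1 2 0]
```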