[Numpy-discussion] problem with assigning to recarrays

Robert Kern robert.kern at gmail.com
Sat Feb 28 01:58:49 EST 2009


On Fri, Feb 27, 2009 at 19:06, Brian Gerke <bgerke at slac.stanford.edu> wrote:
>
> On Feb 27, 2009, at 4:30 PM, Robert Kern wrote:
>>>
>> r[where(r.field1 == 1.)] makes a copy. There is no way for us to
>> construct a view onto the original memory for this circumstance given
>> numpy's memory model.
>
> Many thanks for the quick reply.  I assume that this is true only for
> record arrays, not for ordinary arrays?  Certainly I can make an
> assignment in this way with a normal array.

Well, you are doing two very different things. Let's back up a bit.

Python gives us two hooks to modify an object in-place with an
assignment: __setitem__ and __setattr__.

  x[<item>] = y   ==>  x.__setitem__(<item>, y)
  x.<attr>  = y   ==>  x.__setattr__('<attr>', y)

Now, we don't need to restrict ourselves to just variables for 'x'; we
can have any expression that evaluates to an object.

  (<expr>)[<item>] = y  ==> (<expr>).__setitem__(<item>, y)
  (<expr>).<attr>  = y  ==> (<expr>).__setattr__('<attr>', y)

The key here is that the (<expr>) on the LHS is evaluated just like
any expression appearing anywhere else in your code. The only special
in-place behavior is restricted to the *outermost* [<item>] or
.<attr>.
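To make the rules above concrete, here is a minimal sketch that logs which
hook Python actually calls. The Traced class is a hypothetical helper written
only for this illustration; it is not part of numpy or of the original
question.

```python
class Traced:
    """Hypothetical helper: records which special methods Python calls."""

    def __init__(self, name, log):
        # Use object.__setattr__ so that setting up internal state
        # does not go through our own logging __setattr__.
        object.__setattr__(self, "_name", name)
        object.__setattr__(self, "_log", log)

    def __getitem__(self, item):
        self._log.append((self._name, "__getitem__", item))
        # Return a *new* object, the way a copying index operation would.
        return Traced(self._name + "[...]", self._log)

    def __setitem__(self, item, value):
        self._log.append((self._name, "__setitem__", item))

    def __setattr__(self, attr, value):
        self._log.append((self._name, "__setattr__", attr))


log = []
x = Traced("x", log)

x[0] = 1       # outermost [<item>]: a single __setitem__ on x
x[0].attr = 1  # x[0] is evaluated first (__getitem__), then
               # __setattr__ is called on whatever that returned

for entry in log:
    print(entry)
```

The second statement never calls __setitem__ on x at all; the assignment is
applied to the object that x[0] evaluated to.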

So when you do this:

  r[where(r.field1 == 1.)].field2 = 1.0

it translates to something like this:

  tmp = r.__getitem__(where(r.field1 == 1.0))  # Makes a copy!
  tmp.__setattr__('field2', 1.0)

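Note that the first line is a __getitem__, not a __setitem__. Only a
__setitem__ called on r itself could modify r in-place; here the
__setattr__ is applied to the temporary copy, which is then discarded.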
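The two-line translation above can be checked directly. This is only a
sketch: the small record array and its values are made up for illustration,
and it assumes a reasonably recent numpy that provides np.shares_memory.

```python
import numpy as np

# Illustrative record array; field names follow the thread, values are made up.
r = np.rec.fromarrays(
    [np.array([1.0, 2.0, 1.0]), np.zeros(3)],
    names="field1,field2",
)

idx = np.where(r.field1 == 1.0)

tmp = r[idx]        # r.__getitem__(idx): index-array indexing makes a copy
tmp.field2 = 1.0    # tmp.__setattr__('field2', 1.0): only the copy changes

print(np.shares_memory(tmp, r))  # does the temporary alias r's memory?
print(tmp.field2)                # the copy was modified
print(r.field2)                  # the original was not
```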

> Also, if it is truly impossible to change this behavior, or to have it
> raise an error--then are there any best-practice suggestions for how
> to remember and avoid running into this non-obvious behavior?  If one
> thinks of record arrays as inheriting  from numpy arrays, then this
> problem is certainly unexpected.

It's a natural consequence of the preceding rules. This is a Python
thing, not a difference between numpy arrays and record arrays. Just
keep those rules in mind.
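For comparison, a plain ndarray behaves the same way once the expression has
the same shape. A small sketch with made-up values: the direct assignment
works because it is a __setitem__ on a itself, while the chained form assigns
into a temporary copy.

```python
import numpy as np

a = np.array([1.0, 2.0, 1.0])
idx = np.where(a == 1.0)

a[idx] = 5.0       # a.__setitem__(idx, 5.0): modifies a in place
print(a)

a[idx][0] = 99.0   # a.__getitem__(idx) returns a copy; the 99.0 is
                   # written into that copy, which is then thrown away
print(a)           # unchanged by the second statement
```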

> Also, I've just found that the following syntax does do what is
> expected:
>
> (r.field2)[where(r.field1 == 1.)] = 1.
>
> It is at least a little aesthetically displeasing that the syntax
> works one way but not the other.  Perhaps my best bet is to stick with
> this syntax and forget that the other exists?  A less-than-satisfying
> solution, but workable.

If you drop the extraneous bits, it becomes a fair bit more readable:

  r.field2[r.field1 == 1] = 1

This is idiomatic; you'll see it all over the place where record
arrays are used. This form modifies r in-place because
r.__getattr__('field2') is able to return a view rather than a copy,
so the outermost __setitem__ writes straight into r's memory.
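A quick way to convince yourself, again as a sketch with a made-up record
array (and assuming a numpy recent enough to have np.shares_memory):

```python
import numpy as np

# Illustrative record array; field names follow the thread, values are made up.
r = np.rec.fromarrays(
    [np.array([1.0, 2.0, 1.0]), np.zeros(3)],
    names="field1,field2",
)

view = r.field2                   # attribute access on a field gives a view
print(np.shares_memory(view, r))  # the view aliases r's memory

r.field2[r.field1 == 1] = 1       # __setitem__ on the view writes into r
print(r.field2)
```

The boolean mask is only used inside the outermost __setitem__, so no
intermediate copy of r is ever assigned to.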

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


