[Numpy-discussion] Hierarchical vs non-hierarchical ndarray.base and __array_interface__

Sat Dec 8 06:15:06 EST 2012

On Sat, Nov 24, 2012 at 8:34 PM, Gamblin, Todd <gamblin2 at llnl.gov> wrote:

> Hi all,
>
> I posted on the change in semantics of ndarray.base here:
>
>
> https://github.com/numpy/numpy/commit/6c0ad59#commitcomment-2153047
>
> And some folks asked me to post my question to the numpy mailing list.
>  I've implemented a tool for mapping processes in parallel applications to
> nodes in cartesian networks.  It uses hierarchies of numpy arrays to
> represent the domain decomposition of the application, as well as
> corresponding groups of processes on the network.  You can "map" an
> application to the network using assignment through views.  The tool is
> here if anyone is curious:  https://github.com/tgamblin/rubik.  I used
> numpy to implement this because I wanted to be able to do mappings for
> arbitrary-dimensional networks.  Blue Gene/Q, for example, has a 5-D
> network.
>
> The reason I bring this up is because I rely on the ndarray.base pointer
> and some of the semantics in __array_interface__ to translate indices
> within my hierarchy of views.  e.g., if a value is at (0,0) in a view I
> want to know that it's actually at (4,4) in its immediate parent array.
>
> After looking over the commit I linked to above, I realized I'm actually
> relying on a lot of stuff that's not guaranteed by numpy.  I rely on .base
> pointing to its closest parent, and I rely on __array_interface__.data
> containing the address of the array's memory and its strides.  None of
> these is guaranteed by the API docs:
>
>         http://docs.scipy.org/doc/numpy/reference/arrays.interface.html
>

Are you saying that data/strides aren't guaranteed because they're marked
optional on that page? My interpretation of "optional" is that these fields
don't have to be present for all objects implementing something that
qualifies as an array interface (for example ndarrays don't need a mask),
but it does not mean that everything marked optional can be changed without
warning for ndarrays.
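
To make the "optional" point concrete, here is a small sketch of what an
ndarray exposes through __array_interface__. The array names and shapes are
only illustrative. Note that the "strides" entry is None for a C-contiguous
array, so ndarray.strides is the safer thing to read when the actual byte
strides are needed:

```python
import numpy as np

# Illustrative arrays; the names and shapes are arbitrary.
a = np.zeros((3, 4), dtype=np.float64)   # C-contiguous array
v = a[:, ::2]                            # non-contiguous view of `a`

iface = a.__array_interface__

# "data" is a 2-tuple: (integer memory address, read-only flag).
address, read_only = iface["data"]

# For a C-contiguous array the interface reports strides as None;
# ndarray.strides always gives the real byte strides.
contiguous_strides = iface["strides"]
real_strides = a.strides

# A non-contiguous view reports its strides explicitly.
view_strides = v.__array_interface__["strides"]

# Optional keys such as "mask" are simply absent for plain ndarrays.
has_mask = "mask" in iface
```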

> So I guess I have a few questions:
>
> 1. Is translating indices between base arrays and views something that
> would be useful to other people?
>
> 2. Is there some better way to do this than using ndarray.base and
> __array_interface__?
>
> 3. What's the numpy philosophy on this?  Should views know about their
> parents or not?  They obviously have to know a little bit about their
> memory, but whether or not they know how they were derived from their
> owning array is a different question.  There was some discussion on the
> vagueness of .base here:
>
>
> http://thread.gmane.org/gmane.comp.python.numeric.general/51688/focus=51703
>
> But it doesn't look like you're deprecating .base in 1.7, only changing
> its behavior, which I tend to agree is worse than deprecating it.
>
> After thinking about all this, I'm not sure what I would like to happen.
>  I can see the value of not keeping extra references around within numpy,
> and my domain is pretty different from the ways that I imagine people use
> numpy.  I wouldn't have to change my code much to make it work without
> .base, but I do rely on __array_interface__.  If that doesn't include the
> address and strides, I think I'm screwed as far as translating indices goes.
>
> Any suggestions?
>

The discussion on .base seems to have converged. As for
__array_interface__, you could write a test which captures the essence of
your use of ndarray.__array_interface__ and send a PR for it.
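
A rough sketch of what such a test could look like follows. The helper
view_to_parent_index is a hypothetical name, not something in numpy or
necessarily in rubik, and it assumes the parent is C-contiguous and that the
view really does share the parent's memory. It uses only the data address
from __array_interface__ plus ndarray.strides, and does not depend on .base:

```python
import numpy as np


def view_to_parent_index(view, parent, index):
    """Translate `index` in `view` to the matching index in `parent`.

    Hypothetical helper for illustration. Assumes `parent` is
    C-contiguous and `view` is a view onto `parent`'s memory.
    """
    view_addr = view.__array_interface__["data"][0]
    parent_addr = parent.__array_interface__["data"][0]

    # Byte offset of the requested element from the start of `parent`.
    offset = view_addr - parent_addr
    offset += sum(i * s for i, s in zip(index, view.strides))

    # Decompose the byte offset along the parent's strides, largest
    # stride first (which is the axis order for a C-contiguous array).
    result = []
    for stride in parent.strides:
        q, offset = divmod(offset, stride)
        result.append(q)
    return tuple(result)


def test_view_index_translation():
    parent = np.arange(64).reshape(8, 8)

    # Plain sub-block: (0, 0) in the view is (4, 4) in the parent.
    block = parent[4:, 4:]
    assert view_to_parent_index(block, parent, (0, 0)) == (4, 4)

    # Strided view: every translated index must address the same value.
    strided = parent[1::2, ::3]
    for idx in np.ndindex(*strided.shape):
        assert parent[view_to_parent_index(strided, parent, idx)] == strided[idx]


test_view_index_translation()
```

Views with negative strides or a non-contiguous parent would need extra
handling that this sketch leaves out.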

Ralf