[Numpy-discussion] Hierarchical vs non-hierarchical ndarray.base and __array_interface__

Sat Nov 24 14:34:18 EST 2012

Hi all,

I posted on the change in semantics of ndarray.base here:

	https://github.com/numpy/numpy/commit/6c0ad59#commitcomment-2153047

And some folks asked me to post my question to the numpy mailing list.  I've implemented a tool for mapping processes in parallel applications to nodes in Cartesian networks.  It uses hierarchies of numpy arrays to represent the domain decomposition of the application, as well as corresponding groups of processes on the network.  You can "map" an application to the network using assignment through views.  The tool is here if anyone is curious:  https://github.com/tgamblin/rubik.  I used numpy to implement this because I wanted to be able to do mappings for arbitrary-dimensional networks.  Blue Gene/Q, for example, has a 5-D network.

The reason I bring this up is that I rely on the ndarray.base pointer and some of the semantics in __array_interface__ to translate indices within my hierarchy of views.  For example, if a value is at (0,0) in a view, I want to know that it's actually at (4,4) in its immediate parent array.
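To make the translation concrete, here is a minimal sketch of one way it could be done from memory addresses and strides alone.  The helper name to_parent_index is hypothetical (it is not taken from rubik), and the sketch assumes that the view shares memory with the parent and that the parent is C-contiguous, so a byte offset maps to exactly one parent index.

```python
import numpy as np

def to_parent_index(view, parent, index):
    """Translate `index` in `view` to the matching index in `parent`.

    Hypothetical helper: assumes `view` shares memory with `parent`
    and that `parent` is C-contiguous.
    """
    if not parent.flags.c_contiguous:
        raise ValueError("this sketch only handles C-contiguous parents")

    # The 'data' entry of __array_interface__ is (address, read_only).
    view_addr = view.__array_interface__["data"][0]
    parent_addr = parent.__array_interface__["data"][0]

    # Byte offset of the element, measured from the parent's first element.
    byte_offset = (view_addr - parent_addr
                   + sum(i * s for i, s in zip(index, view.strides)))

    flat, remainder = divmod(byte_offset, parent.itemsize)
    if remainder or not 0 <= flat < parent.size:
        raise ValueError("view does not appear to share memory with parent")

    # For a C-contiguous parent, the flat element offset unravels directly.
    return tuple(int(i) for i in np.unravel_index(flat, parent.shape))

parent = np.arange(64).reshape(8, 8)
view = parent[4:, 4:]
print(to_parent_index(view, parent, (0, 0)))   # (4, 4)
```

The sketch reads strides from ndarray.strides rather than from __array_interface__, because the interface reports strides as None for C-contiguous arrays.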

After looking over the commit I linked to above, I realized I'm actually relying on a lot of stuff that's not guaranteed by numpy.  I rely on .base pointing to a view's closest parent, and I rely on __array_interface__ exposing the address of the array's memory (in its data entry) and its strides.  None of these is guaranteed by the API docs:

	http://docs.scipy.org/doc/numpy/reference/arrays.interface.html
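For reference, a small sketch of what these attributes report for a chain of views.  The array names are made up for illustration, the dtype is fixed to int64 so the byte counts are predictable, and the .base result shown is what I would expect under the non-hierarchical behavior (NumPy 1.7 and later), where .base refers to the array that owns the memory rather than to the closest parent.

```python
import numpy as np

# 'owner' allocates its own memory; the other two are views of views.
owner = np.zeros((4, 4), dtype=np.int64)
child = owner[1:, 1:]
grandchild = child[1:, 1:]

# Non-hierarchical .base: both views refer straight to the owner,
# so the intermediate 'child' cannot be recovered from 'grandchild'.
print(child.base is owner)        # True
print(grandchild.base is owner)   # True on NumPy >= 1.7

# The 'data' entry is (address, read_only); 'strides' is None when
# the array is C-contiguous and a tuple of byte strides otherwise.
owner_addr = owner.__array_interface__["data"][0]
grand_addr = grandchild.__array_interface__["data"][0]
offset = grand_addr - owner_addr

print(offset)                                   # 80 bytes, i.e. element (2, 2)
print(owner.__array_interface__["strides"])     # None
print(grandchild.__array_interface__["strides"])  # (32, 8)
```

Because the address and strides are all that remain once .base collapses to the owner, any index translation between intermediate views has to be rebuilt from those values, or the hierarchy has to be tracked outside numpy.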

So I guess I have a few questions:

1. Is translating indices between base arrays and views something that would be useful to other people?

2. Is there some better way to do this than using ndarray.base and __array_interface__?

3. What's the numpy philosophy on this?  Should views know about their parents or not?  They obviously have to know a little bit about their memory, but whether or not they know how they were derived from their owning array is a different question.  There was some discussion on the vagueness of .base here:

	http://thread.gmane.org/gmane.comp.python.numeric.general/51688/focus=51703

But it doesn't look like you're deprecating .base in 1.7, only changing its behavior, which I tend to agree is worse than deprecating it.

After thinking about all this, I'm not sure what I would like to happen.  I can see the value of not keeping extra references around within numpy, and my domain is pretty different from the ways that I imagine people use numpy.  I wouldn't have to change my code much to make it work without .base, but I do rely on __array_interface__.  If that doesn't include the address and strides, I think I'm screwed as far as translating indices goes.

Any suggestions?

Thanks!
-Todd

______________________________________________________________________
Todd Gamblin, tgamblin at llnl.gov, http://people.llnl.gov/gamblin2
CASC @ Lawrence Livermore National Laboratory, Livermore, CA, USA