[pypy-dev] rpython numpy

Tue Sep 25 18:38:00 CEST 2007

I've been having some fun working on a (tiny) rpython version of numpy.

At the heart of numpy is a multi-dimensional array type, ndarray.
For example it (the cpython version) can handle processing image arrays like so:

>>> import numpy
>>> from PIL import Image
>>> im=Image.open('jit.png')
>>> data=numpy.array(im)

now data is a 3 dimensional array, data.ndim==3 and data.shape==(1158, 1173, 3).
The last dimension is the color dimension.

The cool part of numpy is all the indexing tricks.
This is a single pixel:

>>> data[0,0,:]
array([224, 255, 224], dtype=uint8)

this is the red channel:

>>> data[:,:,0]
array([[224, 224, 224, ..., 224, 224, 224],
...

now we mix red and blue:

>>> data[:,:,0]+data[:,:,2]

(this actually didn't work so well because it's an array of uint8
and we get overflow, but that can be fixed with a cast)

The trick with numpy is, every time we use an indexing operation
it creates a "view" onto the original data, ie. the memory is
aliased. Each ndarray stores a list of "strides" (integers) which specify
how many bytes to skip when incrementing an index in a dimension:

>>> data.strides
(3519, 3, 1)

>>> data[:,:,0].strides
(3519, 3)

Then numpy (at the c level) uses iterators to perform getattr, setattr, and other
(pointwise) operations.

"broadcasting" allows a smaller array to be used as a larger array, as long
as the shape of the smaller array is a suffix of the shape of the larger.

>>> green = numpy.array([0,255,0], dtype=numpy.uint8)
>>> data[:] = green
This sets every pixel to green.

I made a design choice with rnumpy to specialize on the number of
dimensions (ndim) of an array.
This means that we can generate efficient code (more about this below);
which was just too tempting to ignore. But also,

* it would be impossible to access arbitrary ndim arrays from
  a pypy wrapped rnumpy. 
* the shape attribute is read-only. To change the shape of an array,
  use the reshape method (this does not copy the data):

>>> numpy.array(range(12)).reshape((3,4))
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

The above mentioned "efficient code" is perhaps a non-issue. Numpy
code tends to become memory-bound, not cpu-bound. However it is
an interesting use of the rpython optimiser. In general, to iterate
over an n-dimensional array requires n-nested for loops, or the
use of a "coordinate" array. In this implementation (as in the
c implementation) there is an iterator object to hold the coordinates
(and the corresponding strides). With malloc removal, all this iterator
data become local variables. It's nifty.

I don't really know where this is all going; an obvious goal
would be to extend all this with a variable dimension array type
(and write an applevel numpy). One of the original ideas was
to somehow produce cache friendly code by doing lazy operations.

Another issue is that it has been mind-boggleingly difficult to directly
extend the compiler. I would like to rethink this whole approach;
perhaps special methods can help, or some other "low-level to high-level"
interaction.

Simon.