[Numpy-discussion] The array interface published

Scott Gilbert xscottg at yahoo.com
Mon Apr 4 19:09:30 EDT 2005

--- Michiel Jan Laurens de Hoon <mdehoon at ims.u-tokyo.ac.jp> wrote:
> I'm not sure what you mean by "the array interface could become
> part of the Python standard as early as Python 2.5", since there
> is nothing to install. Or does this mean that Python's array will
> conform to the array interface?

It would be nice to have the Python array module support the protocol for
the 1-Dimensional arrays that it implements.  It would also be nice to add
a *simple* ndarray object in the core that supports multi-dimensional
arrays.  I think breaking backward compatibility of the existing Python
array module to support multiple dimensions would be a mistake and unlikely
to get accepted.
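
As a rough illustration of what protocol support for the existing one-dimensional array type might look like, here is a minimal sketch that wraps array.array rather than changing it. The ProtocolArray class and the _KIND table are hypothetical, the kind characters ('i', 'u', 'f') are an assumption about the typestr codes, and only two of the protocol attributes are shown.

```python
import array
import sys

# Assumed mapping from array-module typecodes to protocol kind characters:
# 'i' signed integer, 'u' unsigned integer, 'f' floating point.
_KIND = {'b': 'i', 'h': 'i', 'i': 'i', 'l': 'i',
         'B': 'u', 'H': 'u', 'I': 'u', 'L': 'u',
         'f': 'f', 'd': 'f'}


class ProtocolArray(object):
    """Hypothetical wrapper exposing part of the protocol for array.array."""

    def __init__(self, arr):
        self._arr = arr

    @property
    def __array_shape__(self):
        # A 1-D array has a single dimension: its length.
        return (len(self._arr),)

    @property
    def __array_typestr__(self):
        # Endian marker, then kind character, then item size in bytes.
        endian = '<' if sys.byteorder == 'little' else '>'
        return endian + _KIND[self._arr.typecode] + str(self._arr.itemsize)
```

Wrapping leaves the array module's current behavior untouched, which is the backward-compatibility concern raised above.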

A PEP would likely be required to make the changes to the array module and
to add an ndarray module, and that PEP would likely document the interface.
In that regard, it could "make it into the core" for Python 2.5.

But you're right that external packages could support this interface today.
There is nothing to install...

> 1) The "__array_shape__" method is identical to the existing "shape"
> method in Numerical Python and numarray (except that "shape" does a
> little bit better checking, but it can be added easily 
> to "__array_shape__"). To avoid code duplication, it might be better
> to keep that method. (and rename the other methods for consistency,
> if desired).

The intent is that all array packages would have the required/optional
protocol attributes.  Of course at a higher level, this information will
probably be presented to the users, but they might choose a different
mechanism.

So while A.__array_shape__ always returns a tuple of longs, A.shape is free
to return a ShapeObject or be an assignable attribute that changes the
shape of the object.  With the property mechanism, there is no need to
store duplicated data (__array_shape__ can be a property method that
returns a dynamically generated tuple).
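
To make the property idea concrete, here is a minimal sketch, assuming a hypothetical SimpleArray class that keeps its dimensions in one private list. Only __array_shape__ and shape come from the discussion; every other name is made up for illustration.

```python
class SimpleArray(object):
    """Hypothetical array type; only the dimensions are modeled here."""

    def __init__(self, dims):
        self._dims = list(dims)   # single source of truth for the dimensions

    @property
    def __array_shape__(self):
        # Low-level protocol view: always a freshly built plain tuple.
        return tuple(self._dims)

    @property
    def shape(self):
        # High-level view: a package could return any object it likes here.
        return list(self._dims)

    @shape.setter
    def shape(self, new_dims):
        # Assigning to .shape reshapes; the element count must stay the same.
        new_dims = list(new_dims)
        if _count(new_dims) != _count(self._dims):
            raise ValueError("total size must not change")
        self._dims = new_dims


def _count(dims):
    """Number of elements implied by a list of dimensions."""
    n = 1
    for d in dims:
        n *= d
    return n
```

After A = SimpleArray((2, 3)) and A.shape = (3, 2), A.__array_shape__ is the tuple (3, 2), and nothing is stored twice.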

Separating the low level description of the array data in memory from the
high level interface that particular packages like scipy.base or numarray
present to their users is a good thing.

> 3) Where do default values come from? Is it the responsibility of the
> extension module writer to find out if the array module implements e.g.
> __array_strides__, and substitute the default values if it doesn't? If
> so, I have a slight preference to make all methods required, since it's
> not a big effort to return the defaults, and there will be more extension
> modules than array packages (or so I hope).

If we can get a *simple* package into the core, in addition to implementing
an ndarray object, this module could have helper functions that do this
sort of thing.  For instance:

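    def get_strides(A):
        if hasattr(A, "__array_strides__"):
            return A.__array_strides__
        # default: C-contiguous strides, built from the last axis backward
        shape = A.__array_shape__
        size = get_itemsize(A)
        strides = []
        for i in range(len(shape)-1, -1, -1):
            strides.insert(0, size)
            size *= shape[i]
        return tuple(strides)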

    def get_itemsize(A):
        typestr = A.__array_typestr__
        # skip the endian
        if typestr[0] in '<>':
            typestr = typestr[1:]
        # skip the char code
        typestr = typestr[1:]
        # what remains is the item size in bytes
        return int(typestr)

    def is_contiguous(A):
        # etc....

Those are probably buggy and need work, but you get the idea...  A C
implementation of the above would be easy to do and useful, and it could be
done inline in a single include file (no linking headaches).
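
The is_contiguous helper left as "etc." above could look something like the following. This is only a sketch under stated assumptions: "contiguous" is taken to mean C order (last axis varies fastest), the item size is passed in directly so the block stands alone, and default_strides and Demo are hypothetical names used just for this example.

```python
def default_strides(shape, itemsize):
    """C-order byte strides for the given shape (last axis varies fastest)."""
    strides = []
    size = itemsize
    for dim in reversed(shape):
        strides.insert(0, size)
        size *= dim
    return tuple(strides)


def is_contiguous(A, itemsize):
    """True if A has no __array_strides__ or they equal the C-order default."""
    if not hasattr(A, "__array_strides__"):
        return True
    expected = default_strides(A.__array_shape__, itemsize)
    return tuple(A.__array_strides__) == expected


class Demo(object):
    """Hypothetical array-like object used only to exercise the helper."""

    def __init__(self, shape, strides=None):
        self.__array_shape__ = shape
        if strides is not None:
            self.__array_strides__ = strides
```

The comparison is deliberately strict: an array with a dimension of length 1 and an unusual stride on that axis would be reported as non-contiguous even though its memory layout is dense.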

