[Python-Dev] Expose the array interface in Python 2.5?

Chris Barker Chris.Barker at noaa.gov
Mon Mar 20 23:52:29 CET 2006


On 3/17/06, Thomas Heller <theller <at> python.net> wrote:
 > Accessing Python arrays (Numeric arrays, Numeric array, or Numpy
 > array) as ctypes arrays, and vice versa, without copying any memory,
 > would be a good thing.

This does bring up a point. I was thinking that a really bare-bones 
nd-array type would be very useful by itself as a way to exchange data. 
However, if ctypes were to return nd-arrays from C function calls, then 
we would want at least a rudimentary way of directly accessing the data. 
I'd sure like to see indexing and slicing, at least. Also, if ctypes 
were to allow nd-arrays as a way to pass data into a C function, then 
you'd need a built-in Python way to create the data in the first place.

If that's too much for now, I'd still like to have the basic data 
structure defined in the standard library (or built-in) as soon as possible.
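
As a rough illustration of the zero-copy exchange discussed above, here 
is a minimal sketch that uses only what present-day Python ships: the 
array module, ctypes, and memoryview. These are stand-ins chosen for 
illustration, not the nd-array type proposed here, and the names values, 
c_values, and grid are made up for the example.

```python
import array
import ctypes

# A flat block of six C doubles owned by a standard-library array.
values = array.array("d", [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# View the same memory as a ctypes array; from_buffer does not copy.
c_values = (ctypes.c_double * len(values)).from_buffer(values)
c_values[0] = 10.0          # write through the ctypes view

# View the same memory as a 2 x 3 grid. memoryview only changes shape
# by going through a byte view, hence the two-step cast.
grid = memoryview(values).cast("B").cast("d", shape=[2, 3])

print(values[0])            # the write is visible in the original array
print(grid[1, 2])           # element at row 1, column 2
print(grid.tolist())        # nested-list copy of the 2 x 3 view
```

The grid here supports element access, but as far as I know memoryview 
does not implement n-d slicing, which is the kind of gap a richer 
standard type would fill.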

Greg Ewing wrote:

 > It might be all right for writers of new
 > extensions, but there are existing modules (PIL, ctypes, etc.)
 > that already have their own way of storing data, and it
 > seems to me it would be easier for the maintainers of
 > those modules to add a new interface to the existing
 > data than to rearrange their internal structure to use
 > this new C-object.

Can we have both? A defined interface, that existing code can be adapted 
to provide, and a new C-Object, that future code can just use. If the 
goal is to have as many extension types as possible use the same base 
object, the sooner a standard object is provided the better.

There are those of us in the scientific computing community that would 
love to just have numpy be part of the standard library. It really is 
big, and maybe too special purpose for that, but at least having the 
core array in there would be great.

For years, I've been dealing with modules like wxPython that don't 
understand numpy arrays. As a result, passing in a list of points to be 
drawn is actually faster than passing in a numpy array of points, even 
though the numpy array of points has the data in the exact binary 
representation that wxWidgets expects. The problem is that the wrapper 
code doesn't know that, because Robin (quite reasonably) didn't want to 
have wxPython depend on Numeric. While a standard interface could 
support this, it would be great if wxPython could also do things like 
pass an image buffer back to Python efficiently.

Another point is that n-dimensional arrays really are very useful for 
all sorts of stuff that has nothing to do with high-performance numeric 
computing. I find myself using numpy for almost every little script I 
write, even though most of it is not performance bound at all. I 
suspect that if we get an n-dimensional array type into Python, one that 
allows n-d slicing, it will see a LOT of use by people who now think 
they have no use for numpy. My guess is that a 2-dimensional object 
array would get the most use, but why not support n-d while you're at 
it? Having an easy and natural way to store and manipulate a "table" of 
objects, for instance, would be handy for many, many users.
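
To make the "table of objects" idea concrete, here is a minimal 
pure-Python sketch of a fixed-size 2-D container with (row, column) 
indexing and slicing. The class name Table and its behavior are 
assumptions for illustration only; it is not numpy's object array, and 
unlike numpy its slices return nested-list copies rather than views.

```python
class Table:
    """Hypothetical fixed-size 2-D container of arbitrary Python objects."""

    def __init__(self, rows, cols, fill=None):
        self.shape = (rows, cols)
        self._cells = [fill] * (rows * cols)   # row-major storage

    @staticmethod
    def _span(index, length):
        """Return (list of positions, True if index was a slice)."""
        if isinstance(index, slice):
            return list(range(*index.indices(length))), True
        if not -length <= index < length:
            raise IndexError("index out of range")
        return [index % length], False

    def __getitem__(self, key):
        row, col = key
        rows, row_sliced = self._span(row, self.shape[0])
        cols, col_sliced = self._span(col, self.shape[1])
        ncols = self.shape[1]
        if not (row_sliced or col_sliced):
            return self._cells[rows[0] * ncols + cols[0]]
        # Any slice yields a nested list (a copy, not a view).
        return [[self._cells[r * ncols + c] for c in cols] for r in rows]

    def __setitem__(self, key, value):
        row, col = key
        rows, _ = self._span(row, self.shape[0])
        cols, _ = self._span(col, self.shape[1])
        for r in rows:
            for c in cols:
                # The single value is written to every selected cell.
                self._cells[r * self.shape[1] + c] = value


t = Table(3, 2, fill="")
t[0, 0] = "name"
t[0, 1] = "age"
t[1:, 1] = 0             # fill the age column below the header
print(t[0, :])           # header row
print(t[:, 1])           # age column
```

The size is fixed at construction, so growing the table means building 
a new one and copying the cells across.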

I'm still a tiny bit confused about the proposed individual pieces 
involved, but I'd like to see, in order of priority (and difficulty):

1) A standard n-d array interface (numpy already defines this, but 
outside of the numpy community, who knows about it?)

2) A standard, n-d array C-Object, inheritable for use by various 
extension packages.

3) A standard n-d array object, with a python interface for indexing and 
slicing for access and assignment (modeled after, or better yet taken 
directly from, numpy).

4) A standard n-d array object that supports array-wise arithmetic and 
functions (ufuncs, in numpy parlance).
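
For item 1, the interface numpy documents today is an attribute named 
__array_interface__ that returns a dictionary describing a block of 
memory. Here is a minimal sketch of an object exposing it without 
depending on numpy. The PointBuffer class is hypothetical, and only a 
few of the keys (version, shape, typestr, data) are filled in.

```python
import ctypes
import sys


class PointBuffer:
    """Hypothetical extension-style object holding n (x, y) float64 pairs."""

    def __init__(self, n):
        self._n = n
        self._buf = (ctypes.c_double * (2 * n))()   # zero-initialised C memory

    @property
    def __array_interface__(self):
        byte_order = "<" if sys.byteorder == "little" else ">"
        return {
            "version": 3,
            "shape": (self._n, 2),
            "typestr": byte_order + "f8",           # 8-byte float
            # (address of the first byte, read-only flag)
            "data": (ctypes.addressof(self._buf), False),
        }


points = PointBuffer(4)
interface = points.__array_interface__
print(interface["shape"], interface["typestr"])
```

A consumer that understands the interface, numpy itself for example, 
should then be able to wrap the memory without copying, although that 
step is not exercised here. The point is that the producer needs no 
import of the consumer's package.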

There is no reason it has to have anything to do with the current array 
module. For instance, having the arrays be of fixed size is just fine. 
It's really not that big a deal to make a new one when you need to 
change the size; after all, we live with that for immutable objects 
like strings.

just my $0.02

-Chris



