[Numpy-discussion] Thoughts on an ndarray super-class

Wed Feb 22 23:46:04 EST 2006

Sasha wrote:

>On 2/23/06, Travis Oliphant <oliphant.travis at ieee.org> wrote:
>
>>...
>>I have been thinking, however, of replacing it with a super-class that
>>does not define the dimensions or strides.
>>
>Having a simple 1-d array in numpy would be great.  In an ideal world
>I would rather see a 1-d array implemented in C together with a set of
>array operations that is rich enough to allow trivial implementation
>of ndarray in pure python.
>
You do realize that this is essentially numarray, right?  And your dream 
of *rich enough* 1-d operations to allow *trivial* implementation may be 
a bit far-fetched, but I'm all for dreaming.

>When you say "does not define the dimensions or strides", do you refer
>to the Python interface or to the C struct?  I thought Python did not
>allow adding data members to object structs in subclasses.
>
The C-struct.  

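Yes, you can add data members to object structs in subclasses.  Every 
single Python object does it.  The standard Python object struct defines 
only PyObject_HEAD (or PyObject_VAR_HEAD for variable-size objects); every 
type written in C appends its own fields after that header.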

This is actually the essence of inheritance in C, and it is why 
subclasses written in C must have compatible memory layouts.  The first 
part of the subclass's C structure must be identical to the base type's 
structure, but you can append as many fields after it as you want.

It all comes down to:  Can I cast to the base-type C-struct and have 
everything still work out when I dereference a particular field? 

This will be true if PyArrayObject is

typedef struct {
    PyBaseArrayObject base;   /* base-type struct must come first */
    int nd;
    intp *dimensions;
    intp *strides;
} PyArrayObject;

I suppose we could change the

int nd

to

intp nd

and place it in the PyBaseArrayObject, where it would be used as a length.

But, I don't really like that...
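
To make the layout argument concrete, here is a minimal standalone sketch that does not use the Python headers at all.  The names BaseArray, NdArray, base_get_data and nd_get_data are hypothetical, and the refcnt field is only a stand-in for whatever PyObject_HEAD would supply.  It shows that code written against the base struct keeps working when handed a pointer to the extended struct.

```c
#include <stddef.h>

typedef ptrdiff_t intp;  /* stand-in for a pointer-sized integer type */

/* Hypothetical base type: a bare block of memory, no shape information. */
typedef struct {
    long  refcnt;  /* stand-in for the fields PyObject_HEAD would supply */
    char *data;    /* pointer to the first byte of the block */
} BaseArray;

/* Hypothetical subclass: the base struct comes first, new fields follow. */
typedef struct {
    BaseArray base;  /* must be the first member */
    int   nd;
    intp *dimensions;
    intp *strides;
} NdArray;

/* Code written against the base type only touches base fields. */
char *base_get_data(BaseArray *self)
{
    return self->data;
}

/* A pointer to a struct, suitably converted, points to its first member,
 * so the cast below is well defined and the base fields line up. */
char *nd_get_data(NdArray *arr)
{
    return base_get_data((BaseArray *)arr);
}
```

The cast only works because the base struct is the first member; moving nd ahead of it would break every function that dereferences a BaseArray pointer.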

>>In other words, the default array would be just a block of memory.  The
>>standard array would inherit from the default and add dimension and
>>strides pointers.
>>
>If python lets you do it, how will that block of memory know its size?
>
It won't, of course, unless you add an additional size field.  
Thus, I'm not really sure whether it's a good idea or not.  I don't 
like the idea of adding more and more fields to the basic C-struct that 
has been around for 10 years unless we have a good reason.  The other 
issue is that the data pointer doesn't always refer to memory that the 
ndarray has allocated, so it's actually incorrect to think of the 
ndarray as both the block of memory and the dimensioned indexing. 
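
As an illustration of why the size question comes up, the byte count of an array is normally derived from its shape.  The sketch below assumes a contiguous layout, and the helper name nbytes is made up for this example (it is not an existing numpy function).  A base struct without nd and dimensions has nothing to feed into such a computation, which is why it would need its own size field.

```c
#include <stddef.h>

typedef ptrdiff_t intp;  /* stand-in for a pointer-sized integer type */

/* Bytes spanned by a contiguous array: itemsize times the product of the
 * dimensions.  With nd == 0 the product is empty and the result is just
 * itemsize; any zero-length dimension gives zero bytes. */
size_t nbytes(int nd, const intp *dimensions, size_t itemsize)
{
    size_t n = itemsize;
    for (int i = 0; i < nd; i++) {
        n *= (size_t)dimensions[i];
    }
    return n;
}
```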

The memory pointer is just that (a memory pointer).  We are currently 
allowing ndarrays to create their own memory, but that could easily 
change so that they always use some other object to allocate memory.
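
A rough sketch of the distinction, with a hypothetical Block struct and an owns_data flag invented for the example: the same data pointer may refer to memory the object allocated itself or to memory that belongs to something else, and only the first case may be freed when the object goes away.

```c
#include <stdlib.h>

/* Hypothetical holder for a data pointer plus an ownership marker. */
typedef struct {
    char *data;       /* where the elements start */
    int   owns_data;  /* nonzero only if this struct allocated data itself */
} Block;

/* Allocate fresh memory; the block owns it.  Returns 0 on success. */
int block_alloc(Block *b, size_t nbytes)
{
    b->data = malloc(nbytes);
    b->owns_data = (b->data != NULL);
    return b->data != NULL ? 0 : -1;
}

/* Point at memory that some other object allocated; never owned here. */
void block_wrap(Block *b, char *external)
{
    b->data = external;
    b->owns_data = 0;
}

/* Free the memory only if this block owns it.
 * Returns 1 if memory was freed, 0 otherwise. */
int block_release(Block *b)
{
    int freed = 0;
    if (b->owns_data) {
        free(b->data);
        freed = 1;
    }
    b->data = NULL;
    b->owns_data = 0;
    return freed;
}
```

If allocation always went through some other object, block_alloc would disappear and every block would look like the block_wrap case.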

In short, I don't see how to really do it so that the base object is 
actually usable.