[Numpy-discussion] NumPy2 design

Aug. 22, 2000

      As a followup to my previous post here is a discussion and overview of the
current NumPy2 design as I have it in my head and partially implemented in
the numpy2 module on the CVS tree at numpy's sourceforge site.

The design of NumPy2 is quite simple and tries to balance speed with
flexibility and modifiability.  An outline of the design follows (the
names can change, they are only reference at the moment)

Three classes replacing the current C structures:

ArrayType  --- replaces PyArray_Descr  # not implemented yet
Ufunc      --- replaces PyUfuncObject 
NDArray    --- replaces PyArrayObject

The purpose of each class is to encapsulate interfaces to allow code
re-use for similar operations.

NDArray       # This is implemented except for the 
              # operations (ufuncs)
========================

The most concrete class is the NDArray class --- it just needs coding
to make it happen.  The other classes still need some design work to
efficiently handle mixed type operations and additions of new types to the
system.  

The NDArray base class gives an N-Dimensional array interpretation to a
Python buffer (a segment of memory, an m-mapped file, a PIL image, etc.).
It provides this interpretation with three special attributes:

self.rank      --- the dimension of the array (hard-coded changeable limit 
                   of 10).
self._data     --- a buffer object pointing to the data
self._structure --- a buffer object pointing to an array of INTEGERS which
                    holds the dimensions and strides information
                    (INTEGER) is a platform-dependent type #defined in
                    compiled code
self._descr    --- Python class describing the type.

To interact seamlessly with the C-API and be recognized as "an array" all
subclasses must either export an __array__ method which creates a suitable
NDArray or not interfere with these provided  attributes.

Note that the same data segment can be viewed in several different ways. 

The NDArray will have default implementations for the numeric operations
that will resemble the current implementation.  But, it will be easy to
subclass the array to handle these operations as you'd like without losing
the ability to use the data in that array in extension modules which
assume array inputs.  

Two other attributes are worth mentioning:

self.CONTIGUOUS  # this can be determined from the _structure information
                 #  but it is useful to keep a flag around indicating the 
	         #  status.   This tells you whether or not you can
	         #  walk through the entire array an element at a
                 #  time with a single for loop.

self.FORTRANVIEW # This basically indicates how the array will view it's 
	         # shape when asked and indexed (This does not change the 
	         # _structure information).  An array of "shape"
                 # (10,3,5,7) when
		 # FORTRANVIEW is 0 will be an array of "shape" (7,5,3,10)
		 # when FORTRANVIEW is 1

ArrayType:  # This is not implemented yet.
===============================

This class is to replace the PyArray_Descr structure in current Numerical
Python.  As a result, it must contain the information:

self.name    ---- some kind of object to identify it (a string)
self.elsize  ---- size of an item of this kind.
self.cast    ---- a dictionary of compiled functions with at least one
                  entry called to cast this type to at least one other
                  type
self.getitem ---- (Compiled) function
self.setitem ---- (Compiled) function

self.zeros   ---- needed for the zeros command to include arrays of Python 
	          Objects.  It points to the representation of zero for
                  this type.

Currently, the above is not implemented, yet.  What is implemented is a
module _arraytypes which exposes to Python the PyArray_Descr structure so
that it can be used.  The idea of adding new types to Numerical Python
without having to change all of the code is appealing to me, however.

Ufunc:      # This is partly implemented
==========================================
There are two ideas here.  I've partly implemented the first one which
I'll explain. The second was presented to me by Paul Barrett.

Ufunc's are encapsulations of the N-D looping construct and the
broadcasting rules of Numerical Python.  The N-D looping construct is
limited to the fixed but arbitrary 10 dimensions as given above for C
code but can be arbitrary if a Python function is called at each
iteration.  

I explain Ufuncs in a piece called Ufuncexplain.txt which is on the CVS
tree.  Here is a quote that explains broadcasting rules:

	1) If input arrays do not have the same rank.  The array with
           lower rank will be prepended with ones until ranks agree. 

	2) If an input array has length one, then "duplicate" the
           elements along that dimension so that input shapes agree.

	Example:

            A is an array of shape (10,)
	    B is an array of shape (3,10)

	    A * B will return an array of shape (3,10):
               - A is interpreted as shape (1,10)
               - the columns of A are "broadcast" across the rows of B

	    Thus the output is (3,10):
	         [ A*B[0]
		   A*B[1]
		   A*B[2]
		 ]

        Note that A is not actually extended to a (3,10) array.  It
        merely behaves as if it had been.

	The element-wise math operators are implemented using Ufuncs.

SpecialFuncs is a Python package at http://oliphant.netpedia.net that
impelements a whole range of special functions using the Ufunc formalism.

I also include in that package a general arraymap function which can turn
any Python function into a broadcasting "ufunc-mimic"  This code does not
have the rank 10 limit on the number of dimensions on the inputs -- but it
might be slower than the current implementation.

My current implementation assumes that the Ufunc instantiator will provide
two functions:  a select function and a compute function (either of these
can be in C or Python).  The select function and the compute function work
together.  The select function determines the type of the outputs based on
the input types, while the compute function takes the inputs and outputs 
(and their types) and computes the ouptut.  This is done on either an
entire block of memory (optimized ufuncs) or one-element-at-a-time
(unoptimized ufuncs---Python coded ufuncs for example). 

This allows for efficient coding and the possibility of mixed type
arithmetic with a more complicated creation process.  It also may be hard
to add new types and have them function as you'd like without modifying
others already-defined ufuncs.  But, I know this idea will work and I can
see my way through it.  I've already implemented an "addition" function
using this method.

Another idea that has been presented is to instantiate a Ufunc with only
one function that is entered into a "dispatch table" or dictionary of
functions keyed by the ArrayType class much like the Multimethod approach
that has been discussed.  I like this idea, but I do not see the details
(I haven't thought about it too much) and do not know if we can actually
make it work --- Frankly, I think we can and it will result in a better
system.  

What hasn't been thought through is exactly what is entered into
the "function" dictionary and when is it called.  Some have suggested that
it is called "immediately" upon ufunc call, but this would eliminate the
benefits of the encapsulated broadcasting rules.  An alternative would be
to call it after the "broadcasting rule encapsulation" as been done.  In
other words use it as a replacement for the current array of functions in
the Ufunc implementation.  With the appropriate mix of C-modules and
Python code I think this could be done quite elegantly.  

The other issue that has to be worked out (again) is that obviously this
table will not be filled out in every case (for a function with 10
inputs and 10 outputs with 16 types we are talking 16^20 different
entries in the table and will be sparse) so what is done when there is not 
an entry for a particular combination will require some thought.  I think
we could allow multiple behaviors according to some attribute of the ufunc
(casting, exception raising, etc.,) is set.  Many people might fear this
would inhibit code re-use but I have not seen convincing examples.

So, that is a brief overview of the state of things.  It doesn't try to
cover everything, but it should give you enough of a perspective to
understand the code that is on the CVS tree under module numpy2.  I have
been using C for the compiled code because it is easy to interface to and
it has the widest platform support and because Python istelf is written in
C.

Anybody with specific questions (including offers to help) can feel free
to contact me or post to this list.

Thanks,

Travis Oliphant