[PYTHON MATRIX-SIG] Final matrix object renaming and packaging

James Hugunin jjh@Goldilocks.LCS.MIT.EDU
Tue, 16 Jan 96 14:07:25 EST


   From: "Jim Fulton, U.S. Geological Survey" <jfulton@usgs.gov>

   > I guess we run on faster networks at MIT, I never bother with dynamic
   > linking, and find that 4MB binaries launch as fast as I wish.

   Just out of curiosity, how fast is that?  What do you get from:

     time python -c ''

   On some of our slower systems, this takes about a second, even with
   almost all modules dynamically loaded.  This is much slower than I'd
   like it to be, since Python is often used here for very small scripts
   in which this startup time is a significant part of the overall
   execution time.  Starting up another version of the interpreter with
   Tk linked in takes nearly twice as long.

~:jjh@baribal: ls -l ~/PythonSun4/python
-rwxrwxr-x  1 jjh       3227648 Jan 16 11:09 /usr/users/jjh/PythonSun4/python*
~:jjh@baribal: time ~/PythonSun4/python -c ''
0.060u 0.060s 0:00.14 85.7% 0+234k 0+0io 0pf+0w


If you're curious about the details:

All our active files are stored on a Pentium box running a
high-performance NFS server kernel (no real operating system).  This
box links into the network through a 100MB/s link to a switched hub.

   > I'd be
   > more than happy to have somebody else implement the CObject interface,
   > but I just don't see that the time it would take (including getting
   > myself up to speed on the vagaries of dynamic linking) is worthwhile.  If you
   > (Jim Fulton) want to try to add this interface, I'd be happy to help.

   I'll take a crack at this when you release 0.3.

Great!

   >    >
   >    > Use "PyArray_" as the name of the Matrix Object.  This is a simple
   >    > renaming of the existing "PyMatrix_".
   >    >
   >    > Use "array(sequence, typecode='d')" as the default
   >    > constructor for this new C type.
   >
   >    Is this a replacement for the existing array type?
   >
   > I decided it was not worth the huge set of compromises that would have
   > been necessary to make the matrix/array object truly compatible with
   > the existing array object.  Still, the right name for this object
   > really is array.
   >
   > There's no problem with using PyArray as the C name for the object
   > because the existing array object does not export an interface.  Also,
   > there's no problem with using the name "array" as a constructor
   > because avoiding these sorts of naming conflicts is why python has a
   > module system in the first place.  No existing code that imports
   > "arraymodule" will be broken, but hopefully people in the future will
   > start using the new multiarraymodule for the same tasks.

   But existing imports of array on some systems may be broken.  Even though the
   array module is stored in arraymodule._, it is imported with "import array".

You're talking about something else here.  In my setup you'd do something like:

from multiarray import array

this would not conflict in any way with the existing arraymodule.  Now
arguments about Array.py on systems with case-insensitive filesystems
are a different matter.
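
A minimal Python sketch of the namespacing point above.  The "multiarray"
module here is a hypothetical stand-in built with types.ModuleType (the
proposed module would be a C extension), and _new_array is an invented
placeholder constructor, not the real one:

```python
import array   # the existing array module, imported as before
import types

# Hypothetical stand-in for the proposed "multiarray" module.
multiarray = types.ModuleType("multiarray")


def _new_array(sequence, typecode="d"):
    """Placeholder constructor: returns a tagged tuple, not a real array."""
    return ("multiarray", typecode, list(sequence))


multiarray.array = _new_array

old = array.array("d", [1.0, 2.0])    # existing module, unaffected
new = multiarray.array([1.0, 2.0])    # same short name, different module

print(type(old).__name__, new[0])
```

Each module keeps its own "array" name, so code that says "import array"
keeps working.  A script that does "from multiarray import array" only
rebinds the name in its own namespace.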

   >    > In order to support these python objects (and others like them), two
   >    > special data members will be added, "__array__", and "__object__".  If
   >    > an object has the member "__array__", then the C functions that handle
   >    > matrices will attempt to retrieve the matrix from this member when
   >    > passed in a python object.
   >
   >    Are we talking about python members or C structure members?  Is the
   >    __array__ member supposed to be the C pointer to a block of memory?
   >
   > The __array__ member is a member of a python object which is expected
   > to contain a python object of type array (the type created in C that
   > this whole thing is based on).
   >
   >    > In addition, they will attempt to convert
   >    > their result to an object of class "__object__" upon return.
   >
   >    Class __object__?  So __object__ is a pointer to a Python class
   >    object?
   >
   > This is still a python member.  In python what it would do is call
   > m.__object__(new_array).  I assume that a similar thing can be done in C
   > (I haven't implemented this in C yet).

   And new_array is one of the new built-in array objects?

Exactly.

   >    > This
   >    > means that umath.sin(Array([0, pi/2, pi])) == Array([0.,1.,0.]).
   >
   >    OK.  This makes sense
   >
   > Remember that Array is a python object here, that's the trick I'm
   > trying to make work out.

   So all of this is really about being able to derive Python classes from
   built-in types?  That is, you want an Array (which is an instance of a Python
   class) to store its data in an array (which is an object of type PyArrayType),
   and you want functions that you pass an Array to to get at its array.  Have I
   got this right?

Yep. In addition, I want these functions to return an Array instead of
an array.  (Even I'm beginning to doubt the value of this case-based
differentiation of names by now).

   (BTW, you should export the actual type objects.)

Of course, but this still doesn't get me the behavior I want.

   >    > Hopefully, this convention will allow these python objects to coexist
   >    > well with any numeric libraries.
   >
   >    Could you provide some additional details?
   >
   > Here's a bit of code for a unary function expecting a single PyArray
   > argument of type "double" of two dimensions:
   >
   > 	PyObject *op;
   > 	PyArrayObject *ap, *rp;
   >
   > 	TRY(PyArg_ParseTuple(args, "O", &op));
   > 	TRY(ap = PyArray_ContiguousFromObject(op, PyArray_DOUBLE, 2, 2));
   >
   > 	// Do something with ap to get rp
   >
   > 	Py_DECREF(ap);
   >
   > 	return PyArray_Return(rp, op);
   >
   > With the exception of the second argument to PyArray_Return, this is
   > the current way of writing such a chunk of code.
   >
   > PyArray_ContiguousFromObject will convert any python sequence type to
   > an array of the appropriate type and dimensions if possible.  If the
   > argument is already an array of the appropriate type and dimensions,
   > then that array will be increfed and returned (unless its data points
   > to a discontiguous chunk of memory in which case it will be copied
   > into a new array with contiguous memory).
   >
   > The new feature that I want to add to this function is that if its
   > argument is a python object with the attribute "__array__", then this
   > function will return the PyArrayObject contained in that attribute (if
   > this is indeed the case).

   OK.  If my statement above is right, then I understand this.
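
The input side of this convention can be sketched in pure Python.  This
is only an illustration under stated assumptions: BuiltinArray is a
hypothetical stand-in for the C array type, Array is a hypothetical
wrapper class, and as_builtin_array is an invented name for the lookup
that PyArray_ContiguousFromObject is described as doing.  Type and
dimension checks and the contiguity copy are left out:

```python
class BuiltinArray:
    """Stand-in for the C-level array type (hypothetical, pure Python)."""

    def __init__(self, data):
        self.data = list(data)


class Array:
    """Python-level class that keeps its storage in __array__."""

    def __init__(self, data):
        self.__array__ = BuiltinArray(data)


def as_builtin_array(obj):
    """Return a BuiltinArray for obj, unwrapping __array__ when present."""
    if isinstance(obj, BuiltinArray):
        return obj                    # already the right type: reuse it
    inner = getattr(obj, "__array__", None)
    if isinstance(inner, BuiltinArray):
        return inner                  # wrapper object: hand back its storage
    return BuiltinArray(obj)          # any other sequence: convert (copies)


wrapped = Array([0.0, 1.0, 2.0])
print(as_builtin_array(wrapped) is wrapped.__array__)
```

The isinstance check on the attribute mirrors the "if this is indeed the
case" condition: an unrelated __array__ value is not handed back as if it
were an array.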

   > PyArray_Return is used because some operations wind up producing a
   > 0-dimensional array.  These will be converted to the appropriate
   > python scalars on return.
   >
   > The new feature that I want to add here is that if the second argument
   > has a "__class__" attribute, then the constructor for that class will

   You mean __object__? (I like __class__, or maybe even
   __return_constructor__ better.)  

I guess I still haven't finalized the name of this yet.  I like
__return_constructor__, so I'll stick with that for now.

   > be used to return a new python object with the returned PyArrayObject
   > in its "__array__" attribute.

   So PyArray_Return checks to see if op has a callable __object__ member and if
   it does, returns the result of calling this member with rp as an argument.
   Right?

Right, except of course now it checks for a __return_constructor__ member.
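
A rough Python sketch of that return-side check.  BuiltinArray, Array
and array_return are hypothetical names used only for illustration, and
the ordering (scalar conversion first, then the constructor lookup) is
an assumption; the thread does not say which comes first:

```python
class BuiltinArray:
    """Stand-in for the C-level array type (hypothetical, pure Python)."""

    def __init__(self, data, ndim=1):
        self.data = data
        self.ndim = ndim


class Array:
    """Python-level class wrapping a BuiltinArray in __array__."""

    def __init__(self, builtin):
        self.__array__ = builtin


# The class itself serves as the constructor for returned results.
Array.__return_constructor__ = Array


def array_return(result, op):
    """Mimic the described return step for a result derived from op."""
    if result.ndim == 0:
        return result.data            # 0-dimensional: return a plain scalar
    constructor = getattr(op, "__return_constructor__", None)
    if callable(constructor):
        return constructor(result)    # re-wrap in the caller's own class
    return result                     # no constructor: return the raw array


raw = BuiltinArray([0.0, 1.0, 0.0])
out = array_return(raw, Array(BuiltinArray([0.0, 1.5, 3.0])))
print(type(out).__name__)
```

With this in place, a function handed an Array gives back an Array whose
__array__ attribute holds the computed result, which is the behavior
wanted for the sin(Array(...)) example.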

   > This is the simplest method I could come up with to get my
   > "sin(Array())" example to work.

   Whew.  I need to think about this.  I'm not faulting your approach, but it
   feels a bit complicated.

   What if you had a function with multiple arguments and you wanted the
   returned object to have the same type as the arguments?  For example,
   what if you wanted

     spam(some_Array, some_other_Array) to return an Array and
     spam(some_Matrix, some_other_Matrix) to return a Matrix?

   Would you use the first argument's __object__ or the second's?

If they had different __return_constructor__'s, then I'd raise an
exception.  If they were the same, or only one of them had one, then
I'd use the one that was present.
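
That rule can be sketched in Python as follows.  The function name
pick_return_constructor, the empty Array and Matrix classes, and the
choice of TypeError are all assumptions made for illustration; the
thread only says that an exception would be raised:

```python
def pick_return_constructor(*args):
    """Choose one __return_constructor__ from the arguments, or None."""
    found = None
    for arg in args:
        ctor = getattr(arg, "__return_constructor__", None)
        if ctor is None:
            continue                  # plain sequences and arrays: no opinion
        if found is not None and ctor is not found:
            raise TypeError("arguments disagree on __return_constructor__")
        found = ctor
    return found


class Array:
    """Hypothetical wrapper class."""


class Matrix:
    """Hypothetical second wrapper class."""


Array.__return_constructor__ = Array
Matrix.__return_constructor__ = Matrix

print(pick_return_constructor(Array(), [1, 2, 3]) is Array)
```

Mixing an Array with a Matrix raises the exception, while mixing either
one with a plain sequence uses the single constructor that is present.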

   I'll probably have more to say about this after I take some time to mull it
   over.

Please say more about this.  I'm just trying to come up with a way to
make it reasonable to subclass the array/Array object for purposes
like creating a Matrix object with as little pain as possible.

-Jim



=================
MATRIX-SIG  - SIG on Matrix Math for Python

send messages to: matrix-sig@python.org
administrivia to: matrix-sig-request@python.org
=================