[PYTHON MATRIX-SIG] Let's get going

Graham Hughes graham@fishnet.net
Tue, 12 Sep 1995 00:46:50 -0700

At 07:04 PM 9/11/95 -0400, you wrote:
>   o I wanted a python built-in data type containing a data structure
>     that I could pass directly to Fortran or C routines.  I did not
>     want to have to do alot of data conversion or copying on each
>     call.  For example, if a Fortran routine returns a matrix, then
>     it should be possible to pass this matrix to another Fortran
>     routine without reallocating a data structure or converting the
>     entire matrix. 

Instant problem right there. To the best of my knowledge (and I've read this
in several books), Fortran and C use completely different ways of storing
multidimensional arrays. To wit: given the array m, like this:

        3       4       5       6

        7       8       9       0

        3       -2      4       1

There are two ways of storing this array in a contiguous block of memory: by
rows or by columns. By rows looks like this: '3 4 5 6 7 8 9 0 3 -2 4 1', and
is used by both C and Pascal. By columns looks like this: '3 7 3 4 8 -2 5 9
4 6 0 1', and is used by Fortran. In other words, to pass to either Fortran
or C functions you're going to need to know which kind of function you have,
and then convert the internal representation.

This can be done, but it greatly complicates things.
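
To make the difference concrete, here's a small sketch in plain Python
(helper names are mine, not part of any proposal):

```python
# Sketch: flattening a 2-D list-of-lists matrix in row-major (C/Pascal)
# order versus column-major (Fortran) order.

def by_rows(m):
    """C-style: walk each row left to right, top row first."""
    return [x for row in m for x in row]

def by_columns(m):
    """Fortran-style: walk each column top to bottom, left column first."""
    return [m[i][j] for j in range(len(m[0])) for i in range(len(m))]

m = [[3, 4, 5, 6],
     [7, 8, 9, 0],
     [3, -2, 4, 1]]

print(by_rows(m))     # [3, 4, 5, 6, 7, 8, 9, 0, 3, -2, 4, 1]
print(by_columns(m))  # [3, 7, 3, 4, 8, -2, 5, 9, 4, 6, 0, 1]
```

Same data, two different layouts; converting between them is exactly the
copy we'd like to avoid on every call.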

<skip a bit>

>> "+", "-", "*", "/", "^"(as power), abs, unary"-", and unary"+" are defined  
>> as the corresponding element-wise arithmetic operations.  Because these  
>I'd rather see multiplication and division not be elementwise.  But I
>don't feel strongly about this.

Ah, but I'd prefer it be elementwise because it allows me to do wonderful
APLish things with it :)

A good reason for having multiplication/division (same thing on a
mathematical level, although much stickier anywhere else) elementwise is
that addition and subtraction already are, and it would be confusing to do
it otherwise. However, a better reason is the inherent complications of
matrix multiplication and division; specifically, multiplication requires
that the matrices have their rows and columns match in specific ways that
are not immediately intuitive, and division is worse: not only does one
matrix need to be square, but that square matrix must be invertible, a
condition not always fulfilled. Simple elementwise multiplication and
division eliminates these non-intuitive requirements in exchange for the
same requirement you have with addition and subtraction: both matrices must
simply be of equal dimension.

I'm not by any means suggesting that we not incorporate matrixwise
multiplication and division, but perhaps it would be better to have these as
separate functions, both to avoid the above problems and to highlight their
non-intuitive natures (I mean, really, there *still* isn't an easy method to
invert a matrix, and the dimension requirements for matrix multiplication
probably took us all a while to get used to)...
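
For the record, the elementwise rule is trivial to sketch on plain
list-of-lists matrices (the helper name is mine):

```python
# Sketch: elementwise arithmetic on list-of-lists matrices.  The only
# requirement is equal dimensions, just as for + and -.

def elementwise(op, a, b):
    if len(a) != len(b) or any(len(r) != len(s) for r, s in zip(a, b)):
        raise ValueError("matrices must have equal dimensions")
    return [[op(x, y) for x, y in zip(r, s)] for r, s in zip(a, b)]

a = [[1, 2], [3, 4]]
b = [[10, 20], [30, 40]]
print(elementwise(lambda x, y: x * y, a, b))  # [[10, 40], [90, 160]]
```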


>> Possible suggestions for the other numeric operations:
>> "%" represents matrix multiplication (or vector dot product).
>If we go with elementwise interpretation for * and /, then I think
>that all operators that make sense for floating point numbers should
>have an elementwise interpretation, so m%spam should compute a matrix
>of mods.  Perhaps one would carry this same argument to bitwise
>operators, since you could have matrices of integers.

I agree. From experience with several kinds of matrices, I'm inclined to
think that something similar to APL would be a really good idea...

Given that, we *could* implement matrix multiplication, dot products, etc.
as something similar to APL's outer and inner product functions... This
would make it more general than simply matrix multiplication and therefore
more useful, and the matrix multiplication could simply be a function that
hardcodes a certain inner product operation...

For those that haven't been exposed to APL, the inner and outer product
functions are ways of extending element-by-element functions to
generalized tensors (although they're usually used on vectors and matrices).

An outer product returns a table of results of an operation foo on all pairs
of one element from matrix bar and one from matrix baz. That is, it computes

        result[i][j] = foo(bar[i], baz[j])

for each 0 <= i < len(bar) and 0 <= j < len(baz). Naturally, it works best
in this form on vectors, but there is nothing preventing it from being
implemented on matrices; mail me for additional information on how to do
this, as I need time to grovel over the manuals again so I can remember how
to do it :)
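
A minimal Python sketch of the outer product on vectors (the function name
is mine, following the APL idea rather than any proposed API):

```python
# Sketch: APL-style outer product -- a table of foo(x, y) for every pair
# of one element from bar and one element from baz.

def outer(foo, bar, baz):
    return [[foo(x, y) for y in baz] for x in bar]

bar = [1, 2, 3]
baz = [10, 20]
print(outer(lambda x, y: x * y, bar, baz))
# [[10, 20], [20, 40], [30, 60]]
```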

An inner product is similar; it takes two functions, foo and bar, and two
matrices, baz and quux. It performs

reduce(foo, (SUBSET1))

where SUBSET1 is

        bar(baz[0][i], quux[0][i])

for each 0 <= i < len(baz). The entirety can be formulated more like

reduce(foo, map(bar, baz[0][0:], quux[0][0:]))

but it is a little less clear IMHO. Plus, I'm not really sure I got those
slices right; they're supposed to be the first row...
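
For the curious, here's a rough Python sketch of a generalized inner product
on matrices (all names are mine); ordinary matrix multiplication falls out by
passing + and * as foo and bar:

```python
# Sketch: APL-style inner product on two matrices.  Combine each
# (row of baz, column of quux) pair elementwise with bar, then reduce
# the combined sequence with foo.
from functools import reduce

def inner(foo, bar, baz, quux):
    cols = list(zip(*quux))  # columns of quux
    return [[reduce(foo, map(bar, row, col)) for col in cols]
            for row in baz]

baz = [[1, 2], [3, 4]]
quux = [[5, 6], [7, 8]]
# inner(+, *, ...) is ordinary matrix multiplication:
print(inner(lambda x, y: x + y, lambda x, y: x * y, baz, quux))
# [[19, 22], [43, 50]]
```

This is the sense in which matrix multiplication can be a function that
hardcodes a certain inner product operation.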

One intriguing possibility with using the APL stuff is that it allows mixing
and matching between matrix dimensions; there is nothing that says that you
cannot perform an outer product on a matrix and a vector. This is even
occasionally useful.

(Some may ask, why the APL stuff? This is because I am convinced that APL is
doing the Right Thing when it comes to matrices. J supports an even richer
set of 'adverbs', but I have no intention of infecting Python with that kind
of complexity ;)

>If we go with elementwise interpretation of numeric operations, then
>I'm inclined to use functional, rather than matrix notation for matrix
>multiplication, dot products, and so on.

This matches with what I said *waaay* back before the inner and outer
product stuff.

>Perhaps, as a compromize, there could be a special method or operator
>that creates a reference copied version of a matrix that provides a
>different semantics.  For example, perhaps the default matrix type
>could provide standard matrix interpretation for * and /, but there
>could be a method, elementwise that would allow:
>   m.elementwise() + a
>to do elementwise arithmentic.

I suppose that makes sense if you use the standard matrix interpretation for
* and /; but if we don't, and use the elementwise interpretation instead for
the reasons already mentioned, then we can avoid this problem completely...

>IMPORTANT NOTE: Mixed-mode arithmetic between matrices and scalars
>		should be defined.

Here, we can stand on the math; it is just like elementwise arithmetic, i.e.
m + s where m is a matrix and s is a scalar adds s to every element of m.
This seems acceptable, and even intuitive, to me.
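
A sketch of that mixed-mode rule on list-of-lists matrices (the helper name
is mine):

```python
# Sketch: mixed-mode arithmetic -- m + s adds the scalar s to every
# element of the matrix m.

def scalar_add(m, s):
    return [[x + s for x in row] for row in m]

m = [[1, 2, 3], [4, 5, 6]]
print(scalar_add(m, 10))  # [[11, 12, 13], [14, 15, 16]]
```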

>> Concatenation and multiplication sequence operations are not defined, but  
>> instead the arithmetic operations are invoked for the "+" and "*"  
>> operators. 
>Right.  Note if we don't need elementwise bitwise operators, then we
>could concatinate with "|", which I find most natural in a matrix
>  spam = foo | bar

I can see your point... Dear me, looks like what we need are some APL-like
functions... (kicks self)

I happen to like having the "|" operator do bitwise stuff, simply because
somebody's going to ask "well, what does & do?". They complement each other.

>I suppose one could even repeat with >> or <<, but this is not so natural.

Yeah, plus << can give you a really *quick* multiplication by 2.

<skip a bit till I find something I disagree with...>

>> Recent newsgroup discussions have offered some convincing arguments for  
>> treating a matrix as a mapping type where sequences of ints are used as  
>> the keys.  The sequence must be of length less than or equal to the numer  
>> of dimensions in the matrix.  Each element of the sequence is either an  
>> integer or a sequence.  If it is an integer, it returns the corresponding  
>> element for that dimension, if it is a sequence then it returns all of the  
>> elements given in the sequence.
>> ie. A[0] -> 1, A[((1,2))] -> [2,3]
>> M[0] -> [1,2,3,4,5]
>> M[(0,3)] -> 4
>> M[((0,2),0)] -> [1,21]
>> M[(range(1,3),range(2,4))] -> [[13,14],[23,24]]
>A way of visualizing this is as multi-dimensional slices that let you,
>for example, take a rectangular slice out of a 2-d matrix.

Definitely useful; the most intuitive (although not necessarily the fastest)
way of calculating a determinant is to use multi-dimensional slices; I was
stymied by the 'normal' Python list-of-lists implementation on exactly this
point when trying to implement a matrix inverter.
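
A sketch of the rectangular-slice idea on a plain list-of-lists matrix,
reproducing the M[(range(1,3),range(2,4))] example from the quote (the
function name is mine):

```python
# Sketch: a rectangular slice out of a 2-D list-of-lists matrix, given
# sequences of row indices and column indices.

def rect_slice(m, rows, cols):
    return [[m[i][j] for j in cols] for i in rows]

M = [[1, 2, 3, 4, 5],
     [11, 12, 13, 14, 15],
     [21, 22, 23, 24, 25]]

print(rect_slice(M, range(1, 3), range(2, 4)))  # [[13, 14], [23, 24]]
print(rect_slice(M, (0, 2), (0,)))              # [[1], [21]]
```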


>> **Instantiation**
>> The following function will be used to instantiate a matrix:
>> Matrix(datasource(optional), d1, d2, ..., typecode='d')
>> The typecodes are as follows:
>> c - char (non-numeric)
>> 1 - unsigned char
>> b - signed char
>> h - short
>> i - int
>> l - long
>> f - float
>> d - double
>> F - complex float
>> D - complex double

I'm not fond of this, as it breaks normal Python typing; let the matrix
figure out what's there. Matter of fact, it might be a good idea to make the
matrix a general container like lists are now, so you can store, say, a
tuple representing a complex number in there. As it stands, with these
restrictions you can forget that. However, this does make things slightly
harder for those wonderful Fortran library routines.


>> **Coercion**
>> Dimension coercion:The following is a proposal to make the matrix object  
>> behave as closely as possible to matrices in matlab (to make it easy to  
>> convert matlab code to python), and to generalize this behavior to  
>> arbitrarily high dimensional objects.
>> If a matrix of fewer dimensions than that required is provided, then the  
>> matrix can be converted to more dimensions by making copies of itself.   
>> For purposes of dimension coercion, an int or a float is considered a 0-d  
>> matrix.
>> ie. A+1 -> [2,3,4], A+C -> [[2,6,12],[17,27,39]]
>> 	B^2 -> [121,144,169] vs. B^A -> [11, 144, 2197]
>I agree wrt scalars.

Interesting... I like it. So what happens if I convert a matrix of [1,2,3]
into a 4x2 ([[x,x,x,x],[x,x,x,x]])? Do we get it wrapping around, repeating
the elements until the new shape is filled?

>> Simliar coercion should be performed for most functions acting on  
>> matrices.  Helper functions will be provided to make this easy, however  
>> there is no way to enforce this.  The following is a proposed standard to  
>> allow functions to be applied to higher-dimensional objects in a uniform  
>> way.
>> If a function is defined to operate on scalars (say sqrt()), then when  
>> applied to a matrix it should apply itself to every element in the matrix.
>> ie. sqrt(C) -> [[1,2,3],[4,5,6]]
>> This is generalized to higher dimensions so that if a function is provided  
>> with a matrix of more dimensions that it requires, that function will  
>> apply itself to each appropriately sized unit in the matrix.
>> ie. sum(A) -> 6 whereas sum(C) -> [14,77]

Definitely. Back to the elementwise operations.
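
A sketch of both behaviours on a 2-d list-of-lists matrix, reproducing the
sqrt(C) and sum(C) examples from the quote (the helper name is mine):

```python
# Sketch: a scalar function maps over every element; a function that
# expects a vector (like sum) maps over each appropriately sized unit,
# here each row of a 2-d matrix.
import math

def apply_elementwise(f, m):
    return [[f(x) for x in row] for row in m]

C = [[1, 4, 9], [16, 25, 36]]
print(apply_elementwise(math.sqrt, C))  # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
print([sum(row) for row in C])          # [14, 77]
```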

>I am a bit concerned that this will lead to hard to find bugs.  This
>also put's extra burden on the function implementor.  I realize (since
>you explained it to me in a separate note :) your concern that this is
>needed for efficiency, otherwise the map operator, or perhaps a
>specialized matrix map operator could be used.

I see your point. It might be better to use that matrix map operator... Best
of both worlds, I guess. Oh, for APL's parallelism ;) That might not be a
bad idea, but I think it would be a nasty bit of work for the interpreter
and underlying source code.

>> Type coercion: No type coercion will be performed.  However, it is  
>> possible to have a function defined to work on multiple types (operator  
>> overloading).  So sqrt might be defined on matrices of both floats and  
>> doubles in which case it will work properly when given either matrix as  
>> input, but will return a type error if applied to a matrix of ints.

Well, if we simply let the caller figure out what it's got in it, like what
I suggested above, this would be unnecessary. Let sqrt() (or indirectly,
Python) figure out whether it's a double, int, or a complex.


>> byteswap - Useful for reading data from a file written on a machine with a  
>> different byte order
>Hm. Does this operate in-place or return a value?
>You know, there really *should* be a matrix map operator.  This should
>function like the standard map operator, but it will generate a matrix
>and it will probably want to be told at what level/dimension to apply
>the mapping function.  In fact, this would give you arbitrary
>elementwise operations given scalar functions.

Or, we can always use inner/outer products... :)

>> typecast(t) - Returns a new matrix of the same dimensions where each item  
>> is cast (using C typecasting rules) from the matrix's type to the new type  
>> t.
>Our constructor already gives you this:
>  new_matrix=Matrix(old_matrix,new_type)

Unnecessary if the matrix is a container.


>> concat(l) - l must be a list of matrices, and each matrix in l must have  
>> the same number of dimensions, as well as the same size for every  
>> dimension except for the first.  This will return a new matrix M whose  
>> first dimension is the sum of the first dimensions of all the matrices in  
>> the list, and whose data is filled from the data in all of these matrices.
>> ie. concat([A,B]) -> [1,2,3,11,12,13]

Hm. Seems to run into difficulty with >1-D matrices. Ideally, concat should take

        2       3       4               5       6       7
        5       6       7               8       9       10

and return

        2       3       4       5       6       7

        5       6       7       8       9       10

But the given behaviour of concat depends greatly on whether [[x],[y]] is
stored by rows or by columns, ie. is it like this:

        [       |       ,       |       ]
                V               V
               [x]             [y]

or like this:

        ->      [x]
        ->      [y]

This is a conceptual difficulty.
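
To make the two readings concrete, here's a sketch of the spec's
first-dimension concat versus the row-joining behaviour I'd prefer (function
names are mine):

```python
# Sketch: two possible meanings of concat on 2-D matrices.

def concat_first(mats):
    """The quoted spec: stack along the first dimension."""
    out = []
    for m in mats:
        out.extend(m)
    return out

def concat_rows(a, b):
    """The behaviour sketched above: join corresponding rows."""
    return [ra + rb for ra, rb in zip(a, b)]

a = [[2, 3, 4], [5, 6, 7]]
b = [[5, 6, 7], [8, 9, 10]]
print(concat_first([a, b]))  # [[2, 3, 4], [5, 6, 7], [5, 6, 7], [8, 9, 10]]
print(concat_rows(a, b))     # [[2, 3, 4, 5, 6, 7], [5, 6, 7, 8, 9, 10]]
```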

>> This is different from Matrix([A,B]) -> [[1,2,3],[11,12,13]]
>> **Basic functions taking matrix arguments (numeric)**
>> It is hoped that there will be a large variety of special purpose modules  
>> to do fancy numeric operations on matrices.  There should also be a basic  
>> set of operations provided in the Matrix module (the module with the  
>> instantiation function).  Because type coercion is not going to be  
>> performed, it is important to decide both on the list of functions and on  
>> the basic types they should support.  The following is a very preliminary  
>> list of the functions I consider most fundamental.
>> rng(start, stop, step(optional), typecode='d') where all points in the  
>> range must be strictly less than the stop point, according to the  
>> semantics of range().  This function should possibly be added to the  
>> instantiation options.
>> ie. rng(0,0.4) = [0,0.1,0.2,0.3]; rng(0,0.1,0.41) = [0,0.1,0.2,0.3,0.4]
>Hm.  Why not:
>  Matrix.range(start=1,stop,step=1, typecode='d')
>  Matrix.range([(start=1,stop,step=1), ...], typecode='d')
>That is:
>  call it range, and give it same semantics, and
>  include a multi-dimensional range operator, mrange, that takes a
>  sequence of range specs.

Sounds good to me.
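
A rough sketch of how such a float-step range might behave (the name rng and
the strictly-less-than rule come from the quoted proposal; the implementation
is my guess):

```python
# Sketch: a range-like constructor allowing float steps, with every
# point strictly less than the stop value.  Computing start + n*step
# rather than repeatedly adding step keeps rounding error from drifting.

def rng(start, stop, step=1.0):
    out = []
    n = 0
    while True:
        x = start + n * step
        if x >= stop:
            break
        out.append(x)
        n += 1
    return out

print(rng(0, 0.4, 0.1))  # [0.0, 0.1, 0.2, 0.3] (modulo float rounding)
print(rng(0, 5))         # [0.0, 1.0, 2.0, 3.0, 4.0]
```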


>Comments anyone?

Ditto. Anyone see something they'd like to pull apart (besides APL...)?

Graham Hughes <graham@fishnet.net>  Home page http://www.fishnet.net/~graham/
``I think it would be a good idea.'' -- Mahatma Ghandi, when asked what he
                                        thought of Western civilization
finger for PGP public key.

MATRIX-SIG  - SIG on Matrix Math for Python

send messages to: matrix-sig@python.org
administrivia to: matrix-sig-request@python.org