Re: [Numpy-discussion] Matrix Class [was numpy release]

Chris.Barker wrote:
Alan G Isaac wrote:
the cost of complexity should be justified by a gain in functionality.
I don't think functionality is the right word here. The Matrix class(es) are all about clean, convenient API, i.e. style, not functionality -- we have all the functionality already; indeed, we have it with plain old arrays, so I think that's really beside the point.
Not entirely; there's no good way to deal with arrays of matrices at present. This could be fixed by tweaking dot, but it could also be part of a reform of the matrix class. [CHOP]
Timothy Hochberg wrote:
1. The matrices and arrays should become more alike if possible
I'm not sure I agree -- the more alike they are, the less point there is to having them.
That's the best possible outcome. If some solution can be reached that naturally supports enough matrix operations on array, without significantly complexifying array, to satisfy the matrix users, then that would be great. Less stuff to maintain, less stuff to learn, etc, etc.

With that in mind, what is the minimum amount of stuff that matrix should support:

1. Indexing along a single axis should produce a row or column vector as appropriate.
2. '*' should mean matrix multiplication.
3. '**' should mean matrix exponentiation. I suspect that this is less crucial than 2, but no more difficult.

There's some other stuff that's less critical IMO (.H and .I for example), but feel free to yell at me if you think I'm mistaken.

There are some other properties that a fix should have as well, in my opinion and in some cases in others' opinions as well.

1. A single index into a matrix should yield a 1D object, just as indexing an array does. This does not have to be inconsistent with #1 above; Chris Barker proposed one solution. I'm not sold on the details of that solution, but I approve of the direction that it points to. [In general A[i][j] should equal A[i,j]. I know that fancy indexing breaks this; that was a mistake IMO, but that's a discussion for another day.]
2. It should be possible to embed both vectors and matrices naturally into arrays. This would allow natural and efficient implementation of, for example, rotating 1000 3D vectors. One could spell that R * V, where R is a 2x2 matrix and V is a 1000x3 array, where the last axis is understood to be a vector.
3. I'm pretty certain there's a third thing I wanted to mention, but it escapes me. It'll resurface at the most inopportune time....

Let's play with Chris Barker's RowVector and ColVector proposal for a moment. Let's suppose that we have four classes: Scalar, RowVector, ColVector and Matrix. They're all pretty much the same as today's array class except that:

1. They are displayed a little differently so that you can tell them apart.
2. __mul__ and __pow__ are treated differently.

Let's consider __mul__: when a RowVector is multiplied with a ColVector, the dot product is computed. If, for example, the arrays are both 2D, then they are treated as arrays of vectors and the dot product of each pair of vectors is computed in turn. I think broadcasting should work correctly, but I haven't thought that all the way through yet. Ignoring broadcasting for a moment, the rules are:

1. (n)D Scalar * (n)D Scalar => (n)D Scalar (elementwise)
2. (n)D RowVector * (n)D ColVector => (n-1)D Scalar (dot product)
3. (n+1)D Matrix * (n)D ColVector => (n)D ColVector (matrix-vector product)
4. (n)D Matrix * (n)D Matrix => (n)D Matrix (matrix product)

Other combinations would be an error. In principle you could do dyadic products, but I suspect we'd be better off using a separate function for that, since most of the time that would just indicate a mistake. Note that in this example Scalar is playing the role of the present-day array, which I've assumed has magically vanished from the scene somehow.

This looks like it gets most of the way towards where we want to be. There are some issues, though. For example, all of a sudden you want two different transpose operators: one that transposes matrices, flips ColVectors to RowVectors and leaves Scalars alone, and another that manipulates the rest of the array structure. There is probably other stuff like that too; it doesn't look insurmountable to me right now, however.

Still, I'm not sold on the whole RowVector/ColVector/Matrix approach. I have a gut feeling that we'd be better off with a single array class that was somehow tagged to indicate its type. Also, I keep hoping to come up with an elegant generalization of this that turns arrays into quasi-tensor objects where all the matrix behavior falls out naturally. Sadly, attempts in this direction keep collapsing under their own weight, but I'm still thinking about it. However, I do think that the RowVector/ColVector/Matrix approach is a step in the right direction and is certainly worth discussing to see where it leads.

-tim
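To make the proposed '*' rules concrete, here is a toy sketch of the dispatch table. The class names and wrapper design are illustrative assumptions following the proposal above (plain 1-D vector / 2-D matrix cases only, no broadcasting), not an existing numpy API:

```python
import numpy as np

# Toy sketch of the RowVector/ColVector/Matrix '*' rules discussed
# above.  All names are hypothetical; thin wrappers around ndarrays,
# handling only the simple 1-D vector / 2-D matrix cases.

class ColVector:
    def __init__(self, data):
        self.a = np.asarray(data)

class RowVector:
    def __init__(self, data):
        self.a = np.asarray(data)
    def __mul__(self, other):
        if isinstance(other, ColVector):
            # rule 2: RowVector * ColVector => dot product (a scalar)
            return np.dot(self.a, other.a)
        return NotImplemented

class Matrix:
    def __init__(self, data):
        self.a = np.asarray(data)
    def __mul__(self, other):
        if isinstance(other, ColVector):
            # rule 3: Matrix * ColVector => ColVector
            return ColVector(np.dot(self.a, other.a))
        if isinstance(other, Matrix):
            # rule 4: Matrix * Matrix => Matrix
            return Matrix(np.dot(self.a, other.a))
        return NotImplemented
```

A RowVector times a ColVector contracts to a scalar, while an unhandled combination such as ColVector * RowVector falls through to a TypeError, matching the "other combinations would be an error" rule.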
should share more of the same code base.
Don't they share almost all the same code already? My understanding is that they are arrays with a couple operators overloaded -- how much more code could they share?
Nevertheless, I'm interested to see what Tim comes up with.
2. This doesn't support the concept of arrays of matrices/vectors, which is one thing I've always wanted out of the matrix class. For example, it would be nice to be able to spell: 'A' is a 1D array of matrices (3D overall), 'B' is a 1D array of vectors (3D overall),
You could do this now with a 1-d object array, with a bunch of matrices in it.
matrix multiply them together, yielding a 1D array of matrices (3D) overall.
But this, of course, would fail. However, if we expand this scenario to A and B being 1-d arrays of other arrays that may not all be the same size (and thus you couldn't just use a 3-d array), and be able to run various ufuncs on them, that would be cool!
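For what it's worth, the elementwise '*' part of the object-array spelling does go through today, because each element's own __mul__ does the matrix product; what it can't give you is speed or the 3-D structure. A minimal check (the data here is made up):

```python
import numpy as np

# 1-D object arrays holding np.matrix elements; '*' applies each
# element's own matrix multiply pairwise.  Slow, but it runs.
A = np.empty(3, dtype=object)
B = np.empty(3, dtype=object)
for i in range(3):
    A[i] = np.matrix(np.eye(2) * (i + 1))   # 2x2 matrices
    B[i] = np.matrix([[1.0], [2.0]])        # 2x1 column matrices

C = A * B   # 1-D object array of 2x1 matrices
```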
Bill Spotz wrote:
The fact that there is another way to get a row vector: M[i] is a bonus.
Except that the behavior of M[i] is one of the driving issues of the conversation.
well, yes, but a bit of a red herring, I think -- it's syntactic sugar. The issue, I think, is that even if you do:
r = M[:,i]
you can't then index that with a single index -- it's a matrix, NOT a 1-d array. Again, I proposed disallowing M[i] altogether, as it is ambiguous, but that really is gratuitous consistency.
A vector is a 1-d array with some functionality overloaded to make it convenient to do linear algebra.
Again, I would argue for Vectors inheriting from Matrix. I would make the case based on code reuse and elegance, although these might be overridden by some other concern.
I think that's a bit of an implementation detail -- we need to define behavior that we want, and decide how best to implement that. i.e.:
RowVector[i] -> scalar
RowVector.A -> 1-d array
(same for column)
How do we want it pretty-printed?
We need to focus on column vectors here -- It's easy to say: a row vector is a 1-d array -- that's easy and clean. It's column vectors that are tricky -- indeed they are a pain with regular old arrays.
What about
np.Matrix(<container of RowVector>)
np.Matrix(<container of ColumnVector>)
I'd say you get a matrix from each of those, but they would be transposes of each other -- another good reason for the Vector classes. Though maybe those should be spelled:
np.row_stack and np.column_stack
I have generally thought about this in the context of, say, a Krylov-space iterative method, and what type of interface would lead to the most readable code.
Can you whip up a small example, starting with the current implementation?
Bruce Southey wrote:
Hi, I would like to use the matrix functionality but I, like others, get by without it.
What does it lack that keeps you from using it? That's the key question.
+1 to Tim's and Nadav's comments. As Tim said, there should be seamless integration between the concepts of vectors and matrices -- after all, there is really no distinction between them in linear algebra.
This is all about being able to index a "vector" with a single index -- that's it, I think. Matlab handles this by special-casing matrices that have a single dimension of one. It also has an easy way to spell a literal for a column vector, though I suppose:

np.matrix((1,2,3,4,5)).T

isn't so bad.
-2 for having rowvector and columnvector - both numpy and I should know the orientation, if not, I deserve garbage or an error.
But HOW can you know the orientation? A 1-d array has no orientation -- which is why the current version always returns a matrix when indexing.
0 for vector - I don't see the need for it as a unique entity but understand the simplicity of saying vector.
I don't think we can have vector without ColumnVector, though I suppose a "Vector" can be either a (n,1) or a (1,n) matrix that allows single indexing.
By the way, maybe a column vector could be handy for doing array broadcasting, as a cleaner way to do:
x = np.arange(10)
y = np.arange(20).reshape((-1,1))
z = FunctionOfXandY(x,y)
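Spelled out, the reshape trick broadcasts x across columns and y down rows, so any elementwise function of the two sees the full grid (the function body here is a made-up stand-in):

```python
import numpy as np

# y reshaped to a column broadcasts against the row-like x,
# producing a (20, 10) grid without explicit loops.
x = np.arange(10)                    # shape (10,)
y = np.arange(20).reshape((-1, 1))   # shape (20, 1)
z = x + 10 * y                       # stand-in for FunctionOfXandY(x, y)
```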
Final note:
If no one has any linear algebra examples, then we're wasting our time with even this cheap talk.
-Chris
-- Christopher Barker, Ph.D. Oceanographer
Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception
Chris.Barker@noaa.gov

_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
--
tim.hochberg@ieee.org

On Apr 24, 2008, at 5:45 PM, Timothy Hochberg wrote:
Bill Spotz wrote:
I have generally thought about this in the context of, say, a Krylov-space iterative method, and what type of interface would lead to the most readable code.
Can you whip up a small example, starting with the current implementation?
This sounds like work. ;-)

I'll shoot for putting something together this weekend. I think Tim is right; we've been talking around in circles, and we need something concrete.

I also think that a conjugate gradient algorithm, say, is high-level and doesn't require iterating over rows (or columns). We should also look at (at least) one other algorithm, one that iterates over rows. I would suggest Gaussian elimination, unless this is too obviously a high-level functionality that should be part of the extension module.

** Bill Spotz                                            **
** Sandia National Laboratories    Voice: (505)845-0170  **
** P.O. Box 5800                   Fax:   (505)284-0154  **
** Albuquerque, NM 87185-0370      Email: wfspotz@sandia.gov **

On Apr 24, 2008, at 8:52 PM, Bill Spotz wrote:
On Apr 24, 2008, at 5:45 PM, Timothy Hochberg wrote:
Bill Spotz wrote:
I have generally thought about this in the context of, say, a Krylov-space iterative method, and what type of interface would lead to the most readable code.
Can you whip up a small example, starting with the current implementation?
This sounds like work. ;-)
I'll shoot for putting something together this weekend. I think Tim is right; we've been talking around in circles, and we need something concrete.
I have posted my example (using the current interface) at http://www.scipy.org/ConjugateGradientExample . I have also added a link to it from Alan's http://www.scipy.org/MatrixIndexing so that they will be somewhat connected.

It is just one example, and provides one small area where the vector classes would be useful. I hope others will post similar examples, so that the design discussion can address concrete, agreed-upon situations.

Note that I started using "row_vector" and "col_vector" as the class names in my discussion. No reason to re-introduce camel case to numpy, and I think "col" is a universally recognized abbreviation for "column", and I like the symmetry of the three-letter "row_" and "col_" prefixes.
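For readers who can't reach the wiki page, here is a minimal conjugate gradient loop in the current np.matrix style. It is a sketch in the same spirit as the posted example, not a copy of it, and the test system A, b is made up:

```python
import numpy as np

# Conjugate gradient for a symmetric positive definite A, written
# with np.matrix so that '*' is matrix multiplication throughout.
def cg(A, b, tol=1e-10, maxiter=100):
    x = np.matrix(np.zeros_like(b))
    r = b - A * x                      # residual
    p = r.copy()                       # search direction
    rs_old = float(r.T * r)
    for _ in range(maxiter):
        Ap = A * p
        alpha = rs_old / float(p.T * Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = float(r.T * r)
        if rs_new < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

A = np.matrix([[4.0, 1.0], [1.0, 3.0]])
b = np.matrix([[1.0], [2.0]])
x = cg(A, b)
```

Every product is spelled with '*', which is the readability argument for matrices; the same loop with plain arrays needs np.dot in four places.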

On Mon, 28 Apr 2008, Bill Spotz apparently wrote:
http://www.scipy.org/ConjugateGradientExample ... provides one small area where the vector classes would be useful.
Maybe not. I posted an alternate version of your algorithm, just below yours, sticking very close to your example. Cheers, Alan

On Apr 28, 2008, at 10:15 PM, Alan G Isaac wrote:
On Mon, 28 Apr 2008, Bill Spotz apparently wrote:
http://www.scipy.org/ConjugateGradientExample ... provides one small area where the vector classes would be useful.
Maybe not. I posted an alternate version of your algorithm, just below yours, sticking very close to your example.
Alan,

Nice. To quote: "The second version of the algorithm provided above suggests that little if anything was gained by the use of matrices."

If matrix multiplication in my example is replaced with np.dot() in yours, then when IS anything gained by using matrices?

As for this example, my version should work with a properly implemented sparse_matrix A, but the array approach precludes that. That is to say, I could convert A to a matrix if it is provided as an array, but you could not convert a sparse_matrix to an array.

On Apr 28, 2008, at 10:47 PM, Bill Spotz wrote:
As for this example, my version should work with a properly implemented sparse_matrix A, but the array approach precludes that. That is to say, I could convert A to a matrix if it is provided as an array, but you could not convert a sparse_matrix to an array.
I may be jumping to conclusions here. We could conceivably implement a sparse_array class upon which the sparse_matrix class is based.

Bill Spotz wrote:
On Apr 28, 2008, at 10:47 PM, Bill Spotz wrote:
As for this example, my version should work with a properly implemented sparse_matrix A, but the array approach precludes that. That is to say, I could convert A to a matrix if it is provided as an array, but you could not convert a sparse_matrix to an array.
I may be jumping to conclusions here. We could conceivably implement a sparse_array class upon which the sparse_matrix class is based.
FYI: In scipy, the iterative solvers (cg and the like) should work with a LinearOperator (a class with matvec, rmatvec, and eventually matmat methods) passed instead of the A matrix argument. There is a function aslinearoperator() in scipy.sparse.linalg that constructs a LinearOperator instance from a numpy array, matrix, or scipy sparse matrix. It could be generalized to accept a function for a matrix-free computation, too. Also, LinearOperator.__mul__ and __rmul__ could be easily defined via matvec and rmatvec.

IMHO all the iterative solvers could use such a concept, to shield them from what the system matrix A or the preconditioner actually are, and to be able to write them in a visually appealing way (i.e. close to what one writes on paper).

regards, r.
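To illustrate, using the scipy.sparse.linalg names Robert mentions (behavior as of recent scipy releases, so details may differ from the version current at the time of this thread; the 2x2 system is made up):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, aslinearoperator, cg

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])

# Wrap a dense array as a LinearOperator...
op_dense = aslinearoperator(A)

# ...or go matrix-free: supply only a matvec, never storing A.
def matvec(v):
    return np.array([4.0 * v[0] + v[1], v[0] + 3.0 * v[1]])

op_free = LinearOperator((2, 2), matvec=matvec, dtype=float)

# cg accepts either form in place of the matrix argument.
x1, info1 = cg(op_dense, b)
x2, info2 = cg(op_free, b)
```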

On Apr 29, 2008, at 2:50 AM, Robert Cimrman wrote:
FYI: In scipy, the iterative solvers (cg and the like) should work with a LinearOperator (a class with matvec, rmatvec, and eventually matmat methods) passed instead of the A matrix argument. There is a function aslinearoperator() in scipy.sparse.linalg that constructs a LinearOperator instance from a numpy array, matrix, or scipy sparse matrix. It could be generalized to accept a function for a matrix-free computation, too. Also, LinearOperator.__mul__ and __rmul__ could be easily defined via matvec and rmatvec.
IMHO all the iterative solvers could use such a concept, to shield them from what the system matrix A or the preconditioner actually are, and to be able to write them in a visually appealing way (i.e. close to what one writes on paper).
I agree. This is very close to what we use in Sandia's C++ Trilinos project: http://trilinos.sandia.gov. The linear algebra services package (Epetra: http://trilinos.sandia.gov/packages/epetra) supports the following class hierarchies:

Epetra_Operator <- Epetra_RowMatrix <- Epetra_CrsMatrix
Epetra_MultiVector <- Epetra_Vector

(Where "Crs" stands for "compressed row storage".) A typical solver package (e.g. AztecOO: http://trilinos.sandia.gov/packages/aztecoo) defines solver routines that take an Epetra_Operator for A (and preconditioner P) and an Epetra_MultiVector for b and x. The most common way to call them is with an Epetra_CrsMatrix and an Epetra_Vector, but the flexibility is there.

On Mon, 28 Apr 2008, Bill Spotz apparently wrote:
If matrix multiplication in my example is replaced with np.dot() in yours, then when IS anything gained by using matrices?
When matrix algebra is clearer than array algebra. But that is not the case for this algorithm. (Just the opposite, I would say.) Simple example:

    (X.T * X).I * (X.T * Y)

(Not that this is computationally nice ...)
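For comparison, the two spellings of that least-squares expression side by side, with a solve in the array version since forming the inverse explicitly is the "not computationally nice" part (the data is made up):

```python
import numpy as np

# OLS coefficients: matrix spelling vs. array spelling.
X = np.matrix([[1.0, 2.0], [3.0, 4.0], [5.0, 7.0],
               [2.0, 1.0], [4.0, 3.0]])
Y = np.matrix([[1.0], [2.0], [3.0], [4.0], [5.0]])

beta_m = (X.T * X).I * (X.T * Y)     # reads like the math

Xa, Ya = X.A, Y.A                    # the plain-array version
beta_a = np.linalg.solve(np.dot(Xa.T, Xa), np.dot(Xa.T, Ya))
```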
As for this example, my version should work with a properly implemented sparse_matrix A, but the array approach precludes that. That is to say, I could convert A to a matrix if it is provided as an array, but you could not convert a sparse_matrix to an array.
A.todense().A? (I don't think you can save on memory use without substantial changes...) In an earlier discussion, I suggested that if iteration over matrices were to yield 1d arrays, then iteration over sparse matrices should yield 1d "sparse arrays". Cheers, Alan

On Thu, Apr 24, 2008 at 5:45 PM, Timothy Hochberg <tim.hochberg@ieee.org> wrote:
Chris.Barker wrote:
Alan G Isaac wrote:
the cost of complexity should be justified by a gain in functionality.
I don't think functionality is the right word here. The Matrix class(es) are all about clean, convenient API, i.e. style, not functionality -- we have all the functionality already; indeed, we have it with plain old arrays, so I think that's really beside the point.
Not entirely; there's no good way to deal with arrays of matrices at present. This could be fixed by tweaking dot, but it could also be part of a reform of the matrix class.
[CHOP]
Timothy Hochberg wrote:
1. The matrices and arrays should become more alike if possible
I'm not sure I agree -- the more alike they are, the less point there is to having them.
That's the best possible outcome. If some solution can be reached that naturally supports enough matrix operations on array, without significantly complexifying array, to satisfy the matrix users then that would be great. Less stuff to maintain, less stuff to learn, etc, etc.
With that in mind, what is the minimum amount of stuff that matrix should support:
1. Indexing along a single axis should produce a row or column vector as appropriate.
2. '*' should mean matrix multiplication.
3. '**' should mean matrix exponentiation. I suspect that this is less crucial than 2, but no more difficult.
There's some other stuff that's less critical IMO (.H and .I for example) but feel free to yell at me if you think I'm mistaken.
There are some other properties that a fix should have as well, in my opinion and in some cases in others' opinions as well.
1. A single index into a matrix should yield a 1D object, just as indexing an array does. This does not have to be inconsistent with #1 above; Chris Barker proposed one solution. I'm not sold on the details of that solution, but I approve of the direction that it points to. [In general A[i][j] should equal A[i,j]. I know that fancy indexing breaks this; that was a mistake IMO, but that's a discussion for another day.]
2. It should be possible to embed both vectors and matrices naturally into arrays. This would allow natural and efficient implementation of, for example, rotating 1000 3D vectors. One could spell that R * V, where R is a 2x2 matrix and V is a 1000x3 array, where the last axis is understood to be a vector.
R is 3x3? This works already; what's tough is an array of rotation matrices applied "element-wise" to an array of vectors.
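A quick check of the "works already" case: one fixed rotation applied to many vectors is just a dot with the vectors stacked as rows (R and V here are made-up data):

```python
import numpy as np

# One 3x3 rotation R applied to 1000 vectors at once: each row of V
# is a vector, so dot(V, R.T) rotates them all.
theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
V = np.arange(3000.0).reshape(1000, 3)
W = np.dot(V, R.T)
```

The unsupported case is 1000 *different* rotation matrices applied pairwise to 1000 vectors, which is what an arrays-of-matrices facility would buy.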
3. I'm pretty certain there's a third thing I wanted to mention, but it escapes me. It'll resurface at the most inopportune time....
Let's play with Chris Barker's RowVector and ColVector proposal for a moment. Let's suppose that we have four classes: Scalar, RowVector, ColVector and Matrix. They're all pretty much the same as today's array class except that:
1. They are displayed a little differently so that you can tell them apart.
2. __mul__ and __pow__ are treated differently.
Let's consider __mul__: when a RowVector is multiplied with a ColVector, the dot product is computed. If, for example, the arrays are both 2D, then they are treated as arrays of vectors and the dot product of each pair of vectors is computed in turn. I think broadcasting should work correctly, but I haven't thought that all the way through yet. Ignoring broadcasting for a moment, the rules are:
1. (n)D Scalar * (n)D Scalar => (n)D Scalar (elementwise)
2. (n)D RowVector * (n)D ColVector => (n-1)D Scalar (dot product)
3. (n+1)D Matrix * (n)D ColVector => (n)D ColVector (matrix-vector product)
4. (n)D Matrix * (n)D Matrix => (n)D Matrix (matrix product)
And ColVector * RowVector is a matrix? Or illegal. The latter, I suppose, if * is contraction on indices.
Other combinations would be an error. In principle you could do dyadic products, but I suspect we'd be better off using a separate function for that, since most of the time that would just indicate a mistake. Note that in this example Scalar is playing the role of the present-day array, which I've assumed has magically vanished from the scene somehow.
This looks like it gets most of the way towards where we want to be. There are some issues, though. For example, all of a sudden you want two different transpose operators: one that transposes matrices, flips ColVectors to RowVectors and leaves Scalars alone, and another that manipulates the rest of the array structure. There is probably other stuff like that too; it doesn't look insurmountable to me right now, however.
Still, I'm not sold on the whole RowVector/ColVector/Matrix approach. I have a gut feeling that we'd be better off with a single array class that was somehow tagged to indicate its type. Also, I keep hoping to come up with an elegant generalization of this that turns arrays into quasi-tensor objects where all the matrix behavior falls out naturally. Sadly, attempts in this direction keep collapsing under their own weight, but I'm still thinking about it.
I think a start could be made by adding a signature to arrays, indicating whether the corresponding indices are up or down. Up is column, down is row. In tensor parlance, matrices have signature (u,d). The matrix product sums an up with a down, contraction, so you get the usual rules. Note that row*col == col*row == inner product in this scheme, which may seem a bit odd, but transpose(row) := col as usual.

This doesn't solve the problem of containers of matrices, however. I think that matrices could be treated as objects with special, efficient storage in arrays, and that way the element-wise operators will do what you need using broadcasting. Essentially the last two/one dimensions are reserved. This is a bit like matlab cell arrays, except the arrays must be homogeneous in the matrix type. Note that I'm making a distinction between 1xn matrices and row vectors. The former has signature (u,d), the latter (u,).

Chuck
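A toy version of the signature idea, to make the contraction rule concrete. The class and its fields are invented for illustration, and this toy only implements one of the two inner-product orders (contracting a trailing down index against a leading up index), not the full row*col == col*row symmetry described above:

```python
import numpy as np

# Arrays tagged with an up/down signature; '*' contracts the last
# 'd' (down/row) index of the left operand against the first 'u'
# (up/column) index of the right operand.  Purely illustrative.
class SigArray:
    def __init__(self, data, sig):
        self.a = np.asarray(data)
        self.sig = tuple(sig)        # e.g. ('u', 'd') for a matrix
    def __mul__(self, other):
        if self.sig[-1:] == ('d',) and other.sig[:1] == ('u',):
            out = np.tensordot(self.a, other.a,
                               axes=([self.a.ndim - 1], [0]))
            # the contracted pair disappears from the signature
            return SigArray(out, self.sig[:-1] + other.sig[1:])
        raise TypeError("signatures do not contract")

M = SigArray([[2.0, 0.0], [0.0, 3.0]], ('u', 'd'))  # matrix
c = SigArray([1.0, 1.0], ('u',))                    # column vector
r = SigArray([1.0, 2.0], ('d',))                    # row vector
```

Here M * c yields a column vector with signature ('u',) and r * c contracts all the way down to a scalar with the empty signature, which is the "matrix product sums an up with a down" rule.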
participants (5)

- Alan G Isaac
- Bill Spotz
- Charles R Harris
- Robert Cimrman
- Timothy Hochberg