[Numpy-discussion] matrix indexing question

Tue Mar 27 15:45:23 EDT 2007

Hello all,

I suspect my previous email did not contain the full chain of my  
reasoning, because I thought that some parts were basically obvious.  
My point was merely that given some pretty basic fundamental tenants  
of numpy, Alan's suggestions quickly lead to *serious issues* far  
worse than the original problem. Thanks Chris for making this more  
explicit; here I will do so further.

Let A be an array of arbitrary shape, and M be m-by-n matrix.

(1) A[i] equals by definition A[i,...]

This is a pretty fundamental part of numpy, right? That property (and  
its relatives) along with the broadcasting behavior, are in large  
part what make numpy so very expressive compared to matlab, etc.

I can write code in numpy to do multidimensional operations over  
arbitrary-shaped and arbitrary-dimensioned data that look *exactly*  
like the corresponding 1D operations. It's amazing, really, how  
numpy's "implicit indexing" (this property), broadcasting (sort of  
the inverse of this, really, where shape tuples are extended as  
necessary with implicit 1's, instead of slices being extended  
with :'s as necessary), and fancy indexing, allow complex operations  
to be written compactly, clearly, and in a fully-general manner.

I thought that point (1) was obvious, as it's a pretty fundamental  
part of numpy (and Numeric and Numarray before that). It's in  
absolutely no way a "trivial convenience". Of course, broadcasting  
and ufuncs, etc, contribute a lot more to expressiveness than this  
property, but it's part of the family.

(2) Thus, M[i] equals by definition M[i,:]. By the mathematical  
definition of matrices and for pragmatic reasons, M[i,:] is another  
matrix.

Here the trouble starts. Should matrices, not being arbitrary in  
their dimension, even have the ability to be indexed as M[i]? Or  
should they be completely different beasts than arrays? Which  
violates the "principle of least surprise" more?

Now, I think we all more or less agree that M[i,:] has to be a  
matrix, because otherwise the matrix class is pretty worthless for  
linear algebra, and doesn't operate according to the mathematical  
definition of a matrix. Indeed, this is one of the basic properties  
of the numpy matrix -- that, and that the multiplication operator is  
defined as matrix-multiplication. Remove that and there's almost no  
point in *having* a separate matrix class.

Want a use-case? Do the singular value decomposition, and get the  
product of the second-smallest singular vectors as Vt[:,-2] * U 
[-2,:]. (Things vaguely akin to this come up a lot.) You can copy the  
operation from any linear algebra text if the row- and column-vectors  
are treated as above. Otherwise you have to wonder whether that's  
supposed to be an outer or inner product, and use the corresponding  
operator -- and at that point, why was I using matrix classes anyway?

(3) If M[i] = M[i,:], and M[i,:] is a matrix, then the iteration  
behavior of matrices *has* to yield other matrices, unless Alan is  
willing to suggest that list(iter(m))[i] should have a different type  
than M[i] -- a clear absurdity. This is the point that I was trying  
to make previously, having mistakenly assumed that point (1) was  
clear to all, and that point (2) had been clearly established.

So, if we trace the reasoning of Alan's suggestion, coupled with  
these above properties, we get just that:
(a) Iteration over M should yield arrays.
(b) Ergo M[i] should be an array.
(c) Ergo M[i,:] should be an array.
(d) Ergo the matrix class should be identical to the array class  
except for overloaded multiplication, and the only way to get a 'row'  
or 'column' vector from a matrix that operates as a proper row or  
column vector should is M[i:i+1,:]. Whicy is a lot of typing to get  
the 'i-th' vector from the matrix.

I don't think these changes are worth it -- it basically discards the  
utility of the matrix class for linear algebra in order to make the  
matrix class more useful for general purpose data storage (a role  
already filled by the array type).

Here is another set of changes which would make matrices fully  
consistent with their linear-algebra roots:
(a) M[i,:] should be a matrix.
(b) M[i] is an error.
(c) Iteration over M is an error.

That's kind of lame though, right? Because then matrices are  
completely inconsistent with their numpy roots.

So we are left with the current behavior:
(a) M[i,:] is a matrix.
(b) M[i] is a matrix.
(c) Iteration over M yields matrices.

My view is that, fundamentally, they are linear algebra constructs.  
However, I also think it would be far more harmful to remove basic  
behavior common to the rest of numpy (e.g. that M[i] = M[i,:]), and  
behavior that comes along with that (iterability). Hence my  
suggestion: treat matrices like linear algebra constructs -- don't  
iterate over them (unless you know what you're doing). Don't index  
them like M[i] (unless you know what you're doing).

Maybe it's "just a little bizarre" to suggest this, but I think the  
other options above are much more bizarre.

Zach