[Numpy-discussion] matrix indexing question
Zachary Pincus
zpincus at stanford.edu
Tue Mar 27 15:45:23 EDT 2007
Hello all,
I suspect my previous email did not contain the full chain of my
reasoning, because I thought that some parts were basically obvious.
My point was merely that given some pretty basic fundamental tenants
of numpy, Alan's suggestions quickly lead to *serious issues* far
worse than the original problem. Thanks Chris for making this more
explicit; here I will do so further.
Let A be an array of arbitrary shape, and M be m-by-n matrix.
(1) A[i] equals by definition A[i,...]
This is a pretty fundamental part of numpy, right? That property (and
its relatives) along with the broadcasting behavior, are in large
part what make numpy so very expressive compared to matlab, etc.
I can write code in numpy to do multidimensional operations over
arbitrary-shaped and arbitrary-dimensioned data that look *exactly*
like the corresponding 1D operations. It's amazing, really, how
numpy's "implicit indexing" (this property), broadcasting (sort of
the inverse of this, really, where shape tuples are extended as
necessary with implicit 1's, instead of slices being extended
with :'s as necessary), and fancy indexing, allow complex operations
to be written compactly, clearly, and in a fully-general manner.
I thought that point (1) was obvious, as it's a pretty fundamental
part of numpy (and Numeric and Numarray before that). It's in
absolutely no way a "trivial convenience". Of course, broadcasting
and ufuncs, etc, contribute a lot more to expressiveness than this
property, but it's part of the family.
(2) Thus, M[i] equals by definition M[i,:]. By the mathematical
definition of matrices and for pragmatic reasons, M[i,:] is another
matrix.
Here the trouble starts. Should matrices, not being arbitrary in
their dimension, even have the ability to be indexed as M[i]? Or
should they be completely different beasts than arrays? Which
violates the "principle of least surprise" more?
Now, I think we all more or less agree that M[i,:] has to be a
matrix, because otherwise the matrix class is pretty worthless for
linear algebra, and doesn't operate according to the mathematical
definition of a matrix. Indeed, this is one of the basic properties
of the numpy matrix -- that, and that the multiplication operator is
defined as matrix-multiplication. Remove that and there's almost no
point in *having* a separate matrix class.
Want a use-case? Do the singular value decomposition, and get the
product of the second-smallest singular vectors as Vt[:,-2] * U
[-2,:]. (Things vaguely akin to this come up a lot.) You can copy the
operation from any linear algebra text if the row- and column-vectors
are treated as above. Otherwise you have to wonder whether that's
supposed to be an outer or inner product, and use the corresponding
operator -- and at that point, why was I using matrix classes anyway?
(3) If M[i] = M[i,:], and M[i,:] is a matrix, then the iteration
behavior of matrices *has* to yield other matrices, unless Alan is
willing to suggest that list(iter(m))[i] should have a different type
than M[i] -- a clear absurdity. This is the point that I was trying
to make previously, having mistakenly assumed that point (1) was
clear to all, and that point (2) had been clearly established.
So, if we trace the reasoning of Alan's suggestion, coupled with
these above properties, we get just that:
(a) Iteration over M should yield arrays.
(b) Ergo M[i] should be an array.
(c) Ergo M[i,:] should be an array.
(d) Ergo the matrix class should be identical to the array class
except for overloaded multiplication, and the only way to get a 'row'
or 'column' vector from a matrix that operates as a proper row or
column vector should is M[i:i+1,:]. Whicy is a lot of typing to get
the 'i-th' vector from the matrix.
I don't think these changes are worth it -- it basically discards the
utility of the matrix class for linear algebra in order to make the
matrix class more useful for general purpose data storage (a role
already filled by the array type).
Here is another set of changes which would make matrices fully
consistent with their linear-algebra roots:
(a) M[i,:] should be a matrix.
(b) M[i] is an error.
(c) Iteration over M is an error.
That's kind of lame though, right? Because then matrices are
completely inconsistent with their numpy roots.
So we are left with the current behavior:
(a) M[i,:] is a matrix.
(b) M[i] is a matrix.
(c) Iteration over M yields matrices.
My view is that, fundamentally, they are linear algebra constructs.
However, I also think it would be far more harmful to remove basic
behavior common to the rest of numpy (e.g. that M[i] = M[i,:]), and
behavior that comes along with that (iterability). Hence my
suggestion: treat matrices like linear algebra constructs -- don't
iterate over them (unless you know what you're doing). Don't index
them like M[i] (unless you know what you're doing).
Maybe it's "just a little bizarre" to suggest this, but I think the
other options above are much more bizarre.
Zach
More information about the NumPy-Discussion
mailing list