[Numpy-discussion] untenable matrix behavior in SVN
Christopher Barker
Chris.Barker at noaa.gov
Fri Apr 25 15:32:44 EDT 2008
Alan G Isaac wrote:
> Please return 1d arrays in response to scalar
> indexing as the provisional arrangement.
+1 (for the provisional solution)
This has clarified it a bit for me. The current situation allows one to
create "row vectors" and "column vectors" by generating matrices (in
various ways) that happen to be shape (1,n) or (n,1). These objects
behave as we all want in arithmetic operations (*, **). So what's missing:
*****
The ability to index one of these "vector" objects by a single index,
and get a scalar as a result.
*****
If I'm not mistaken (and I probably am) -- that is the crux of this
entire conversation.
Do we want that? I would say yes, it's a pretty natural and common thing
to want.
Note that with the current version of matrix (numpy 1.1.0rc1), you can
index an (n,1) matrix with a scalar and get a (1,1) matrix, but you get
an exception if you try to index a (1,n) matrix with any scalar other than 0:
>>> rv
matrix([[3, 4, 5]])
>>> rv[1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/defmatrix.py", line 228, in __getitem__
    out = N.ndarray.__getitem__(self, index)
IndexError: index out of bounds
I know why that's happening, but I think it's a demonstration of why we
might want some sort of "vector" object, for a cleaner API.
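For reference, the asymmetry described above can be reproduced with a short script. This is only a sketch: it assumes a NumPy release that still ships np.matrix, and the names rv, cv and row_error are just illustrative.

```python
import numpy as np

rv = np.matrix([[3, 4, 5]])       # "row vector": shape (1, 3)
cv = np.matrix([[3], [4], [5]])   # "column vector": shape (3, 1)

# Scalar indexing of the column picks a row, which comes back as a
# (1, 1) matrix rather than a scalar.
col_item = cv[1]

# Scalar indexing of the row also picks a row, but there is only one,
# so any index other than 0 is out of bounds.
try:
    rv[1]
    row_error = None
except IndexError as exc:
    row_error = exc

# The only way to get a true scalar is to give both indices.
scalar = rv[0, 1]
```

Neither form gives a scalar from a single index, which is the gap a "vector" object would fill.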
How do we achieve this?
One possibility is to special case (n,1) and (1,n) matrices, but I think
there is consensus that that is a BAD IDEA (tm)
Which leaves us with needing a "vector" object -- which acts a lot like
a 1-d array, except that it acts like an (n,1) or (1,n) matrix with * and
**, and may get printed differently.
I think there are two ways to accomplish this:
(1) RowVector or ColumnVector objects that are essentially 1-d
ndarrays, with a few methods overridden.
(2) A Vector object that is a matrix of either (n,1) or (1,n) in
shape, with a couple methods overridden.
I think the difference is an implementation detail. They should behave
about the same either way.
In either case, we should probably have factory functions for these
objects: RowVector() and ColumnVector()
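To make option (1) a bit more concrete, here is a rough sketch of what such objects might look like as ndarray subclasses. Nothing like this exists in numpy; the class names follow the factory names suggested above, and the choice of which products to special-case in __mul__ is my own assumption, not an agreed design. Only * is handled, and only for a few operand types.

```python
import numpy as np


class RowVector(np.ndarray):
    """Hypothetical 1-d vector that multiplies like a (1,n) matrix."""

    def __new__(cls, data):
        # Flatten the input and reinterpret it as this subclass.
        return np.asarray(data).reshape(-1).view(cls)

    def __mul__(self, other):
        a = np.asarray(self)
        if isinstance(other, ColumnVector):
            # row * column: inner product, a scalar
            return np.dot(a, np.asarray(other))
        if isinstance(other, np.ndarray) and other.ndim == 2:
            # row * matrix: another row
            return RowVector(np.dot(a, np.asarray(other)))
        # Anything else (e.g. a scalar) falls back to elementwise multiply.
        return np.ndarray.__mul__(self, other)


class ColumnVector(np.ndarray):
    """Hypothetical 1-d vector that multiplies like an (n,1) matrix."""

    def __new__(cls, data):
        return np.asarray(data).reshape(-1).view(cls)

    def __mul__(self, other):
        if isinstance(other, RowVector):
            # column * row: outer product, an (n,n) matrix
            return np.asmatrix(np.outer(np.asarray(self), np.asarray(other)))
        return np.ndarray.__mul__(self, other)


rv = RowVector([3, 4, 5])
cv = ColumnVector([1, 2, 3])

item = rv[1]       # single index gives a scalar, as with any 1-d array
inner = rv * cv    # scalar
outer = cv * rv    # (3, 3) matrix
```

Because the objects are 1-d underneath, scalar indexing comes for free; the work is all in making * (and, in a fuller version, ** and matrix * vector) respect the orientation.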
By the way, I think this brings us closer to the goal of matrices
"acting as much like arrays as possible" -- when you index into a 2-d
array: A[:,i], you get a 1-d array. These "vectors" would act a lot
like 1-d arrays, so that would be more similar than the current situation.
It also might be nice to have iterators over rows and columns, like:
    for row in M.rows:
        ...

and

    for col in M.columns:
        ...
We can get the rows as the default iterator, but it feels nice and
symmetric to be able to iterate over columns too (though I have no idea
how useful that would really be)
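Until something like M.rows and M.columns exists, the same effect can be had with two small generator functions. This is a sketch only: rows() and columns() are hypothetical helper names, not part of numpy, and they simply slice the matrix so each piece keeps its 2-d shape.

```python
import numpy as np


def rows(M):
    """Yield each row of a 2-d matrix as a (1, n) slice."""
    for i in range(M.shape[0]):
        yield M[i, :]


def columns(M):
    """Yield each column of a 2-d matrix as an (m, 1) slice."""
    for j in range(M.shape[1]):
        yield M[:, j]


M = np.matrix([[1, 2, 3],
               [4, 5, 6]])

row_shapes = [r.shape for r in rows(M)]
col_shapes = [c.shape for c in columns(M)]
```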
Stéfan van der Walt wrote:
> The whole idea of matrices was that you always
> work with 2-dimensional arrays, even when you just extract a row.
> Unless you have a proper hierarchical container (as illustrated in the
> other patch I sent), returning a 1D array breaks at least some
> expectation.
I agree -- stopgap is OK (but having people do M.A[i] is a fine stopgap
too), but if we really want this, we need some sort of "vector" object.
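For anyone who has not used it, the M.A[i] stopgap looks like this (a small sketch; the variable names are only for illustration):

```python
import numpy as np

M = np.matrix([[1, 2, 3],
               [4, 5, 6]])

row_as_matrix = M[1]     # still 2-d: shape (1, 3)
row_as_array = M.A[1]    # .A is the plain ndarray view, so this is 1-d
element = M.A[1][2]      # scalar indexing works on the 1-d row
```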
> If that's what the matrix users want, though, then we
> should change it.
I still want to see some linear algebra examples -- that's what matrices
are for. We can iterate over 2-d arrays and get 1-d arrays just fine
now, if that's what you want.
> Unless we agree on something, and soon, that won't happen. Your
> workaround would break x[0] == x[0,:], so we're just swapping one set
> of broken functionality for another.
Alan G Isaac wrote:
> But then we MUST eventually have x[0] != x[0,:].
I don't think so -- I think a Vector object would allow both of:
    M[i,j] == M[i][j]
and
    M[i] == M[i,:]
however,
    M[i] != M[:,i] -- M[i] can only mean one thing, and this doesn't hold
    for arrays, either.
and (if we do the iterators):
    M.rows[i] == M[i,:]
    M.columns[i] == M[:,i]
You can get rows a bit more conveniently than columns, which is really
just an accident of how C arranges memory, but why not? Again, to make
this "pure" we'd disallow M[i], but practicality beats purity, after all.
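As a sanity check on which of these identities hold today, here is a quick sketch comparing plain 2-d arrays with the current matrix class (assuming a NumPy that still provides np.matrix; the variable names are illustrative):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
i, j = 0, 2

# For plain arrays the first two identities hold, the third does not.
same_element = bool(A[i, j] == A[i][j])
same_row = np.array_equal(A[i], A[i, :])
row_is_column = np.array_equal(A[i], A[:, i])

# For matrices, M[i] is a (1, 3) matrix, so chaining [j] indexes rows
# again and fails for j > 0.
M = np.matrix(A)
try:
    M[i][j]
    chained_ok = True
except IndexError:
    chained_ok = False
```

So arrays already satisfy M[i,j] == M[i][j] and M[i] == M[i,:]; it is only the matrix class that breaks the first one.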
> how would you
> index into a vector (as in http://en.wikipedia.org/wiki/Vector_(spatial)) without it?
Right -- which is why a 1-d "vector" object has its uses.
Alan G Isaac wrote:
> Since a current behavior must disappear eventually, we
> should make it disappear as soon as possible: before the
> release. The question is how. I see two simple ways to
> move forward immediately:
>
> 1. scalar indexing of matrices returns a 1d array
> 2. scalar indexing of matrices raises a TypeError
I don't agree. What we have is what we have: not changing it will not
break any existing code, so we should only make a change if it moves us
closer to what we want in the future. It does seem that most folks do
want M[i] to return some sort of 1-d object, so I'm fine with (1). I'm
also fine with (2), but I'm pretty sure that one was laughed out of the
discussion a good while back!
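For the sake of discussion, option (1) could be prototyped outside of numpy with a small subclass. This is only a sketch under my own assumptions: ProvisionalMatrix is a made-up name, and it overrides __getitem__ for bare integer indices only, leaving every other kind of indexing to np.matrix.

```python
import numpy as np


class ProvisionalMatrix(np.matrix):
    """Hypothetical matrix whose scalar indexing returns a 1-d array."""

    def __getitem__(self, index):
        if isinstance(index, (int, np.integer)):
            # Option (1): hand back the row as a plain 1-d ndarray.
            return np.asarray(self)[index]
        # Tuples, slices, etc. keep the usual matrix behavior.
        return np.matrix.__getitem__(self, index)


M = ProvisionalMatrix([[1, 2, 3],
                       [4, 5, 6]])

row = M[1]          # 1-d array
element = M[1][2]   # now agrees with M[1, 2]
row_2d = M[1, :]    # still a (1, 3) matrix
```

Option (2) would be the same override with a raise TypeError in place of the return. Note that this sketch gives up M[i] == M[i,:], which is exactly the trade-off being argued about.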
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov