[Numpy-discussion] untenable matrix behavior in SVN

Fri Apr 25 15:32:44 EDT 2008

Alan G Isaac wrote:
> Please return 1d arrays in response to scalar
> indexing as the provisional arrangement.

+1 (for the provisional solution)

This has clarified it a bit for me. The current situation allows one to 
create "row vectors" and "column vectors" by generating matrices (in 
various ways) that happen to be shape (1,n) or (n,1). These objects 
behave as we all want in arithmetic operations (*, **). So what's missing:

*****
The ability to index one of these "vector" objects by a single index, 
and get a scalar as a result.
*****

If I'm not mistaken (and I probably am) -- that is the crux of this 
entire conversation.

Do we want that? I would say yes, it's a pretty natural and common thing 
to want.

Note that with the current version of matrix (numpy 1.1.0rc1), you can 
index an (n,1) matrix with a scalar and get a (1,1) matrix, but you get 
an exception if you try to reach an element of a (1,n) matrix the same way:

 >>> rv
matrix([[3, 4, 5]])
 >>> rv[1]
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/defmatrix.py", line 228, in __getitem__
     out = N.ndarray.__getitem__(self, index)
IndexError: index out of bounds

I know why that's happening, but I think it's a demonstration of why we 
might want some sort of "vector" object, for a cleaner API.
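
For anyone who wants to poke at this, here is a small sketch of the asymmetry. It was written against a later NumPy release rather than 1.1.0rc1, on the assumption that np.matrix behaves the same way in these three cases; the variable names are just for illustration.

```python
import numpy as np

rv = np.matrix([[3, 4, 5]])   # shape (1, 3), a "row vector"
cv = rv.T                     # shape (3, 1), a "column vector"

# A scalar index selects along the first axis, so indexing the
# column gives back a (1,1) matrix rather than a scalar.
elem = cv[1]

# The row has only one entry along the first axis, so any index
# past 0 is out of bounds.
try:
    rv[1]
    row_index_failed = False
except IndexError:
    row_index_failed = True

# Two indices are the way to get a true scalar out of either one.
scalar = rv[0, 1]
```

So the same "give me element 1" request means three different things depending on the orientation and on how many indices you pass.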

How do we achieve this?

One possibility is to special-case (n,1) and (1,n) matrices, but I think 
there is consensus that that is a BAD IDEA (tm).

Which leaves us with needing a "vector" object -- which acts a lot like 
a 1-d array, except that it acts like an (n,1) or (1,n) matrix with * and 
**, and may get printed differently.

I think there are two ways to accomplish this:
   (1) RowVector or ColumnVector objects that are essentially 1-d 
ndarrays, with a few methods overridden.

   (2) A Vector object that is a matrix of either (n,1) or (1,n) in 
shape, with a couple methods overridden.

I think the difference is an implementation detail. They should behave 
about the same either way.

In either case, we should probably have factory functions for these 
objects: RowVector() and ColumnVector()
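
To make option (1) concrete, here is a rough sketch, not a proposal for the actual implementation. RowVector and ColumnVector are the names used above (written here as classes rather than factory functions); the _Vector base class and the as_matrix() method are made up for illustration. Only * with the vector on the left is handled; ** and matrix-on-the-left are left out.

```python
import numpy as np

class _Vector(np.ndarray):
    """Hypothetical base class: a 1-d array that multiplies like a matrix."""
    _column = False  # overridden by ColumnVector

    def __new__(cls, data):
        # Flatten whatever we are given into a 1-d array of this class.
        return np.asarray(data).reshape(-1).view(cls)

    def as_matrix(self):
        # (n,1) for a column vector, (1,n) for a row vector.
        flat = np.asarray(self)
        shape = (-1, 1) if self._column else (1, -1)
        return np.asmatrix(flat.reshape(shape))

    def __mul__(self, other):
        # Only the vector-on-the-left case is handled in this sketch.
        if isinstance(other, _Vector):
            other = other.as_matrix()
        return self.as_matrix() * other

class RowVector(_Vector):
    _column = False

class ColumnVector(_Vector):
    _column = True

r = RowVector([1, 2, 3])
c = ColumnVector([4, 5, 6])
second = r[1]    # scalar indexing gives a scalar
inner = r * c    # (1,3) * (3,1) -> (1,1) matrix
outer = c * r    # (3,1) * (1,3) -> (3,3) matrix
```

Option (2) would store the (n,1) or (1,n) matrix directly and override indexing instead of multiplication; from the outside the two should look about the same.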

By the way, I think this brings us closer to the goal of matrices 
"acting as much like arrays as possible" -- when you index into a 2-d 
array:  A[:,i], you get a 1-d array. These "vectors" would act a lot 
like 1-d arrays, so that would be more similar than the current situation.

It also might be nice to have iterators over rows and columns, like:

for row in M.rows:
   ...
and
for col in M.columns:
   ...

We already get the rows from the default iterator, but it feels nice and 
symmetric to be able to iterate over columns too (though I have no idea 
how useful that would really be).
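
Something close to this can be had today with plain functions. The sketch below is only an approximation of the idea: rows() and columns() are hypothetical helpers, not existing NumPy attributes, and they yield (1,n) and (n,1) matrices rather than any new "vector" type.

```python
import numpy as np

def rows(M):
    """Yield each row of M as a (1,n) matrix (hypothetical helper)."""
    for i in range(M.shape[0]):
        yield M[i, :]

def columns(M):
    """Yield each column of M as an (n,1) matrix (hypothetical helper)."""
    for j in range(M.shape[1]):
        yield M[:, j]

M = np.matrix([[1, 2], [3, 4], [5, 6]])
row_shapes = [row.shape for row in rows(M)]
col_shapes = [col.shape for col in columns(M)]
```

With M.rows and M.columns as real attributes, indexing like M.columns[i] would need a small wrapper object rather than a generator.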

Stéfan van der Walt wrote:
> The whole idea of matrices was that you always
> work with 2-dimensional arrays, even when you just extract a row.
> Unless you have a proper hierarchical container (as illustrated in the
> other patch I sent), returning a 1D array breaks at least some
> expectation.

I agree -- stopgap is OK (but having people do M.A[i] is a fine stopgap 
too), but if we really want this, we need some sort of "vector" object.

> If that's what the matrix users want, though, then we
> should change it.

I still want to see some linear algebra examples -- that's what matrices 
are for. We can iterate over 2-d arrays and get 1-d arrays just fine 
now, if that's what you want.

> Unless we agree on something, and soon, that won't happen.  Your
> workaround would break x[0] == x[0,:], so we're just swapping one set
> of broken functionality for another.

Alan G Isaac wrote:
> But then we MUST eventually have x[0] != x[0,:].

I don't think so -- I think a Vector object would allow both of:

M[i,j] == M[i][j]
and
M[i] == M[i,:]

however,

M[i] != M[:,i] -- M[i] can only mean one thing, and this doesn't hold 
for arrays, either.

and (if we do the iterators):

M.rows[i] == M[i,:]
M.columns[i] == M[:,i]

You can get rows a bit more conveniently than columns, which is really 
just an accident of how C arranges memory, but why not? Again, to make 
this "pure" we'd disallow M[i], but practicality beats purity, after all.
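
The identities above can be checked against what we have now. This is a sketch run on a later NumPy release, assuming np.matrix indexing still works as described in this thread; the flag names are just for illustration.

```python
import numpy as np

A = np.arange(12).reshape(3, 4)
M = np.asmatrix(A)
i, j = 1, 2

# For a plain 2-d array both identities hold.
array_elem_ok = A[i, j] == A[i][j]
array_row_ok = np.array_equal(A[i], A[i, :])

# M[i] and M[i,:] agree for a matrix too (both are (1,4))...
matrix_row_ok = np.array_equal(M[i], M[i, :])

# ...but M[i][j] indexes the first axis of a (1,4) matrix again and fails.
try:
    M[i][j]
    matrix_elem_ok = True
except IndexError:
    matrix_elem_ok = False

# Row i and column i are different things for arrays as well.
row_is_col = np.array_equal(A[i], A[:, i])
```

So the only identity the current matrix breaks is M[i,j] == M[i][j], which is exactly the one a "vector" object would restore.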

> how would you
> index into a vector (as in http://en.wikipedia.org/wiki/Vector_(spatial)) without it?

Right -- which is why a 1-d "vector" object has its uses.

Alan G Isaac wrote:
> Since a current behavior must disappear eventually, we 
> should make it disappear as soon as possible: before the 
> release.  The question is how.  I see two simple ways to 
> move forward immediately:
> 
>         1. scalar indexing of matrices returns a 1d array
>         2. scalar indexing of matrices raises a TypeError

I don't agree -- what we have is what we have, and leaving it alone will 
not break any existing code, so we should only make a change if it moves 
us closer to what we want in the future. It does seem that most folks do 
want M[i] to return some sort of 1-d object, so I'm fine with (1). I'm 
also fine with (2), but I'm pretty sure that one was laughed out of the 
discussion a good while back!
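
For reference, option (1) amounts to something like the following. This is only an illustration, not the patch under discussion: ProvisionalMatrix is a made-up subclass name, and the sketch assumes that overriding __getitem__ on np.matrix is enough for the simple cases shown.

```python
import numpy as np

class ProvisionalMatrix(np.matrix):
    """Hypothetical subclass illustrating option 1; not the SVN patch."""

    def __getitem__(self, index):
        # A lone integer index returns the row as a plain 1-d array.
        if isinstance(index, (int, np.integer)):
            return np.asarray(self)[index]
        # Everything else keeps the usual 2-d matrix behaviour.
        return np.matrix.__getitem__(self, index)

P = ProvisionalMatrix([[1, 2], [3, 4]])
row = P[0]        # 1-d array
elem = P[0][1]    # scalar, so P[i][j] matches P[i, j]
sub = P[0, :]     # still a (1,2) matrix
```

Note that P[0] and P[0,:] now have different shapes, which is the x[0] == x[0,:] breakage Stéfan pointed out.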

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov