
By default, it looks like a 1-dim ndarray gets converted to a row vector by the matrix constructor. This seems to lead to some odd behavior, such as a[1] yielding the 2nd element when a is an ndarray but raising an IndexError when a is a matrix. Is it possible to set a flag to make the default be a column vector?
Thanks,
Jason

Jason Rennie-2 wrote:
By default, it looks like a 1-dim ndarray gets converted to a row vector by the matrix constructor. This seems to lead to some odd behavior, such as a[1] yielding the 2nd element when a is an ndarray but raising an IndexError when a is a matrix. Is it possible to set a flag to make the default be a column vector?
I'm not aware of any option that changes the construction behavior of "matrix". I honestly don't work with matrices very often - I just stick with arrays.
".T" will do what you want of course: x = matrix(range(10)).T
As for odd behavior, can you say what in particular is odd about it? For the example that you mention, let's consider it in greater depth:
x = np.array(range(5));  x[1]   --> 1
x = np.matrix(range(5)); x[1]   --> raises IndexError "index out of bounds"
So it appears that with *matrices* as opposed to arrays, indexing with just 1 index has an implicit ":" as a 2nd index. This is distinctly different from the behavior for *arrays*, where indexing with just 1 index returns a new nd-array with one fewer dimensions. I think it boils down to this: a matrix always has 2 dimensions, so indexing into it (whether with one or both indices specified) returns another 2D matrix; with arrays, which are ND, indexing with just a single (integer) index returns an N-1 D array. When I look at it like this, numpy's IndexError is exactly what I would expect (given that it creates a row vector).
If you are going to use matrix in numpy, you must realize that x[n] for matrix x is equivalent to x[n,:].
Maybe my reluctance to work with matrices stems from this kind of inconsistency. It seems like your code has to be all matrix, or all array - and if you mix them, you need to be very careful about which is which. It is just easier for me to work only with array, and when needing to do matrix stuff just call "dot". Come to think of it, why doesn't 1/x for matrix x invert the matrix like in MATLAB? How about x\1? (Yeah, I know I'm pushing it now :-)
As for adding an option that changes the behavior of matrix creation to be "transposed", I don't like it since the code will now do different things depending on whether the option is set. So, to write "safe" code you would have to check the option each time.
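A minimal session making the indexing difference above concrete (the values are only for illustration):
>>> import numpy as np
>>> a = np.array(range(5))
>>> a[1]                        # arrays drop a dimension: the 2nd element
1
>>> m = np.matrix(range(5))     # a matrix is always 2-D, here shape (1, 5)
>>> m[0]                        # same as m[0,:]: still a 1x5 matrix
matrix([[0, 1, 2, 3, 4]])
>>> m[1]                        # same as m[1,:]: there is no 2nd row
Traceback (most recent call last):
  ...
IndexError: index out of bounds
>>> m[0, 1]                     # give both indices to get a single element
1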

On 24-May-09, at 8:32 AM, Tom K. wrote:
Maybe my reluctance to work with matrices stems from this kind of inconsistency. It seems like your code has to be all matrix, or all array - and if you mix them, you need to be very careful about which is which.
Also, functions called on things of type matrix may not return a matrix as expected, but rather an array.
Anecdotally, it seems to me that lots of people (myself included) seem to go through a phase early in their use of NumPy where they try to use matrix(), but most seem to end up switching to using 2D arrays for all the aforementioned reasons.
David

Thanks for the responses. I did not realize that dot() would do matrix multiplication, which was the main reason I was looking for a matrix-like class. Like you and Tom suggested, I think it's best to stick to arrays.
Cheers,
Jason

On Sun, May 24, 2009 at 3:45 PM, David Warde-Farley dwf@cs.toronto.edu wrote:
Anecdotally, it seems to me that lots of people (myself included) seem to go through a phase early in their use of NumPy where they try to use matrix(), but most seem to end up switching to using 2D arrays for all the aforementioned reasons.
Maybe announcing that numpy will drop support for matrices in a future version (3.0, ...) would save a lot of pain in the long run.

On Sun, May 24, 2009 at 3:45 PM, David Warde-Farley dwf@cs.toronto.edu wrote:
Anecdotally, it seems to me that lots of people (myself included) seem to go through a phase early in their use of NumPy where they try to use matrix(), but most seem to end up switching to using 2D arrays for all the aforementioned reasons.
On 6/4/2009 11:46 AM Keith Goodman apparently wrote:
Maybe announcing that numpy will drop support for matrices in a future version (3.0, ...) would save a lot of pain in the long run.
Only if you want NumPy to be unusable when teaching basic linear algebra. I believe that would damage the speed at which NumPy use spreads.
Alan Isaac

I really don't see any advantage of matrices over arrays for teaching. I prefer to teach linear algebra with arrays. I would also like matrices to disappear from numpy. But then one would need a new implementation of scipy.sparse, which is (very unfortunately) matrix-based at the moment.
== Olivier

On 6/4/2009 12:08 PM Olivier Verdier apparently wrote:
I really don't see any advantage of matrices over arrays for teaching. I prefer to teach linear algebra with arrays.
beta = (X.T*X).I * X.T * Y
beta = np.dot(np.dot(la.inv(np.dot(X.T,X)),X.T),Y)
I rest my case.
I would have to switch back to GAUSS for teaching if NumPy discarded matrices.
Cheers, Alan Isaac
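For readers who want to run the comparison above, here is a self-contained version of both lines (the shapes of X and Y are made up for illustration, and la stands for numpy.linalg):
import numpy as np
import numpy.linalg as la

X = np.random.rand(20, 3)                     # 20 observations, 3 regressors
Y = np.random.rand(20, 1)

Xm, Ym = np.matrix(X), np.matrix(Y)
beta_matrix = (Xm.T * Xm).I * Xm.T * Ym       # the matrix one-liner

beta_array = np.dot(np.dot(la.inv(np.dot(X.T, X)), X.T), Y)   # the array version

print(np.allclose(beta_matrix, beta_array))   # True: same numbers, different notation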

Keith Goodman wrote:
Maybe announcing that numpy will drop support for matrices in a future version (3.0, ...) would save a lot of pain in the long run.
Or make them better. There was a pretty good discussion of this a while back on this list. We all had a lot of opinions, and there were some good ideas in that thread. However, no one stepped up to implement any of it.
I think the reason is that none of the core numpy developers use them/want them. In fact, many of those contributing to the discussion (myself included), didn't think it likely that they'd use them, even with improvements.
Someone that thinks they are important needs to step up and really make them work.
-Chris

On Jun 4, 2009, at 5:25 PM, Christopher Barker wrote:
Keith Goodman wrote:
Maybe announcing that numpy will drop support for matrices in a future version (3.0, ...) would save a lot of pain in the long run.
Or make them better. There was a pretty good discussion of this a while back on this list. We all had a lot of opinions, and there were some good ideas in that thread. However, no one stepped up to implement any of it.
I think the reason is that none of the core numpy developers use them/want them. In fact, many of those contributing to the discussion (myself included), didn't think it likely that they'd use them, even with improvements.
Someone that thinks they are important needs to step up and really make them work.
Or the core development team split the matrices out of numpy and make it a separate package that the people that use them could pick up and run with.
Cheers Tommy

On 6/4/2009 5:27 PM Tommy Grav apparently wrote:
Or the core development team split the matrices out of numpy and make it a separate package that the people that use them could pick up and run with.
This too would be a mistake, I believe. But it depends on whether a goal is to have more people use NumPy. I believe the community will gain from growth.
My core concern here is keeping NumPy very friendly for teaching. This will mean keeping a matrix object in NumPy. For this purpose, I have found the existing matrix object to be adequate. (I am teaching economics students, who generally do not have prior programming experience.)
I believe that this is crucial for "recruiting" new users, who might otherwise choose less powerful tools that appear to be more friendly on first encounter.
In sum, my argument is this: Keeping a matrix object in NumPy has substantial benefits in encouraging growth of the NumPy community, and as far as I can tell, it is imposing few costs. Therefore I think there is a very substantial burden on people who propose removing the matrix object to demonstrate just how the NumPy community will benefit from this change.
Alan Isaac

On Jun 4, 2009, at 5:41 PM, Alan G Isaac wrote:
On 6/4/2009 5:27 PM Tommy Grav apparently wrote:
Or the core development team split the matrices out of numpy and make it a separate package that the people that use them could pick up and run with.
This too would be a mistake, I believe. But it depends on whether a goal is to have more people use NumPy. I believe the community will gain from growth.
In sum, my argument is this: Keeping a matrix object in NumPy has substantial benefits in encouraging growth of the NumPy community, and as far as I can tell, it is imposing few costs. Therefore I think there is a very substantial burden on people who propose removing the matrix object to demonstrate just how the NumPy community will benefit from this change.
This is a perfectly valid argument. I am actually quite happy with the numpy package as it is (I work in astronomy), I was just pointing out that if there are few of the core numpy people interested in maintaining or upgrading the matrix class one solution might be to make it a scipy-like package that easily can be installed on top of numpy, but where the code base might be more accessible to those that are interested in matrices, but feel that numpy is a daunting beast to tackle. Some sense of ownership of a matrixpy package might encourage more people to contribute.
Just an idea ;-) Tommy

I agree. It would be a good idea to have matrices out of numpy as a standalone package. Indeed, having matrices in the numpy core comes at a pedagogical cost. Newcomers (as I once was) do not know which to use. Matrix or array? It turns out that the vast majority of numpy/scipy modules use arrays, so arrays are the preferred way to go.
It would thus be clearer to have arrays in numpy and matrices available as an external package.
Besides, I think matrices can be pretty tricky when used for teaching. For instance, you have to explain that all the operators work component-wise, except the multiplication! Another caveat is that since matrices are always 2-D, the "scalar product" of two column vectors computed as "x.T * y" will not be a scalar, but a 1×1 matrix. There is also the fact that you must cast all your vectors to column/row matrices (as in matlab). For all these reasons, I prefer to use arrays and dot for teaching, and I have never had any complaints.
== Olivier

On Fri, Jun 5, 2009 at 11:38 AM, Olivier Verdier zelbier@gmail.com wrote:
I agree. It would be a good idea to have matrices out of numpy as a standalone package. Indeed, having matrices in the numpy core comes at a pedagogical cost. Newcomers (as I once was) do not know which to use. Matrix or array? It turns out that the vast majority of numpy/scipy modules use arrays, so arrays are the preferred way to go. It would thus be clearer to have arrays in numpy and matrices available as an external package. Besides, I think matrices can be pretty tricky when used for teaching. For instance, you have to explain that all the operators work component-wise, except the multiplication! Another caveat is that since matrices are always 2-D, the "scalar product" of two column vectors computed as "x.T * y" will not be a scalar, but a 1×1 matrix. There is also the fact that you must cast all your vectors to column/row matrices (as in matlab). For all these reasons, I prefer to use arrays and dot for teaching, and I have never had any complaints.
For anyone with an econometrics background in gauss or matlab, numpy arrays have a huge adjustment cost. I'm only using arrays for consistency, but my econometrics code is filled with arr[np.newaxis,:] .
I wouldn't recommend my friends switching from gauss to numpy/scipy when they want to write or develop some econometrics algorithm. gauss feels like copying the formula from the notes or a book. With numpy arrays I always have to recover one dimension, and I have to verify the shape of any array more often. In econometrics the default vector is often a 2d vector so x.T*x and x*x.T have very different meanings. python/numpy/scipy have a lot of other advantages compared to gauss or matlab, but the linear algebra syntax is not one of them.
So, I don't see any sense in removing matrices; you can just ignore them and tell your students to do so. Removing them would increase the initial switching cost, even if users who get more into it eventually switch to arrays.
Josef
(BTW: I think scipy.stats will soon become matrix friendly)
>>> X = np.matrix([[1,0],[1,1]])
>>> X.mean()
0.75
>>> X.mean(0)
matrix([[ 1. ,  0.5]])
>>> X - X.mean(0)
matrix([[ 0. , -0.5],
        [ 0. ,  0.5]])
>>> Y = np.array([[1,0],[1,1]])
>>> Y.mean(0).shape
(2,)
>>> Y - Y.mean(0)[np.newaxis,:]
array([[ 0. , -0.5],
       [ 0. ,  0.5]])
>>> np.matrix([[1,0],[1,1]])**2
matrix([[1, 0],
        [2, 1]])
>>> np.array([[1,0],[1,1]])**2
array([[1, 0],
       [1, 1]])
>>> np.matrix([[1,0],[1,1]])**(-1)
matrix([[ 1.,  0.],
        [-1.,  1.]])
>>> np.array([[1,0],[1,1]])**(-1)
array([[ 1, -2147483648],
       [ 1,           1]])
>>> x = np.matrix([[1],[0]])
>>> x.T*x
matrix([[1]])
>>> (x.T*x).shape
(1, 1)
>>> x*x.T
matrix([[1, 0],
        [0, 0]])
>>> (x*x.T).shape
(2, 2)
>>> y = np.array([1,0])
>>> np.dot(y,y)
1
>>> np.dot(y,y.T)
1
>>> np.dot(y[:,np.newaxis], y[np.newaxis,:])
array([[1, 0],
       [0, 0]])

On Fri, Jun 5, 2009 at 1:33 PM, Geoffrey Ely gely@usc.edu wrote:
On Jun 5, 2009, at 10:18 AM, josef.pktd@gmail.com wrote:
I'm only using arrays for consistency, but my econometrics code is filled with arr[np.newaxis,:] .
arr[None,:] is a lot cleaner in my opinion, especially when using "numpy" as the namespace.
It took me a long time to figure out that None and np.newaxis do the same thing. I find newaxis more descriptive and mostly stick to it. When I started looking at numpy, I was always wondering what these "None"s were doing in the code.
Just one more useful trick to preserve dimension: using slices instead of an index avoids arr[None,:], but requires more thinking.
Josef
>>> X
matrix([[1, 0],
        [1, 1]])
>>> Y
array([[1, 0],
       [1, 1]])
>>> X[0,:].shape == Y[0:1,:].shape
True
>>> X[0,:] == Y[0:1,:]
matrix([[ True,  True]], dtype=bool)

As someone who is very used to thinking in terms of matrices and who just went through the adjustment of translating matlab-like code to numpy, I found the current matrix module to be confusing. Its poor integration with the rest of numpy/scipy (in particular, scipy.optimize.fmin_cg) made it more difficult to use than it was worth. I'd rather have "matrix" and/or "matrix multiplication" sections of the documentation explain how to do typical, basic matrix operations with nparray, dot, T, and arr[None,:]. I think a matrix class would still be worthwhile for findability, but it should simply serve as documentation for how to do matrix stuff with nparray.
Cheers,
Jason

On 6/5/2009 11:38 AM Olivier Verdier apparently wrote:
I think matrices can be pretty tricky when used for teaching. For instance, you have to explain that all the operators work component-wise, except the multiplication! Another caveat is that since matrices are always 2-D, the "scalar product" of two column vectors computed as "x.T * y" will not be a scalar, but a 1×1 matrix. There is also the fact that you must cast all your vectors to column/row matrices (as in matlab). For all these reasons, I prefer to use arrays and dot for teaching, and I have never had any complaints.
I do not understand this "argument". You should take it very seriously when someone reports to you that the matrix object is crucial to them, e.g., as a teaching tool. Even if you do not personally find persuasive an example like http://mail.scipy.org/pipermail/numpy-discussion/2009-June/043001.html I have told you: this is important for my students. Reporting that your students do not complain about using arrays instead of matrices does not change this one bit.
Student backgrounds differ by domain of application. In economics, matrices are in *very* wide use, and multidimensional arrays get almost no use. Textbooks in econometrics (a huge and important field, even outside of economics) are full of proofs using matrix algebra. A close match to what the students see is crucial. When working with multiplication or exponentiation, matrices do what they expect, and 2d arrays do not.
One more point. As Python users we get used to installing a package here and a package there to add functionality. But this is not how most people looking for a matrix language see the world. Removing the matrix object from NumPy will raise the barrier to adoption by social scientists, and there should be a strongly persuasive reason before taking such a step.
Separately from all that, does anyone doubt that there is code that depends on the matrix object? The core objection to a past proposal for useful change was that it could break extant code. I would hope that nobody who took that position would subsequently propose removing the matrix object altogether.
Cheers, Alan Isaac
PS If x and y are "column vectors" (i.e., matrices), then x.T * y *should* be a 1×1 matrix. Since the * operator is doing matrix multiplication, this is the correct result, not an anomaly.

On Fri, Jun 5, 2009 at 11:30 AM, Alan G Isaac aisaac@american.edu wrote:
PS If x and y are "column vectors" (i.e., matrices), then x.T * y *should* be a 1×1 matrix. Since the * operator is doing matrix multiplication, this is the correct result, not an anomaly.
Well, one could argue that. The x.T is a member of the dual, hence maps vectors to the reals. Usually the reals aren't represented by 1x1 matrices. Just my [.02] cents.
Chuck

On 6/6/2009 12:41 AM Charles R Harris apparently wrote:
Well, one could argue that. The x.T is a member of the dual, hence maps vectors to the reals. Usually the reals aren't represented by 1x1 matrices. Just my [.02] cents.
Of course that same perspective could lead you to argue that an M×N matrix is for mapping N-vectors to M-vectors, not for doing matrix multiplication.
Matrix multiplication produces a matrix result **by definition**. Treating 1×1 matrices as equivalent to scalars is just a convenient anomaly in certain popular matrix-oriented languages.
Cheers, Alan Isaac

On Sat, Jun 6, 2009 at 9:29 AM, Alan G Isaac aisaac@american.edu wrote:
On 6/6/2009 12:41 AM Charles R Harris apparently wrote:
Well, one could argue that. The x.T is a member of the dual, hence maps vectors to the reals. Usually the reals aren't represented by 1x1 matrices. Just my [.02] cents.
Of course that same perspective could lead you to argue that an M×N matrix is for mapping N-vectors to M-vectors, not for doing matrix multiplication.
Matrix multiplication produces a matrix result **by definition**. Treating 1×1 matrices as equivalent to scalars is just a convenient anomaly in certain popular matrix-oriented languages.
So is eye(3)*(v.T*v) valid? If (v.T*v) is 1x1 you have incompatible dimensions for the multiplication, whereas if it is a scalar you can multiply eye(3) by it. The usual matrix algebra gets a bit confused here because it isn't clear about the distinction between inner products and the expression v.T*v which is typically used in its place. I think the only consistent way around this is to treat 1x1 matrices as scalars, which I believe matlab does, but then the expression eye(3)*(v.T*v) isn't associative and we lose some checks.
I don't think we can change the current matrix class, to do so would break too much code. It would be nice to extend it with an explicit inner product, but I can't think of any simple notation for it that python would parse.
Chuck

On Sat, Jun 6, 2009 at 11:03 AM, Charles R Harris charlesr.harris@gmail.com wrote:
I don't think we can change the current matrix class, to do so would break too much code. It would be nice to extend it with an explicit inner product, but I can't think of any simple notation for it that python would parse.
Maybe it's time to make another push on python-dev for the pep-225 stuff for other operators?
https://cirl.berkeley.edu/fperez/static/numpy-pep225/
Last year I got pretty much zero interest from python-dev on this, but they were very very busy with 3.0 on the horizon. Perhaps once they put 3.1 out would be a good time to champion this again.
It's slightly independent of the matrix class debate, but perhaps having special operators for real matrix multiplication could ease some of the bottlenecks of this discussion.
It would be great if someone could champion that discussion on python-dev though, I don't see myself finding the time for it another time around...
Cheers,
f

Fernando Perez wrote:
On Sat, Jun 6, 2009 at 11:03 AM, Charles R Harris charlesr.harris@gmail.com wrote:
I don't think we can change the current matrix class, to do so would break too much code. It would be nice to extend it with an explicit inner product, but I can't think of any simple notation for it that python would parse.
Maybe it's time to make another push on python-dev for the pep-225 stuff for other operators?
https://cirl.berkeley.edu/fperez/static/numpy-pep225/
Last year I got pretty much zero interest from python-dev on this, but they were very very busy with 3.0 on the horizon. Perhaps once they put 3.1 out would be a good time to champion this again.
It's slightly independent of the matrix class debate, but perhaps having special operators for real matrix multiplication could ease some of the bottlenecks of this discussion.
It would be great if someone could champion that discussion on python-dev though, I don't see myself finding the time for it another time around...
How about pep 211? http://www.python.org/dev/peps/pep-0211/
PEP 211 proposes a single new operator (@) that could be used for matrix multiplication. MATLAB has elementwise versions of multiply, exponentiation, and left and right division, written by putting a "." before the usual matrix operators (* ^ \ /). PEP 225 proposes "tilde" versions of + - * / % **.
While PEP 225 would allow a matrix exponentiation and right divide, I think these things are much less common than matrix multiply. Plus, I think following through with the PEP 225 implementation would create a frankenstein of a language that would be hard to read.
So, I would argue for pushing for a single new operator that can then be used to implement "dot" with a binary infix operator. We can resurrect PEP 211 or start a new PEP or whatever, the main thing is to have a proposal that makes sense. Actually, what do you all think of this:
@ --> matrix multiply
@@ --> matrix exponentiation
and we leave it at that - let's not get too greedy and try for matrix inverse via @/ or something.
For the nd array operator, I would propose taking the last dimension of the left array and "collapsing" it with the first dimension of the right array, so
(a0, ..., aL-1, k) @ (k, b0, ..., bM-1) --> (a0, ..., aL-1, b0, ..., bM-1)
Does that make sense?
With this proposal, matrices go away and all our lives are sane again. :-) Long live the numpy ndarray! Thanks to the creators for all your hard work BTW - I love this stuff!
- Tom K.
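The collapsing rule proposed above can already be checked against today's NumPy with np.tensordot, which with axes=1 contracts exactly the last axis of the left operand with the first axis of the right one (only a sketch of the shape rule, not a suggestion for how @ would be implemented):
import numpy as np

A = np.ones((2, 3, 4))            # shape (a0, a1, k) with k = 4
B = np.ones((4, 5, 6))            # shape (k, b0, b1)
C = np.tensordot(A, B, axes=1)    # contract last axis of A with first axis of B
print(C.shape)                    # (2, 3, 5, 6), i.e. (a0, a1, b0, b1)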

There would be a much simpler solution than allowing a new operator. Just allow the numpy function dot to take more than two arguments. Then A*B*C in matrix notation would simply be: dot(A,B,C)
with arrays. Wouldn't that make everybody happy? Plus it does not break backward compatibility. Am I missing something?
== Olivier
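A left-associative multi-argument dot is only a few lines on top of the existing dot; the helper name mdot below is made up for illustration:
import numpy as np
from functools import reduce   # reduce is a builtin in Python 2, in functools in Python 3

def mdot(*arrays):
    # mdot(A, B, C) == np.dot(np.dot(A, B), C), i.e. the same grouping as (A*B)*C
    return reduce(np.dot, arrays)

With such a helper, the array form of the earlier least-squares example would read beta = mdot(la.inv(mdot(X.T, X)), X.T, Y).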

On Sun, Jun 7, 2009 at 02:43, Olivier Verdier zelbier@gmail.com wrote:
There would be a much simpler solution than allowing a new operator. Just allow the numpy function dot to take more than two arguments. Then A*B*C in matrix notation would simply be: dot(A,B,C) with arrays. Wouldn't that make everybody happy? Plus it does not break backward compatibility. Am I missing something?
We've discussed it before. Search the archives. Although matrix multiplication is mathematically associative, there are performance and precision implications to the order in which the multiplications happen. No satisfactory implementation was found.

Yes, I found the thread you are referring to: http://mail.python.org/pipermail/python-dev/2008-July/081554.html
However, since A*B*C exists for matrices and actually computes (A*B)*C, why not do the same with dot? I.e. why not decide that dot(A,B,C) does what A*B*C would do, i.e., dot(dot(A,B),C)? The performance and precision problems are the responsibility of the user, just as with the formula A*B*C. == Olivier

On Sun, Jun 7, 2009 at 04:44, Olivier Verdier zelbier@gmail.com wrote:
Yes, I found the thread you are referring to: http://mail.python.org/pipermail/python-dev/2008-July/081554.html However, since A*B*C exists for matrices and actually computes (A*B)*C, why not do the same with dot? I.e. why not decide that dot(A,B,C) does what A*B*C would do, i.e., dot(dot(A,B),C)? The performance and precision problems are the responsibility of the user, just as with the formula A*B*C.
I'm happy to make the user responsible for performance and precision problems if he has the tools to handle them. The operator gives the user the easy ability to decide the precedence with parentheses. The function does not.

Robert Kern wrote:
On Sun, Jun 7, 2009 at 04:44, Olivier Verdier zelbier@gmail.com wrote:
Yes, I found the thread you are referring to: http://mail.python.org/pipermail/python-dev/2008-July/081554.html However, since A*B*C exists for matrices and actually computes (A*B)*C, why not do the same with dot? I.e. why not decide that dot(A,B,C) does what A*B*C would do, i.e., dot(dot(A,B),C)? The performance and precision problems are the responsibility of the user, just as with the formula A*B*C.
I'm happy to make the user responsible for performance and precision problems if he has the tools to handle them. The operator gives the user the easy ability to decide the precedence with parentheses. The function does not.
The function could, with suitable parsing of the argument(s):
(A*B)*C => dot( ((A,B),C) )  or  dot( (A,B), C )
A*(B*C) => dot( (A, (B,C)) )  or  dot( A, (B,C) )
Effectively, the comma is becoming the operator.
Eric
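For what it's worth, a recursive sketch of the nested-tuple idea (the name tupledot is invented here):
import numpy as np

def tupledot(arg):
    # evaluate a nested pair of arrays, so that
    # tupledot(((A, B), C)) == np.dot(np.dot(A, B), C)
    if isinstance(arg, tuple):
        left, right = arg
        return np.dot(tupledot(left), tupledot(right))
    return arg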

On Sun, Jun 7, 2009 at 04:44, Olivier Verdier zelbier@gmail.com wrote:
Yes, I found the thread you are referring to: http://mail.python.org/pipermail/python-dev/2008-July/081554.html However, since A*B*C exists for matrices and actually computes (A*B)*C, why not do the same with dot? I.e. why not decide that dot(A,B,C) does what A*B*C would do, i.e., dot(dot(A,B),C)? The performance and precision problems are the responsibility of the user, just as with the formula A*B*C.
Robert Kern wrote:
I'm happy to make the user responsible for performance and precision problems if he has the tools to handle them. The operator gives the user the easy ability to decide the precedence with parentheses. The function does not.
On 6/7/2009 7:44 PM Eric Firing apparently wrote:
The function could, with suitable parsing of the argument(s):
(A*B)*C => dot( ((A,B),C) )  or  dot( (A,B), C )
A*(B*C) => dot( (A, (B,C)) )  or  dot( A, (B,C) )
Effectively, the comma is becoming the operator.
Horribly implicit and hard to read!
If something needs to be done to make ``dot`` approach the convenience of ``*``, I think that adding ``dot`` as an array method looks attractive.
(A*B)*C => A.dot(B).dot(C)
A*(B*C) => A.dot( B.dot(C) )
But no matter how you slice it, the left hand expression is more compact and easier to read.
fwiw, Alan Isaac
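A sketch of the "dot as a method" spelling via a thin subclass, just to show the chained syntax (DotArray is a made-up name, not an existing NumPy class):
import numpy as np

class DotArray(np.ndarray):
    def dot(self, other):
        # chainable matrix product: A.dot(B).dot(C) groups as dot(dot(A, B), C)
        return np.dot(self, other).view(DotArray)

A = np.arange(6).reshape(2, 3).view(DotArray)
B = np.arange(12).reshape(3, 4).view(DotArray)
C = np.arange(8).reshape(4, 2).view(DotArray)
print(A.dot(B).dot(C).shape)   # (2, 2)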

On Sun, Jun 7, 2009 at 18:44, Eric Firing efiring@hawaii.edu wrote:
Robert Kern wrote:
On Sun, Jun 7, 2009 at 04:44, Olivier Verdier zelbier@gmail.com wrote:
Yes, I found the thread you are referring to: http://mail.python.org/pipermail/python-dev/2008-July/081554.html However, since A*B*C exists for matrices and actually computes (A*B)*C, why not do the same with dot? I.e. why not decide that dot(A,B,C) does what A*B*C would do, i.e., dot(dot(A,B),C)? The performance and precision problems are the responsibility of the user, just as with the formula A*B*C.
I'm happy to make the user responsible for performance and precision problems if he has the tools to handle them. The operator gives the user the easy ability to decide the precedence with parentheses. The function does not.
The function could, with suitable parsing of the argument(s):
(A*B)*C => dot( ((A,B),C) )  or  dot( (A,B), C )
A*(B*C) => dot( (A, (B,C)) )  or  dot( A, (B,C) )
Effectively, the comma is becoming the operator.
Already discussed on the old thread.

Is this lack of associativity really *always* such a huge issue? I can imagine many situations where it is not. One just wants to compute A*B*C, without any particular knowing of whether A*(B*C) or (A*B)*C is best. If the user is allowed to blindly use A*B*C, I don't really see why he wouldn't be allowed to use dot(A,B,C) with the same convention... One should realize that allowing dot(A,B,C) is just *better* than the present situation where the user is forced into writing dot(dot(A,B),C) or dot(A,dot(B,C)). One does not remove any liberty from the user. He may always switch back to one of the above forms if he really knows which is best for him. So I fail to see exactly where the problem is...
== Olivier

Olivier Verdier wrote:
One should realize that allowing dot(A,B,C) is just *better* than the present situation where the user is forced into writing dot(dot(A,B),C) or dot(A,dot(B,C)).
I'm lost now -- how is this better in any significant way?
Tom K. wrote:
But, almost all experienced users drift away from matrix toward array as they find the matrix class too limiting or strange
That's one reason, and the other is that when you are doing real work, it is very rare for the linear algebra portion to be significant. I know in my code (and this was true when I was using MATLAB too), I may have 100 lines of code, and one of them is a linear algebra expression that could be expressed nicely with matrices and infix operators. Given that the rest of the code is more natural with nd-arrays, why the heck would I want to use matrices? this drove me crazy with MATLAB -- I hated the default matrix operators, I was always typing ".*", etc.
- it seems only applicable for new users and pedagogical purposes.
and I'd take the new users off this list -- it serves no one to teach people something first, then tell them to abandon it. Which leaves the pedagogical purposes. In that case, you really need operators; slightly cleaner syntax that isn't infix really doesn't solve the pedagogical function.
It seems there is a small but significant group of folks on this list that want matrices for that reason. That group needs to settle on a solution, and then implement it. Personally, I think the row-vector, column-vector approach is the way to go -- even though, yes, these are matrices that happen to have one dimension or the other set to one, I know that when I was learning LA (and still), I thought about row and column vectors a fair bit, and indexing them in a simple way would be nice. But I don't teach, so I'll stop there.
-Chris

2009/6/8 Christopher Barker Chris.Barker@noaa.gov
Olivier Verdier wrote:
One should realize that allowing dot(A,B,C) is just *better* than the present situation where the user is forced into writing dot(dot(A,B),C) or dot(A,dot(B,C)).
I'm lost now -- how is this better in any significant way?
Well, allowing dot(A,B,C) does not remove any other possibility does it? That is what I meant by "better". It just gives the user an extra possibility. What would be wrong with that? Especially since matrix users already can write A*B*C.
I won't fight for this though. I personally don't care but I think that it would remove the last argument for matrices against arrays, namely the fact that A*B*C is easier to write than dot(dot(A,B),C). I don't understand why it would be a bad idea to implement this dot(A,B,C).
Tom K. wrote:
But, almost all experienced users drift away from matrix toward array as they find the matrix class too limiting or strange
That's one reason, and the other is that when you are doing real work, it is very rare for the linear algebra portion to be significant. I know in my code (and this was true when I was using MATLAB too), I may have 100 lines of code, and one of them is a linear algebra expression that could be expressed nicely with matrices and infix operators. Given that the rest of the code is more natural with nd-arrays, why the heck would I want to use matrices? this drove me crazy with MATLAB -- I hated the default matrix operators, I was always typing ".*", etc.
This exactly agrees with my experience too.
== Olivier

Olivier Verdier wrote:
Well, allowing dot(A,B,C) does not remove any other possibility does it? I won't fight for this though. I personally don't care but I think that it would remove the last argument for matrices against arrays, namely the fact that A*B*C is easier to write than dot(dot(A,B),C).
Well, no. Notation matters to students. Additionally, matrix exponentiation is useful. E.g., A**(N-1) finds the transitive closure of the binary relation represented by the NxN boolean matrix A.
Alan Isaac
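A small illustration of the kind of expression meant here, assuming the relation is stored with ones on the diagonal (i.e. reflexive), so that a single matrix power accumulates all shorter paths:
import numpy as np

# adjacency of the chain 0 -> 1 -> 2, with the diagonal included
A = np.matrix([[1, 1, 0],
               [0, 1, 1],
               [0, 0, 1]])
reach = A ** (A.shape[0] - 1) > 0   # nonzero entries mark reachable pairs
print(reach)
# with a 2-D array, ** would instead square each entry elementwise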

Olivier Verdier-2 wrote:
There would be a much simpler solution than allowing a new operator. Just allow the numpy function dot to take more than two arguments. Then A*B*C in matrix notation would simply be: dot(A,B,C)
with arrays. Wouldn't that make everybody happy? Plus it does not break backward compatibility. Am I missing something?
That wouldn't make me happy because it is not the same syntax as a binary infix operator. Introducing a new operator for matrix multiply (and possibly matrix exponentiation) does not break backward compatibility - how could it, given that the python language does not yet support the new operator?
Going back to Alan Isaac's example:
1) beta = (X.T*X).I * X.T * Y
2) beta = np.dot(np.dot(la.inv(np.dot(X.T,X)),X.T),Y)
With multiple arguments to dot, 2) becomes:
3) beta = np.dot(la.inv(np.dot(X.T, X)), X.T, Y)
This is somewhat better than 2) but not as nice as 1) IMO.
Seeing 1) with @'s would take some getting used to, but I think we would adjust.
For ".I" I would propose that ".I" be added to nd-arrays that inverts each matrix of the last two dimensions, so for example if X is 3D then X.I is the same as np.array([inv(Xi) for Xi in X]). This is also backwards compatible. With this behavior and the one I proposed for @, by adding preceding dimensions we are allowing doing matrix algebra on collections of matrices (although it looks like we might need a new .T that just swaps the last two dimensions to really pull that off). But a ".I" attribute and its behavior needn't be bundled with whatever proposal we wish to make to the python community for a new operator of course.
Regards, Tom K.
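Written out with today's tools, the proposed behaviour for a 3-D X would amount to the following (shapes chosen only for illustration):
import numpy as np
from numpy.linalg import inv

X = np.random.rand(5, 3, 3)                  # a stack of five 3x3 matrices
Xinv = np.array([inv(Xi) for Xi in X])       # what X.I would return under the proposal
print(Xinv.shape)                            # (5, 3, 3)
print(np.allclose([np.dot(a, b) for a, b in zip(X, Xinv)], np.eye(3)))   # True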

There are two solutions to the A*B*C problem that are not quite comparable, and are not mutually exclusive either. 1) allow dot(A,B,C): this would be a great improvement over dot(dot(A,B),C), and it could virtually be done within a day. It is easy to implement, does not require a new syntax, and does not break BC
2) another solution, not incompatible with the first one, is to introduce a new operator in the python language. Even if it were accepted by the python community at large (which is very unlikely, IMHO), be prepared to wait a very long time before it is actually implemented. We are talking about several years.
I think that solution 1) is much more realistic than 2) (and again, they are not mutually exclusive, so implementing 1) does not preclude a future implementation of 2)).
Implementation of 1) would be quite nice when multiplication of several matrices is concerned.
== Olivier

Olivier Verdier-2 wrote:
There are two solutions to the A*B*C problem that are not quite comparable, and are not mutually exclusive either.
1) allow dot(A,B,C): this would be a great improvement over dot(dot(A,B),C), and it could virtually be done within a day. It is easy to implement, does not require a new syntax, and does not break BC
2) another solution, not incompatible with the first one, is to introduce a new operator in the python language. Even if it were accepted by the python community at large (which is very unlikely, IMHO), be prepared to wait a very long time before it is actually implemented. We are talking about several years.
I think that solution 1) is much more realistic than 2) (and again, they are not mutually exclusive, so implementing 1) does not preclude a future implementation of 2)).
Implementation of 1) would be quite nice when multiplication of several matrices is concerned.
I agree these are not mutually exclusive. In other words, they are separate issues. So sure while I would not be *as* happy with a multi-input dot as I would with a new operator that could overload dot, I don't mean to discourage the multi-input dot from being pursued. By all means, it seems like a worthwhile addition if done correctly. I just don't think it solves the problem, that being how do we improve the semantics of numpy to be more "matrix" based.
It is a *requirement* that the package support * (or some binary infix operator) for matrix multiplication. We do this with the matrix type. But, almost all experienced users drift away from matrix toward array as they find the matrix class too limiting or strange - it seems only applicable for new users and pedagogical purposes. My own experience is that having two ways to do things makes software confusing and overly complex. It would be preferable to have a single class that supports matrix multiplication syntax. If that takes a year or two until it "hits the street" as a numpy release, so be it... we've got time. I think your time estimate is correct, although the actual time to implement and test the new python syntax would probably be much shorter - it is not so much a technical issue as one of socialization. What do we want to do? How can we convince the larger python community that this is a good idea for them too?
Cheers, Tom K.

On Sun, Jun 7, 2009 at 07:20, Tom K. tpk@kraussfamily.org wrote:
Olivier Verdier-2 wrote:
There would be a much simpler solution than allowing a new operator. Just allow the numpy function dot to take more than two arguments. Then A*B*C in matrix notation would simply be: dot(A,B,C)
with arrays. Wouldn't that make everybody happy? Plus it does not break backward compatibility. Am I missing something?
That wouldn't make me happy because it is not the same syntax as a binary infix operator. Introducing a new operator for matrix multiply (and possibly matrix exponentiation) does not break backward compatibility - how could it, given that the python language does not yet support the new operator?
Going back to Alan Isaac's example:
1) beta = (X.T*X).I * X.T * Y
2) beta = np.dot(np.dot(la.inv(np.dot(X.T,X)),X.T),Y)
With multiple arguments to dot, 2) becomes:
3) beta = np.dot(la.inv(np.dot(X.T, X)), X.T, Y)
This is somewhat better than 2) but not as nice as 1) IMO.
4) beta = la.lstsq(X, Y)[0]
I really hate that example.
Seeing 1) with @'s would take some getting used but I think we would adjust.
For ".I" I would propose that ".I" be added to nd-arrays that inverts each matrix of the last two dimensions, so for example if X is 3D then X.I is the same as np.array([inv(Xi) for Xi in X]). This is also backwards compatible. With this behavior and the one I proposed for @, by adding preceding dimensions we are allowing doing matrix algebra on collections of matrices (although it looks like we might need a new .T that just swaps the last two dimensions to really pull that off). But a ".I" attribute and its behavior needn't be bundled with whatever proposal we wish to make to the python community for a new operator of course.
I am vehemently against adding .I to ndarray. I want to *discourage* the formation of explicit inverses. It is almost always a very wrong thing to do.
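A short illustration of the alternative being urged here: factor-and-solve instead of forming an explicit inverse (A and b are random, purely for illustration):
import numpy as np

A = np.random.rand(4, 4)
b = np.random.rand(4)

x_inv = np.dot(np.linalg.inv(A), b)    # forms an explicit inverse, then multiplies
x_solve = np.linalg.solve(A, b)        # factorizes A and solves the system directly

print(np.allclose(x_inv, x_solve))     # same result here; solve is cheaper and
                                       # generally better behaved numerically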

Robert Kern-2 wrote:
On Sun, Jun 7, 2009 at 07:20, Tom K. tpk@kraussfamily.org wrote:
Going back to Alan Isaac's example:
1) beta = (X.T*X).I * X.T * Y
... 4) beta = la.lstsq(X, Y)[0]
I really hate that example.
Understood. Maybe propose a different one?
Robert Kern-2 wrote:
Seeing 1) with @'s would take some getting used but I think we would adjust.
For ".I" I would propose that ".I" be added to nd-arrays that inverts each matrix of the last two dimensions, so for example if X is 3D then X.I is the same as np.array([inv(Xi) for Xi in X]). This is also backwards compatible. With this behavior and the one I proposed for @, by adding preceding dimensions we are allowing doing matrix algebra on collections of matrices (although it looks like we might need a new .T that just swaps the last two dimensions to really pull that off). But a ".I" attribute and its behavior needn't be bundled with whatever proposal we wish to make to the python community for a new operator of course.
I am vehemently against adding .I to ndarray. I want to *discourage* the formation of explicit inverses. It is almost always a very wrong thing to do.
You sound like Cleve Moler: always concerned about numeric fidelity. Point taken.
- Tom K.

Going back to Alan Isaac's example:
1) beta = (X.T*X).I * X.T * Y
2) beta = np.dot(np.dot(la.inv(np.dot(X.T,X)),X.T),Y)
Robert Kern wrote:
4) beta = la.lstsq(X, Y)[0]
I really hate that example.
Remember, the example is a **teaching** example. I actually use NumPy in a Master's level math econ course (among other places). As it happens, I do get around to explaining why using an explicit inverse is a bad idea numerically, but that is entirely an aside in a course that is not concerned with numerical methods. It is concerned only with mastering a few basic math tools, and being able to implement some of them in code is largely a check on understanding and precision (and to provide basic background for future applications). Having them use lstsq is counterproductive for the material being covered, at least initially.
A typical course of this type uses Excel or includes no applications at all. So please, show a little gratitude. ;-)
Alan Isaac

On Mon, Jun 8, 2009 at 14:10, Alan G Isaac aisaac@american.edu wrote:
Going back to Alan Isaac's example:
1) beta = (X.T*X).I * X.T * Y
2) beta = np.dot(np.dot(la.inv(np.dot(X.T,X)),X.T),Y)
Robert Kern wrote:
4) beta = la.lstsq(X, Y)[0]
I really hate that example.
Remember, the example is a **teaching** example.
I know. Honestly, I would prefer that teachers skip over the normal equations entirely and move directly to decomposition approaches. If you are going to make them implement least-squares from more basic tools, I think it's more enlightening as a student to start with the SVD than the normal equations.
I actually use NumPy in a Master's level math econ course (among other places). As it happens, I do get around to explaining why using an explicit inverse is a bad idea numerically, but that is entirely an aside in a course that is not concerned with numerical methods. It is concerned only with mastering a few basic math tools, and being able to implement some of them in code is largely a check on understanding and precision (and to provide basic background for future applications). Having them use lstsq is counterproductive for the material being covered, at least initially.
A typical course of this type uses Excel or includes no applications at all. So please, show a little gratitude. ;-)
If it's not a class where they are going to use what they learn in the future to write numerical programs, I really don't care whether you teach it with numpy or not.
If it *is* such a class, then I would prefer that the students get taught the right way to write numerical programs.

On Mon, Jun 8, 2009 at 3:33 PM, Robert Kern robert.kern@gmail.com wrote:
On Mon, Jun 8, 2009 at 14:10, Alan G Isaac aisaac@american.edu wrote:
Going back to Alan Isaac's example:
1) beta = (X.T*X).I * X.T * Y
2) beta = np.dot(np.dot(la.inv(np.dot(X.T,X)),X.T),Y)
Robert Kern wrote:
4) beta = la.lstsq(X, Y)[0]
I really hate that example.
Remember, the example is a **teaching** example.
I know. Honestly, I would prefer that teachers skip over the normal equations entirely and move directly to decomposition approaches. If you are going to make them implement least-squares from more basic tools, I think it's more enlightening as a student to start with the SVD than the normal equations.
I actually use NumPy in a Master's level math econ course (among other places). As it happens, I do get around to explaining why using an explicit inverse is a bad idea numerically, but that is entirely an aside in a course that is not concerned with numerical methods. It is concerned only with mastering a few basic math tools, and being able to implement some of them in code is largely a check on understanding and precision (and to provide basic background for future applications). Having them use lstsq is counterproductive for the material being covered, at least initially.
A typical course of this type uses Excel or includes no applications at all. So please, show a little gratitude. ;-)
If it's not a class where they are going to use what they learn in the future to write numerical programs, I really don't care whether you teach it with numpy or not.
If it *is* such a class, then I would prefer that the students get taught the right way to write numerical programs.
I started in such a class (with Dr. Isaac as a matter of fact). I found the use of Python with Numpy to be very enlightening for the basic concepts of linear algebra. I appreciated the simple syntax of matrices at the time as a gentler learning curve since my background in programming was mainly at a hobbyist level.
I then went on to take a few econometrics courses where we learned the normal equations. Now, a few years later, I am working on scipy.stats as a Google Summer of Code project, and I am learning why an SVD decomposition is much more efficient (an economist never necessarily *needs* to know what's under the hood of their stats package). The intuition for the numerical methods was in place, as well as the basic familiarity with numpy/scipy. So I would not discount this approach too much. People get what they want out of anything, and I was happy to learn about Python and Numpy/Scipy as alternatives to proprietary packages. And I hope my work this summer can contribute even a little to making the project an accessible alternative for researchers without a strong technical background.
Skipper

jseabold wrote:
On Mon, Jun 8, 2009 at 3:33 PM, Robert Kern robert.kern@gmail.com wrote:
On Mon, Jun 8, 2009 at 14:10, Alan G Isaac aisaac@american.edu wrote:
Going back to Alan Isaac's example:
- beta = (X.T*X).I * X.T * Y
- beta = np.dot(np.dot(la.inv(np.dot(X.T,X)),X.T),Y)
Robert Kern wrote:
- beta = la.lstsq(X, Y)[0]
I really hate that example.
I propose the following alternative for discussion:
U, s, Vh = np.linalg.svd(X, full_matrices=False)
1) beta = Vh.T*np.asmatrix(np.diag(1/s))*U.T*Y
2) beta = np.dot(Vh.T, np.dot(np.diag(1/s), np.dot(U.T, Y)))
1) is with X and Y starting out as matrices, 2) is with arrays.
Even diag results in an array that has to be matricized. Sigh.
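For concreteness, option 2 runs as-is once X and Y exist; a self-contained sketch with made-up data:

import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(50, 3)
Y = rng.randn(50)

U, s, Vh = np.linalg.svd(X, full_matrices=False)
beta = np.dot(Vh.T, np.dot(np.diag(1.0 / s), np.dot(U.T, Y)))
print(np.allclose(beta, np.linalg.lstsq(X, Y)[0]))   # agrees with the library solver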

On Sun, Jun 14, 2009 at 10:20 AM, Tom K. tpk@kraussfamily.org wrote:
I propose the following alternative for discussion:
U, s, Vh = np.linalg.svd(X, full_matrices=False)
1) beta = Vh.T*np.asmatrix(np.diag(1/s))*U.T*Y
2) beta = np.dot(Vh.T, np.dot(np.diag(1/s), np.dot(U.T, Y)))
1) is with X and Y starting out as matrices, 2) is with arrays.
Even diag results in an array that has to be matricized. Sigh.
The problem is that I left the diagonal returned by svd as an array rather than a matrix for backward compatibility. Diag returns the diagonal when presented with a 2d matrix (array). I went back and forth on that, but not doing so would have required everyone to change their code to use diagflat instead of diag. I do note that diag doesn't preserve matrices and that looks like a bug to me.
This is also an argument against over-overloaded functions such as diag. Such functions are one of the blemishes of MatLab, IMHO. OTOH, there is something of a shortage of short, meaningful names
Chuck
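A quick illustration of the diag/diagflat split being described (arrays only; the non-preservation of the matrix subclass is as reported above):

import numpy as np

s = np.array([1.0, 2.0, 3.0])
np.diag(s)            # 1-D input: builds a 3x3 array with s on the diagonal
A = np.arange(9.0).reshape(3, 3)
np.diag(A)            # 2-D input: extracts the diagonal, array([0., 4., 8.])
np.diagflat(A)        # always builds a 2-D array from the flattened input (9x9 here)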

2009/6/8 Robert Kern robert.kern@gmail.com:
Remember, the example is a **teaching** example.
I know. Honestly, I would prefer that teachers skip over the normal equations entirely and move directly to decomposition approaches. If you are going to make them implement least-squares from more basic tools, I think it's more enlightening as a student to start with the SVD than the normal equations.
I agree, and I wish our curriculum followed that route. In linear algebra, I also don't much like the way eigenvalues are taught, where students have to solve characteristic polynomials by hand. When I teach the subject again, I'll pay more attention to these books:
Numerical linear algebra by Lloyd Trefethen http://books.google.co.za/books?id=bj-Lu6zjWbEC
(e.g. has SVD in Lecture 4)
Applied Numerical Linear Algebra by James Demmel http://books.google.co.za/books?id=lr8cFi-YWnIC
(e.g. has perturbation theory on page 4)
Regards Stéfan

2009/6/8 Stéfan van der Walt stefan@sun.ac.za:
2009/6/8 Robert Kern robert.kern@gmail.com:
Remember, the example is a **teaching** example.
I know. Honestly, I would prefer that teachers skip over the normal equations entirely and move directly to decomposition approaches. If you are going to make them implement least-squares from more basic tools, I think it's more enlightening as a student to start with the SVD than the normal equations.
I agree, and I wish our curriculum followed that route. In linear algebra, I also don't much like the way eigenvalues are taught, where students have to solve characteristic polynomials by hand.
Ok, I also have to give my 2 cents
Any basic econometrics textbook warns of multicollinearity. Since economists are mostly interested in the parameter estimates, the covariance matrix needs to have little multicollinearity; otherwise the standard errors of the parameters will be huge.
If I automatically use pinv or lstsq, then, unless I look at the condition number and singularities, I get estimates that look pretty nice, even though they involve an "arbitrary" choice for the indeterminacy.
So in economics, I never worried too much about the numerical precision of the inverse, because if the correlation matrix is close to singular, the model is misspecified, needs reparameterization, or the data is useless for the question.
Compared to endogeneity bias, for example, or homoscedasticity assumptions and so on, the numerical problem is pretty small.
This doesn't mean matrix decomposition methods are not useful for numerical calculations and efficiency, but I don't think the numerical problem deserves a lot of emphasis in a basic econometrics class.
Josef
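A small sketch of the point about near-collinear regressors; the data and variable names here are made up for illustration:

import numpy as np

rng = np.random.RandomState(0)
x1 = rng.randn(100)
x2 = x1 + 1e-6 * rng.randn(100)                  # almost an exact copy of x1
X = np.column_stack([np.ones(100), x1, x2])
y = 1.0 + x1 + x2 + 0.1 * rng.randn(100)

print(np.linalg.cond(X))                         # huge condition number flags the multicollinearity
beta = np.linalg.lstsq(X, y)[0]                  # still returns a nice-looking answer
print(beta)                                      # but the individual x1/x2 coefficients are poorly determined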

On Mon, Jun 8, 2009 at 15:21, josef.pktd@gmail.com wrote:
This doesn't mean matrix decomposition methods are not useful for numerical calculations and efficiency, but I don't think the numerical problem deserves a lot of emphasis in a basic econometrics class.
Actually, my point is a bit broader. Numerics aside, if you are going to bother peeking under the hood of least-squares at all, I think the student gets a better understanding of least-squares via one of the decomposition methods rather than the normal equations.

On 6/6/2009 2:03 PM Charles R Harris apparently wrote:
So is eye(3)*(v.T*v) valid? If (v.T*v) is 1x1 you have incompatible dimensions for the multiplication
Exactly. So it is not valid. As you point out, to make it valid implies a loss of the associativity of matrix multiplication. Not a good idea!
Cheers, Alan
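For concreteness, a quick check of what numpy.matrix actually does with that product (toy numbers):

import numpy as np

v = np.matrix([[1.0], [2.0], [3.0]])
q = v.T * v                        # 1x1 matrix, not a scalar
I3 = np.matrix(np.eye(3))
try:
    I3 * q                         # matrix product of (3,3) with (1,1)
except ValueError as e:
    print("invalid:", e)           # shapes are not aligned, as argued above
print(float(q) * np.eye(3))        # explicitly extracting the scalar is fine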

I took that very seriously when you said that matrices were important to you. Far from me the idea of forbidding numpy users to use matrices. My point was the fact that newcomers are confused by the presence of both matrices and arrays. I think that there should be only one matrix/vector/tensor object in numpy. Therefore I would advocate the removal of matrices from numpy.
*But* why not have matrices in a different component? Maybe as a part of scipy? or somewhere else? You would be more than welcome to use them anywhere. Note that I use components outside numpy for my teaching (scipy, sympy, mayavi, nosetest) and I don't have any problems with that.
With my "argument" I endeavoured to explain the potential complications of using matrices instead of arrays when teaching. Perhaps the strongest argument against matrices is that you cannot use vectors. I've taught enough matlab courses to realise the pain that this represents for students. But I realise also that somebody else would have a different experience.
Of course x.T*y should be a 1x1 matrix, this is not an anomaly, but it is confusing for students, because they expect a scalar. That is why I prefer to teach with dot. Then the relation matrix/vector/scalar is crystal clear.
== Olivier
2009/6/5 Alan G Isaac aisaac@american.edu
On 6/5/2009 11:38 AM Olivier Verdier apparently wrote:
I think matrices can be pretty tricky when used for teaching. For instance, you have to explain that all the operators work component-wise, except the multiplication! Another caveat is that since matrices are always 2-D, the "scalar product" of two column vectors computed as x.T * y will not be a scalar, but a 1x1 matrix. There is also the fact that you must cast all your vectors to column/row matrices (as in matlab). For all these reasons, I prefer to use arrays and dot for teaching, and I have never had any complaints.
I do not understand this "argument". You should take it very seriously when someone reports to you that the matrix object is crucial to them, e.g., as a teaching tool. Even if you do not personally find persuasive an example like http://mail.scipy.org/pipermail/numpy-discussion/2009-June/043001.html I have told you: this is important for my students. Reporting that your students do not complain about using arrays instead of matrices does not change this one bit.
Student backgrounds differ by domain of application. In economics, matrices are in *very* wide use, and multidimensional arrays get almost no use. Textbooks in econometrics (a huge and important field, even outside of economics) are full of proofs using matrix algebra. A close match to what the students see is crucial. When working with multiplication or exponentiation, matrices do what they expect, and 2d arrays do not.
One more point. As Python users we get used to installing a package here and a package there to add functionality. But this is not how most people looking for a matrix language see the world. Removing the matrix object from NumPy will raise the barrier to adoption by social scientists, and there should be a strongly persuasive reason before taking such a step.
Separately from all that, does anyone doubt that there is code that depends on the matrix object? The core objection to a past proposal for useful change was that it could break extant code. I would hope that nobody who took that position would subsequently propose removing the matrix object altogether.
Cheers, Alan Isaac
PS If x and y are "column vectors" (i.e., matrices), then x.T * y *should* be a 1×1 matrix. Since the * operator is doing matrix multiplication, this is the correct result, not an anomaly.
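A side-by-side of the two conventions being discussed, with the same made-up numbers:

import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
print(np.dot(x, y))                # 32.0 -- a plain scalar, the "teach with dot" route

xm = np.matrix(x).T                # 3x1 column matrix
ym = np.matrix(y).T
print(xm.T * ym)                   # matrix([[ 32.]]) -- a 1x1 matrix, as in the PS above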

On Sat, Jun 6, 2009 at 12:34 PM, Olivier Verdier zelbier@gmail.com wrote:
Of course x.T*y should be a 1x1 matrix, this is not an anomaly, but it is confusing for students, because they expect a scalar. That is why I prefer to teach with dot. Then the relation matrix/vector/scalar is crystal clear.
How about the common expression
exp((v.T*A*v)/2)
do you expect a matrix exponential here? Or should the students write
exp(<v, A*v>/2)
where <...> is the inner product?
Chuck
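Both spellings can be checked directly; since np.exp is elementwise, neither one is a matrix exponential (toy numbers):

import numpy as np

A = np.array([[2.0, 0.5], [0.5, 1.0]])
v = np.array([1.0, 2.0])
print(np.exp(np.dot(v, np.dot(A, v)) / 2.0))   # scalar: exp of the quadratic form

vm = np.matrix(v).T
Am = np.matrix(A)
print(np.exp((vm.T * Am * vm) / 2.0))          # 1x1 matrix with the same value (elementwise exp)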

On 6/6/2009 2:58 PM Charles R Harris apparently wrote:
How about the common expression exp((v.t*A*v)/2) do you expect a matrix exponential here?
I take your point that there are conveniences to treating a 1 by 1 matrix as a scalar. Most matrix programming languages do this, I think. For sure GAUSS does. The result of x' * A * x is a "matrix" (it has one row and one column) but it functions like a scalar (and even more, since right multiplication by it is also allowed).
While I think this is "wrong", especially in a language that readily distinguishes scalars and matrices, I recognize that many others have found the behavior useful. And I confess that when I talk about quadratic forms, I do treat x.T * A * x as if it were scalar.
But just to be clear, how are you proposing to implement that behavior, if you are?
Alan

On Sat, Jun 6, 2009 at 14:59, Alan G Isaac aisaac@american.edu wrote:
On 6/6/2009 2:58 PM Charles R Harris apparently wrote:
How about the common expression exp((v.t*A*v)/2) do you expect a matrix exponential here?
I take your point that there are conveniences to treating a 1 by 1 matrix as a scalar. Most matrix programming languages do this, I think. For sure GAUSS does. The result of x' * A * x is a "matrix" (it has one row and one column) but it functions like a scalar (and even more, since right multiplication by it is also allowed).
While I think this is "wrong", especially in a language that readily distinguishes scalars and matrices, I recognize that many others have found the behavior useful. And I confess that when I talk about quadratic forms, I do treat x.T * A * x as if it were scalar.
The old idea of introducing RowVector and ColumnVector would help here. If x were a ColumnVector and A a Matrix, then you can introduce the following rules:
x.T is a RowVector
RowVector * ColumnVector is a scalar
RowVector * Matrix is a RowVector
Matrix * ColumnVector is a ColumnVector
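A purely hypothetical sketch of those rules; no such classes exist in NumPy, this only illustrates the proposal:

import numpy as np

class ColumnVector(object):
    def __init__(self, data):
        self.a = np.asarray(data, dtype=float).ravel()
    @property
    def T(self):                                     # x.T is a RowVector
        return RowVector(self.a)

class RowVector(object):
    def __init__(self, data):
        self.a = np.asarray(data, dtype=float).ravel()
    def __mul__(self, other):
        if isinstance(other, ColumnVector):          # RowVector * ColumnVector -> scalar
            return float(np.dot(self.a, other.a))
        if isinstance(other, Matrix):                # RowVector * Matrix -> RowVector
            return RowVector(np.dot(self.a, other.a))
        return NotImplemented

class Matrix(object):
    def __init__(self, data):
        self.a = np.atleast_2d(np.asarray(data, dtype=float))
    def __mul__(self, other):
        if isinstance(other, ColumnVector):          # Matrix * ColumnVector -> ColumnVector
            return ColumnVector(np.dot(self.a, other.a))
        return NotImplemented

x = ColumnVector([1.0, 2.0])
A = Matrix([[2.0, 0.0], [0.0, 3.0]])
print(x.T * A * x)                                   # 14.0, a true scalar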

On Sat, Jun 6, 2009 at 2:30 PM, Robert Kern robert.kern@gmail.com wrote:
On Sat, Jun 6, 2009 at 14:59, Alan G Isaac aisaac@american.edu wrote:
On 6/6/2009 2:58 PM Charles R Harris apparently wrote:
How about the common expression exp((v.t*A*v)/2) do you expect a matrix exponential here?
I take your point that there are conveniences to treating a 1 by 1 matrix as a scalar. Most matrix programming languages do this, I think. For sure GAUSS does. The result of x' * A * x is a "matrix" (it has one row and one column) but it functions like a scalar (and even more, since right multiplication by it is also allowed).
While I think this is "wrong", especially in a language that readily distinguishes scalars and matrices, I recognize that many others have found the behavior useful. And I confess that when I talk about quadratic forms, I do treat x.T * A * x as if it were scalar.
The old idea of introducing RowVector and ColumnVector would help here. If x were a ColumnVector and A a Matrix, then you can introduce the following rules:
x.T is a RowVector
RowVector * ColumnVector is a scalar
RowVector * Matrix is a RowVector
Matrix * ColumnVector is a ColumnVector
Yes, that is another good solution. In tensor notation, RowVectors have signature r_i, ColumnVectors c^i, and matrices M^i_j. The '*' operator is then a contraction on adjacent indices, a result with no indices is a scalar, and the only problem that remains is the tensor product usually achieved by x*y.T. But making the exception that col * row is the tensor product producing a matrix would solve that and still raise an error for such things as col*row*row. Or we could simply require something like bivector(x,y)
Chuck
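The contraction view can be spelled out with existing array functions; a small sketch with made-up values:

import numpy as np

A = np.arange(6.0).reshape(2, 3)
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0])

print(np.tensordot(A, x, axes=(1, 0)))   # contraction on adjacent indices: same as np.dot(A, x)
print(np.tensordot(y, A, axes=(0, 0)))   # y contracted with A's first index -> shape (3,)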

On 6/6/2009 4:30 PM Robert Kern apparently wrote:
The old idea of introducing RowVector and ColumnVector would help here. If x were a ColumnVector and A a Matrix, then you can introduce the following rules:
x.T is a RowVector
RowVector * ColumnVector is a scalar
RowVector * Matrix is a RowVector
Matrix * ColumnVector is a ColumnVector
To me, a "row vector" is just a matrix with a single row, and a "column vector" is just a matrix with a single column. Calling them "vectors" is rather redundant, since matrices are also vectors (i.e., belong to a vector space).
I think the core of the row-vector/column-vector proposal is really the idea that we could have 1d objects that also have an "orientation" for the purposes of certain operations. But then why not just use matrices, which automatically provide that "orientation"?
Here are the 3 reasons I see:
- to allow iteration over matrices to produce a less surprising result (*if* you find it surprising that a matrix is a container of matrices, as I do)
- to allow 1d indexing of these "vectors"
- to introduce a scalar product
I rather doubt (?) that these justify the added complexity of an additional array subclass.
Alan

On Sat, Jun 6, 2009 at 16:00, Alan G Isaac aisaac@american.edu wrote:
On 6/6/2009 4:30 PM Robert Kern apparently wrote:
The old idea of introducing RowVector and ColumnVector would help here. If x were a ColumnVector and A a Matrix, then you can introduce the following rules:
x.T is a RowVector
RowVector * ColumnVector is a scalar
RowVector * Matrix is a RowVector
Matrix * ColumnVector is a ColumnVector
To me, a "row vector" is just a matrix with a single row, and a "column vector" is just a matrix with a single column. Calling them "vectors" is rather redundant, since matrices are also vectors (i.e., belong to a vector space).
I think the core of the row-vector/column-vector proposal is really the idea that we could have 1d objects that also have an "orientation" for the purposes of certain operations. But then why not just use matrices, which automatically provide that "orientation"?
Because (x.T * x) where x is an (n,1) matrix and * is matrix multiplication (i.e. MM(n,1) -> MM(1,1)) is not the same thing as the inner product of a vector (RR^n -> RR). Please see the post I was responding to for the motivation.

On Sat, Jun 6, 2009 at 3:00 PM, Alan G Isaac aisaac@american.edu wrote:
On 6/6/2009 4:30 PM Robert Kern apparently wrote:
The old idea of introducing RowVector and ColumnVector would help here. If x were a ColumnVector and A a Matrix, then you can introduce the following rules:
x.T is a RowVector
RowVector * ColumnVector is a scalar
RowVector * Matrix is a RowVector
Matrix * ColumnVector is a ColumnVector
To me, a "row vector" is just a matrix with a single row, and a "column vector" is just a matrix with a single column. Calling them "vectors" is rather redundant, since matrices are also vectors (i.e., belong to a vector space).
Well, yes, linear mappings between vector spaces are also vector spaces, but it is useful to make the distinction. Likewise, L(x,L(y,z)) is a multilinear map that factors through the tensor product of x, y, z. So on and so forth. At some point all these constructions are useful. But I think it is pernicious for a first course in matrix algebra to not distinguish between matrices and vectors. The abstraction to general linear spaces can come later.
I think the core of the row-vector/column-vector proposal is really the idea that we could have 1d objects that also have an "orientation" for the purposes of certain operations. But then why not just use matrices, which automatically provide that "orientation"?
Because at some point you want scalars. In matrix algebra matrices are generally considered maps between vector spaces. Covariance matrices don't fit that paradigm, but that is skimmed over. It's kind of a mess, actually.
Chuck

On Sat, Jun 6, 2009 at 1:59 PM, Alan G Isaac aisaac@american.edu wrote:
On 6/6/2009 2:58 PM Charles R Harris apparently wrote:
How about the common expression exp((v.t*A*v)/2) do you expect a matrix exponential here?
I take your point that there are conveniences to treating a 1 by 1 matrix as a scalar. Most matrix programming languages do this, I think. For sure GAUSS does. The result of x' * A * x is a "matrix" (it has one row and one column) but it functions like a scalar (and even more, since right multiplication by it is also allowed).
It's actually an inner product and the notation <x, A*x> would be technically correct. More generally, it is a bilinear function of two vectors. But the correct notation is a bit cumbersome for a student struggling with plain old matrices ;)
Ndarrays are actually closer to the tensor ideal in that M*v would be a contraction removing two indices from a three index tensor product. The "dot" function, aka *, then functions as a contraction. In this case x.T*A*x works just fine because A*x is 1D and x.T=x, so the final result is a scalar (0D array). So making vectors 1D arrays would solve some problems. There remains the construction v*v.T, which should really be treated as a tensor product, or bivector.
Chuck
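With plain 1-D arrays the two constructions mentioned look like this (toy numbers):

import numpy as np

v = np.array([1.0, 2.0, 3.0])
A = np.eye(3)
print(np.dot(v, np.dot(A, v)))   # 14.0 -- x.T*A*x with 1-D arrays is just a scalar (v.T is v)
print(np.outer(v, v))            # the 3x3 outer/tensor product that v*v.T is meant to produce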

On Sat, Jun 6, 2009 at 1:59 PM, Alan G Isaac aisaac@american.edu wrote:
For sure GAUSS does. The result of x' * A * x is a "matrix" (it has one row and one column) but it functions like a scalar (and even more, since right multiplication by it is also allowed).
On 6/6/2009 4:32 PM Charles R Harris apparently wrote:
It's actually an inner product
Sorry for the confusion: I was just reporting how GAUSS treats the expression x' * A * x.
Alan
participants (16)
- Alan G Isaac
- Charles R Harris
- Christopher Barker
- David Warde-Farley
- Eric Firing
- Fernando Perez
- Geoffrey Ely
- Jason Rennie
- josef.pktd@gmail.com
- Keith Goodman
- Olivier Verdier
- Robert Kern
- Skipper Seabold
- Stéfan van der Walt
- Tom K.
- Tommy Grav