Confused by spec of numpy.linalg.solve
Versions:
sys.version '3.3.2 (default, Mar 5 2014, 08:21:05) \n[GCC 4.8.2 20131212 (Red Hat 4.8.2-7)]'
numpy.__version__ '1.8.0'
Problem: I'm trying to unpick the shape requirements of numpy.linalg.solve(). The help text says: solve(a, b) - a : (..., M, M) array_like Coefficient matrix. b : {(..., M,), (..., M, K)}, array_like Ordinate or "dependent variable" values. It's the requirements on "b" that are giving me grief. My read of the help text is that "b" must have a shape with either its final axis or its penultimate axis equal to M in size. Which axis the matrix contraction is along depends on the size of the final axis of "b". So, according to my reading, if "b" has shape (6,3) then the first choice, "(..., M,)", is invoked but if "a" has shape (3,3) and "b" has shape (3,6) then the second choice, "(..., M, K)", is invoked. I find this weird, but I've dealt with (much) weirder. However, this is not what I see. When "b" has shape (3,6) everything goes as expected. When "b" has shape (6,3) I get an error message that 6 is not equal to 3:
ValueError: solve: Operand 1 has a mismatch in its core dimension 0, with gufunc signature (m,m),(m,n)->(m,n) (size 6 is different from 3)
Obviously my reading is incorrect. Can somebody elucidate for me exactly what the requirements are on the shape of "b"? Example code: import numpy import numpy.linalg # Works. M = numpy.array([ [1.0, 1.0/2.0, 1.0/3.0], [1.0/2.0, 1.0/3.0, 1.0/4.0], [1.0/3.0, 1.0/4.0, 1.0/5.0] ] ) yy1 = numpy.array([ [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0] ]) print(yy1.shape) xx1 = numpy.linalg.solve(M, yy1) print(xx1) # Works too. yy2 = numpy.array([ [1.0, 0.0, 0.0, 1.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 1.0, 0.0, 0.0, 1.0] ]) print(yy2.shape) xx2 = numpy.linalg.solve(M, yy2) print(xx2) # Fails. yy3 = numpy.array([ [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0] ]) print(yy3.shape) xx3 = numpy.linalg.solve(M, yy3) print(xx3)
On Di, 2014-04-01 at 15:31 +0100, Bob Dowling wrote:
Versions:
sys.version '3.3.2 (default, Mar 5 2014, 08:21:05) \n[GCC 4.8.2 20131212 (Red Hat 4.8.2-7)]'
numpy.__version__ '1.8.0'
Problem:
I'm trying to unpick the shape requirements of numpy.linalg.solve(). The help text says:
solve(a, b) - a : (..., M, M) array_like Coefficient matrix. b : {(..., M,), (..., M, K)}, array_like Ordinate or "dependent variable" values.
It's the requirements on "b" that are giving me grief. My read of the help text is that "b" must have a shape with either its final axis or its penultimate axis equal to M in size. Which axis the matrix contraction is along depends on the size of the final axis of "b".
So, according to my reading, if "b" has shape (6,3) then the first choice, "(..., M,)", is invoked but if "a" has shape (3,3) and "b" has shape (3,6) then the second choice, "(..., M, K)", is invoked. I find this weird, but I've dealt with (much) weirder.
I bet the documentation needs some more info there (if you have time, please write a pull request). If you look at the code (that part is just python code), you will see what really happens. If `a` has exactly one dimension more then `b`, the first case is used. Otherwise (..., M, K) is used instead. To make sure you always get the expected result, it may be best to make sure that the number of broadcasting (...) dimensions of `a` and `b` are identical (I am not sure if you expect this to be the case or not). The shape itself does not matter, only the (relative) number of dimensions does for the decision which of the two signatures is used. In other words, since you do not use `...` your examples always use the (M, K) logic. - Sebastian
However, this is not what I see. When "b" has shape (3,6) everything goes as expected. When "b" has shape (6,3) I get an error message that 6 is not equal to 3:
ValueError: solve: Operand 1 has a mismatch in its core dimension 0, with gufunc signature (m,m),(m,n)->(m,n) (size 6 is different from 3)
Obviously my reading is incorrect. Can somebody elucidate for me exactly what the requirements are on the shape of "b"?
Example code:
import numpy import numpy.linalg
# Works. M = numpy.array([ [1.0, 1.0/2.0, 1.0/3.0], [1.0/2.0, 1.0/3.0, 1.0/4.0], [1.0/3.0, 1.0/4.0, 1.0/5.0] ] )
yy1 = numpy.array([ [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0] ]) print(yy1.shape) xx1 = numpy.linalg.solve(M, yy1) print(xx1)
# Works too. yy2 = numpy.array([ [1.0, 0.0, 0.0, 1.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 1.0, 0.0, 0.0, 1.0] ]) print(yy2.shape) xx2 = numpy.linalg.solve(M, yy2) print(xx2)
# Fails. yy3 = numpy.array([ [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0] ]) print(yy3.shape) xx3 = numpy.linalg.solve(M, yy3) print(xx3)
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
On Tue, Apr 1, 2014 at 3:57 PM, Sebastian Berg <sebastian@sipsolutions.net> wrote:
If `a` has exactly one dimension more then `b`, the first case is used. Otherwise (..., M, K) is used instead. To make sure you always get the expected result, it may be best to make sure that the number of broadcasting (...) dimensions of `a` and `b` are identical (I am not sure if you expect this to be the case or not). The shape itself does not matter, only the (relative) number of dimensions does for the decision which of the two signatures is used.
Oh, really? This seems really unfortunate -- AFAICT it makes it impossible to write a generic broadcasting matrix-solve or vector-solve :-/ (except by explicitly checking shapes and prepending ones by hand, more or less doing the broadcasting manually). Surely it would be better to use PEP 467 style broadcasting, where the only special case is if `b` has exactly 1 dimension? -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org
On 04/01/2014 04:25 PM, Nathaniel Smith wrote:
On Tue, Apr 1, 2014 at 3:57 PM, Sebastian Berg <sebastian@sipsolutions.net> wrote:
If `a` has exactly one dimension more then `b`, the first case is used. Otherwise (..., M, K) is used instead. To make sure you always get the expected result, it may be best to make sure that the number of broadcasting (...) dimensions of `a` and `b` are identical (I am not sure if you expect this to be the case or not). The shape itself does not matter, only the (relative) number of dimensions does for the decision which of the two signatures is used. Oh, really? This seems really unfortunate
It also seems quite counter-intuitive. It means that an array "a" of shape (3,3) will behave radically differently to one of shape (1,3,3). But thank you for the explanation.
On Di, 2014-04-01 at 16:25 +0100, Nathaniel Smith wrote:
On Tue, Apr 1, 2014 at 3:57 PM, Sebastian Berg <sebastian@sipsolutions.net> wrote:
If `a` has exactly one dimension more then `b`, the first case is used. Otherwise (..., M, K) is used instead. To make sure you always get the expected result, it may be best to make sure that the number of broadcasting (...) dimensions of `a` and `b` are identical (I am not sure if you expect this to be the case or not). The shape itself does not matter, only the (relative) number of dimensions does for the decision which of the two signatures is used.
Since b is a system of equations if it is 2-dim, I think it basically doesn't make sense to have a (M, K) shaped b anyway, since you could use a (K, M) shaped b with broadcasting logic (though I guess that is slower unless you add extra logic). - Sebastian
Oh, really? This seems really unfortunate -- AFAICT it makes it impossible to write a generic broadcasting matrix-solve or vector-solve :-/ (except by explicitly checking shapes and prepending ones by hand, more or less doing the broadcasting manually). Surely it would be better to use PEP 467 style broadcasting, where the only special case is if `b` has exactly 1 dimension?
-n
On Tue, Apr 1, 2014 at 9:50 PM, Sebastian Berg <sebastian@sipsolutions.net> wrote:
On Di, 2014-04-01 at 16:25 +0100, Nathaniel Smith wrote:
On Tue, Apr 1, 2014 at 3:57 PM, Sebastian Berg <sebastian@sipsolutions.net> wrote:
If `a` has exactly one dimension more then `b`, the first case is used. Otherwise (..., M, K) is used instead. To make sure you always get the expected result, it may be best to make sure that the number of broadcasting (...) dimensions of `a` and `b` are identical (I am not sure if you expect this to be the case or not). The shape itself does not matter, only the (relative) number of dimensions does for the decision which of the two signatures is used.
Since b is a system of equations if it is 2-dim, I think it basically doesn't make sense to have a (M, K) shaped b anyway, since you could use a (K, M) shaped b with broadcasting logic (though I guess that is slower unless you add extra logic).
Not sure I'm following your point exactly, but the argument for having (M, M) `a` and (M, K) `b` is that solve(a, b) is the same as dot(inv(a), b), which obviously accepts 2d `a` and `b`... -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org
participants (3)
-
Bob Dowling -
Nathaniel Smith -
Sebastian Berg