matmul needs some clarification.
Hi All,

The problem arises when multiplying a stack of matrices times a vector. PEP 465 defines this as appending a '1' to the dimensions of the vector, doing the defined stacked matrix multiply, and then removing the last dimension from the result. Note that in the middle step we have a stack of matrices, and after removing the last dimension we will still have a stack of matrices. What we want is a stack of vectors, but we can't have those with our conventions. This makes the result somewhat unexpected. How should we resolve this?

Chuck
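A minimal NumPy sketch of the shape behavior described above (the concrete shapes are illustrative, not from the original message):

```python
import numpy as np

# A stack of two 3x4 matrices and a length-4 vector.
a = np.ones((2, 3, 4))
b = np.ones(4)

# PEP 465 semantics: b is treated as shape (4, 1), the stacked
# multiply yields (2, 3, 1), and the appended dimension is removed.
result = a @ b
print(result.shape)  # (2, 3)
```

The (2, 3) result is intended as a stack of two length-3 vectors, but nothing in its shape distinguishes it from a single 2x3 matrix, which is the ambiguity under discussion.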
On Sat, May 30, 2015 at 6:23 PM, Charles R Harris wrote:

> The problem arises when multiplying a stack of matrices times a vector. PEP 465 defines this as appending a '1' to the dimensions of the vector and doing the defined stacked matrix multiply, then removing the last dimension from the result. [...] What we want is a stack of vectors, but we can't have those with our conventions. How should we resolve this?

I think that before tackling the @ operator, we should implement the pure dot-of-stacks-of-matrices and dot-of-stacks-of-vectors generalized ufuncs. The first will have a 2d "core" and the second a 1d core. Let's tentatively call them matmul and vecmul. Hopefully the matrix-vector product can be reduced to vecmul, but I have not fully figured this out. If not, we may need a third ufunc.

Once we have these ufuncs, we can decide what the @ operator should do in terms of them, possibly with some axes manipulation.
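The proposed vecmul (a generalized ufunc with a 1d core, i.e. signature `(n),(n)->()`) can be approximated today with einsum; this is only an illustrative sketch of the intended semantics, not the proposed implementation:

```python
import numpy as np

def vecmul(x, y):
    """Stacked dot product of vectors: a 1d core (n),(n)->(),
    broadcasting over any leading stack dimensions (einsum sketch)."""
    return np.einsum('...i,...i->...', x, y)

# A stack of three length-4 vectors, dotted pairwise with ones.
x = np.arange(12.0).reshape(3, 4)
y = np.ones((3, 4))
print(vecmul(x, y))  # [ 6. 22. 38.]
```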
On Sat, May 30, 2015 at 3:23 PM, Charles R Harris wrote:

> The problem arises when multiplying a stack of matrices times a vector. [...] What we want is a stack of vectors, but we can't have those with our conventions. This makes the result somewhat unexpected. How should we resolve this?

I'm afraid I don't quite understand the issue. Maybe a more specific example of the shapes you have in mind would help? Here's my attempt.

Suppose we have two arrays:
  a with shape (i, j, k)
  b with shape (k,)

Following the logic you describe from PEP 465, for a @ b the shapes transform like so:
  (i, j, k) @ (k, 1) -> (i, j, 1) -> (i, j)

This makes sense to me as a stack of vectors, as long as you are imagining the original stack of matrices as lying along the first dimension. Which, I'll note, is the default behavior for the new np.stack (https://github.com/numpy/numpy/pull/5605).
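Concretely, with a stack built by np.stack along the default first axis, the rows of a @ b are the individual matrix-vector products (the specific matrices here are illustrative):

```python
import numpy as np

# Stack two 3x3 matrices along axis 0 (the np.stack default).
ms = np.stack([np.eye(3), 2 * np.eye(3)])   # shape (2, 3, 3)
v = np.array([1.0, 2.0, 3.0])

out = ms @ v
print(out.shape)  # (2, 3)
print(out)        # [[1. 2. 3.]
                  #  [2. 4. 6.]]
```

Each row matches np.eye(3) @ v and (2 * np.eye(3)) @ v respectively, i.e. a stack of vectors laid out along the first dimension.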
On Wed, Jun 3, 2015 at 2:25 PM, Stephan Hoyer wrote:

> I'm afraid I don't quite understand the issue. Maybe a more specific example of the shapes you have in mind would help? [...] This makes sense to me as a stack of vectors, as long as you are imagining the original stack of matrices as along the first dimension.

Yes, you end up with a stack of vectors, but matmul will interpret them as a matrix (or a stack of matrices). I decided there is nothing to be done there and just documented it as a potential gotcha. The other possibility would be to prohibit, or warn on, stacked matrices and vectors for the `@` operator, but that might limit what some folks want to do.

Chuck
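The gotcha documented here can be made concrete: once the (i, j) result exists, matmul sees a single matrix rather than a stack of vectors, so a further stacked product broadcasts in a surprising way. The shapes below are illustrative, and the einsum line shows one way to get the per-item product instead:

```python
import numpy as np

a = np.ones((2, 3, 4))
b = np.ones(4)
v = a @ b                             # (2, 3): intended as two length-3 vectors

m = np.stack([np.eye(3), np.eye(3)])  # (2, 3, 3): a stack of two matrices

# matmul treats v as one 2x3 matrix and broadcasts it over the stack:
print((v @ m).shape)                  # (2, 2, 3)

# The per-item vector-matrix product you probably wanted:
print(np.einsum('si,sij->sj', v, m).shape)  # (2, 3)
```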
On Wed, Jun 3, 2015 at 1:38 PM, Alexander Belopolsky wrote:

> I think that before tackling the @ operator, we should implement the pure dot of stacks of matrices and dot of stacks of vectors generalized ufuncs. [...] Once we have these ufuncs, we can decide what @ operator should do in terms of them and possibly some axes manipulation.

The `@` operator is done. I originally started with four ufuncs, mulvecvec, mulmatvec, etc., but decided to wait on that until we merged the ufunc and multiarray packages and did some other ufunc work. The matmul function can certainly be upgraded in the future, but it is as good as dot right now, except that it doesn't handle object arrays.

Chuck
On Thu, Jun 4, 2015 at 6:50 PM, Alexander Belopolsky wrote:

> On Wed, Jun 3, 2015 at 5:12 PM, Charles R Harris wrote:
>
> > but is as good as dot right now except it doesn't handle object arrays.
>
> This is a fairly low standard. :-(

Meaning as fast. I expect ufuncs to have more call overhead, and they need to use BLAS to be competitive for float et al.

Chuck
participants (3):
- Alexander Belopolsky
- Charles R Harris
- Stephan Hoyer