On Wed, Apr 30, 2008 at 9:16 PM, Anne Archibald wrote:
2008/4/30 Charles R Harris:
Some operations on stacks of small matrices are easy to get, for instance +, -, *, /, and matrix multiply. The last is the interesting one. If A and B are stacks of matrices with the same number of dimensions, with the matrices stored in the last two indices, then
sum(A[...,:,:,newaxis]*B[...,newaxis,:,:], axis=-2)
is the matrix-wise multiplication of the two stacks. If B is replaced by a stack of 1D vectors, x, it is even simpler:
sum(A[...,:,:]*x[...,newaxis,:], axis=-1)
This doesn't go through BLAS, but for large stacks of small matrices it might be even faster than BLAS because BLAS is kinda slow for small matrices.
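A quick sanity check of both expressions above, comparing them against an explicit loop over dot() (the stack size and matrix shapes here are made up for illustration):

```python
import numpy as np
from numpy import newaxis

A = np.random.rand(500, 3, 4)   # stack of 500 (3, 4) matrices
B = np.random.rand(500, 4, 2)   # stack of 500 (4, 2) matrices
x = np.random.rand(500, 4)      # stack of 500 length-4 vectors

# Stacked matrix-matrix multiply via broadcasting:
# (500, 3, 4, 1) * (500, 1, 4, 2) -> (500, 3, 4, 2), summed over the
# shared inner dimension (axis=-2) -> (500, 3, 2).
C = np.sum(A[..., :, :, newaxis] * B[..., newaxis, :, :], axis=-2)

# Stacked matrix-vector multiply:
# (500, 3, 4) * (500, 1, 4) -> (500, 3, 4), summed over axis=-1 -> (500, 3).
y = np.sum(A[..., :, :] * x[..., newaxis, :], axis=-1)

# Reference: one dot() per matrix in the stack.
C_ref = np.array([np.dot(a, b) for a, b in zip(A, B)])
y_ref = np.array([np.dot(a, v) for a, v in zip(A, x)])
assert np.allclose(C, C_ref) and np.allclose(y, y_ref)
```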
Yes and no. For the first operation, you have to create a temporary that is larger than either of the two input arrays. These invisible (potentially gigantic) temporaries are the sort of thing that puzzles users: as their problem size grows, they suddenly hit a massive slowdown because the machine starts swapping to disk, and then a failure because the temporary can't be allocated. This is one reason we have dot() and tensordot() even though they can be expressed like this. (The other is of course that it lets us use optimized BLAS.)
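To put a number on that temporary: for stacks of shape (s, n, m) and (s, m, p), the broadcast product materializes an (s, n, m, p) array before the sum collapses it, so the intermediate is m times larger than the output. A back-of-the-envelope sketch (the shapes are invented for illustration):

```python
# Stack of 100000 (8, 8) matrices multiplied by a stack of (8, 8) matrices.
s, n, m, p = 100000, 8, 8, 8

# Elements in the broadcast product A[..., :, :, None] * B[..., None, :, :]:
tmp_elems = s * n * m * p   # 51.2 million elements before the sum
# Elements in the final (s, n, p) result:
out_elems = s * n * p       # 6.4 million elements

# The temporary is m times the size of the result; a dot()-based path
# never needs to allocate it.
assert tmp_elems == m * out_elems
```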
But it is interesting that you can multiply stacks of matrices that way, is it not? I haven't seen it mentioned elsewhere.

Chuck