[Numpy-discussion] Looking for description/insight/documentation on matmul

jeff saremi jeffsaremi at hotmail.com
Tue Jul 10 11:50:33 EDT 2018

Thanks a lot Matti. It makes a lot more sense now.
From: NumPy-Discussion <numpy-discussion-bounces+jeffsaremi=hotmail.com at python.org> on behalf of Matti Picus <matti.picus at gmail.com>
Sent: Monday, July 9, 2018 10:54 AM
To: numpy-discussion at python.org
Subject: Re: [Numpy-discussion] Looking for description/insight/documentation on matmul

On 09/07/18 09:48, jeff saremi wrote:
> Is there any resource available or anyone who's able to describe
> matmul operation of matrices when n > 2?
> The only description i can find is: "If either argument is N-D, N > 2,
> it is treated as a stack of matrices residing in the last two indexes
> and broadcast accordingly." which is very cryptic to me.
> Could someone break this down please?
> when a [2 3 5 6] is multiplied by a [7 8 9] what are the resulting
> dimensions? is there one answer to that? Is it deterministic?
> What does "residing in the last two indices" mean? What is broadcast
> and where?
> thanks
> jeff

You could do

np.matmul(np.ones((2, 3, 4, 5, 6)), np.ones((2, 3, 4, 6, 7))).shape

which yields (2, 3, 4, 5, 7).

When ndim >= 2 in both operands, matmul uses the last two dimensions as
(..., n, m) @ (..., m, p) -> (..., n, p). Note the repeating "m", so
your example would not work: n1=5, m1=6 in the first operand and m2=8,
p2=9 in the second, so m1 != m2.
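
To check the rule above against the shapes from the question, something
like the following sketch should do. The helper name matmul_shape is made
up for this illustration and is not part of NumPy; it only wraps np.matmul
and reports None when the shapes are rejected.

```python
import numpy as np

def matmul_shape(a_shape, b_shape):
    """Hypothetical helper: result shape of np.matmul, or None if it raises."""
    try:
        return np.matmul(np.ones(a_shape), np.ones(b_shape)).shape
    except ValueError:
        # np.matmul raises ValueError when the shapes are incompatible
        return None

# Inner dimensions agree (m = 6 in both operands), batch dims (2, 3) match.
ok = matmul_shape((2, 3, 5, 6), (2, 3, 6, 9))

# The shapes from the question: m1 = 6 but m2 = 8, so this is rejected.
bad = matmul_shape((2, 3, 5, 6), (7, 8, 9))

print(ok)   # (2, 3, 5, 9)
print(bad)  # None
```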

The "broadcast" refers only to the "..." dimensions. If in either of the
operands you replace the 2 or 3 or 4 with 1, then that operand will
broadcast (repeat itself) across that dimension to fit the other
operand. Also, if leading "..." dimensions are missing in one of the
operands, it will broadcast across them.
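
A short sketch of that broadcasting, reusing the (2, 3, 4, 5, 6) operand
from the earlier example. The variable names are only for illustration.
Broadcasting aligns the "..." dimensions from the right, so it is the
leading ones that may be left out.

```python
import numpy as np

a = np.ones((2, 3, 4, 5, 6))   # batch dims (2, 3, 4), then n=5, m=6

# A length-1 batch dimension is repeated to match the other operand.
s1 = np.matmul(a, np.ones((2, 1, 4, 6, 7))).shape

# Missing leading batch dimensions behave as if they had length 1.
s2 = np.matmul(a, np.ones((4, 6, 7))).shape
s3 = np.matmul(a, np.ones((6, 7))).shape

print(s1, s2, s3)   # each is (2, 3, 4, 5, 7)
```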

When ndim < 2 for one of the operands only, its single dimension will be
interpreted as "m", and the other dimension "n" or "p" will not appear
in the output, so the signature is
(..., n, m), (m) -> (..., n) or (m), (..., m, p) -> (..., p).

When ndim < 2 for both of the operands, it is the same as a dot product
and will produce a scalar.
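
The 1-D cases can be checked the same way. This is only a sketch with
made-up variable names, using arrays of ones so the values are easy to
predict.

```python
import numpy as np

vec = np.ones(6)                       # 1-D operand, interpreted as "m"

# (..., n, m), (m) -> (..., n): the "m" axis is summed away.
s_right = np.matmul(np.ones((2, 3, 4, 5, 6)), vec).shape

# (m), (..., m, p) -> (..., p): the "m" axis is summed away.
s_left = np.matmul(vec, np.ones((2, 3, 4, 6, 7))).shape

# Both operands 1-D: an ordinary dot product, giving a scalar.
both = np.matmul(vec, vec)

print(s_right)   # (2, 3, 4, 5)
print(s_left)    # (2, 3, 4, 7)
print(both)      # 6.0
```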

You didn't ask, but I will complete the picture: np.dot is different
when the operands have more than two dimensions. The result concatenates
both sets of "..." dimensions instead of broadcasting them against each
other, so

np.dot(np.ones((2,3,4,5,6)), np.ones((8, 9, 6, 7))).shape

which yields (2, 3, 4, 5, 8, 9, 7). The (2, 3, 4) dimensions and n=5
from the first operand are followed by (8, 9) and p=7 from the second.
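
A quick way to confirm the np.dot shape, and to see that np.matmul
rejects the same pair of operands because their "..." dimensions do not
broadcast. The variable names are only for illustration.

```python
import numpy as np

a = np.ones((2, 3, 4, 5, 6))
b = np.ones((8, 9, 6, 7))

# np.dot sums over the last axis of a and the second-to-last axis of b,
# and keeps every other axis of both operands.
dot_shape = np.dot(a, b).shape
expected = a.shape[:-1] + b.shape[:-2] + b.shape[-1:]

# np.matmul would need (2, 3, 4) and (8, 9) to broadcast, and they do not.
try:
    np.matmul(a, b)
    matmul_ok = True
except ValueError:
    matmul_ok = False

print(dot_shape)   # (2, 3, 4, 5, 8, 9, 7)
print(matmul_ok)   # False
```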
