<p><br>

On Feb 22, 2014 2:03 PM, "Nathaniel Smith" <<a href="mailto:njs@pobox.com">njs@pobox.com</a>> wrote:<br>

><br>

> Hi all,<br>

><br>

> Currently numpy's 'dot' acts a bit weird for ndim>2 or ndim<1. In<br>

> practice this doesn't usually matter much, because these are very<br>

> rarely used. But, I would like to nail down the behaviour so we can<br>

> say something precise in the matrix multiplication PEP. So here's one<br>

> proposal.<br>

><br>

> # CURRENT:<br>

><br>

> dot(0d, any) -> scalar multiplication<br>

> dot(any, 0d) -> scalar multiplication<br>

> dot(1d, 1d) -> inner product<br>

> dot(2d, 1d) -> treat 1d as column matrix, matrix-multiply, then<br>

> discard added axis<br>

> dot(1d, 2d) -> treat 1d as row matrix, matrix-multiply, then discard added axis<br>

> dot(2d, 2d) -> matrix multiply<br>

> dot(2-or-more d, 2-or-more d) -> a complicated outer product thing:<br>

> Specifically, if the inputs have shapes (r, n, m), (s, m, k), then<br>

> numpy returns an array with shape (r, n, s, k), created like:<br>

>     for i in range(r):<br>

>         for j in range(s):<br>

>             output[i, :, j, :] = np.dot(input1[i, :, :], input2[j, :, :])<br>

><br>

> # PROPOSED:<br>

><br>

> General rule: given dot on shape1, shape2, we try to match these<br>

> shapes against two templates like<br>

>   (..., n?, m) and (..., m, k?)<br>

> where ... indicates zero or more dimensions, and ? indicates an<br>

> optional axis. ? axes are always matched before ... axes, so for an<br>

> input with ndim>=2, the ? axis is always matched. An unmatched ? axis<br>

> is treated as having size 1.<br>

><br>

> Next, the ... axes are broadcast against each other in the usual way<br>

> (prepending 1s to make lengths the same, requiring corresponding<br>

> entries to either match or have the value 1).  And then the actual<br>

> computations are performed using the usual broadcasting rules.<br>

><br>

> Finally, we return an output with shape (..., n?, k?). Here "..."<br>

> indicates the result of broadcasting the input ...'s against each<br>

> other. And, n? and k? mean: "either the value taken from the input<br>

> shape, if the corresponding entry was matched -- but if no match was<br>

> made, then we leave this entry out." The idea is that just as a column<br>

> vector on the right is "m x 1", a 1d vector on the right is treated as<br>

> "m x &lt;nothing&gt;". For purposes of actually computing the product,<br>

> &lt;nothing&gt; acts like 1, as mentioned above. But it makes a difference<br>

> in what we return: in each of these cases we copy the input shape into<br>

> the output, so we can get an output with shape (n, &lt;nothing&gt;), or<br>

> (&lt;nothing&gt;, k), or (&lt;nothing&gt;, &lt;nothing&gt;), which work out to be (n,),<br>

> (k,) and (), respectively. This gives a (somewhat) intuitive principle<br>

> for why dot(1d, 1d), dot(1d, 2d), dot(2d, 1d) are handled the way they<br>

> are, and a general template for extending this behaviour to other<br>

> operations like gufunc 'solve'.<br>

><br>

> Anyway, the end result of this is that the PROPOSED behaviour differs<br>

> from the current behaviour in the following ways:<br>

> - passing 0d arrays to 'dot' becomes an error. (This in particular is<br>

> an important thing to know, because if core Python adds an operator<br>

> for 'dot', then we must decide what it should do for Python scalars,<br>

> which are logically 0d.)<br>

> - ndim>2 arrays are now handled by aligning and broadcasting the extra<br>

> axes, instead of taking an outer product. So dot((r, m, n), (r, n, k))<br>

> returns (r, m, k), not (r, m, r, k).<br>

><br>

> Comments?</p>
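<p>The template-matching rule in the proposal can be read as a pure shape calculation. What follows is a minimal sketch in plain Python under that reading, not NumPy code: the helper names <code>broadcast_shapes</code> and <code>proposed_dot_shape</code> are hypothetical and only compute the output shape the proposal would give for two input shapes.</p>

```python
from itertools import zip_longest


def broadcast_shapes(a, b):
    """Broadcast two shape tuples: align trailing axes, pad with 1s."""
    out = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and x != 1 and y != 1:
            raise ValueError(f"cannot broadcast {a} against {b}")
        # If x is 1 the other size wins; otherwise x == y or y == 1.
        out.append(y if x == 1 else x)
    return tuple(reversed(out))


def proposed_dot_shape(shape1, shape2):
    """Output shape under the PROPOSED rule (hypothetical helper)."""
    if len(shape1) == 0 or len(shape2) == 0:
        raise ValueError("0d inputs are an error under the proposal")

    # Left template (..., n?, m): the last axis is m, n is optional.
    m1 = shape1[-1]
    n = shape1[-2:-1]        # (n,) if matched, () if the input is 1d
    batch1 = shape1[:-2]

    # Right template (..., m, k?): k is optional, so a 1d input is just (m,).
    if len(shape2) == 1:
        m2, k, batch2 = shape2[0], (), ()
    else:
        m2, k, batch2 = shape2[-2], shape2[-1:], shape2[:-2]

    if m1 != m2:
        raise ValueError(f"inner dimensions differ: {m1} vs {m2}")

    # Unmatched optional axes are simply left out of the result.
    return broadcast_shapes(batch1, batch2) + n + k


if __name__ == "__main__":
    for s1, s2 in [((3,), (3,)), ((2, 3), (3,)), ((3,), (3, 4)),
                   ((5, 2, 3), (5, 3, 4)), ((5, 2, 3), (3, 4))]:
        print(s1, s2, "->", proposed_dot_shape(s1, s2))
```

<p>In this sketch the 1d cases fall out of the same code path as the general case, which is the point the proposal makes about optional axes.</p>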

<p>The proposed behavior for ndim > 2 is what matrix_multiply (is it still in umath_tests?) does. The nice thing about the proposed new behavior is that the old behavior is easy to reproduce by adjusting the shape of the first argument a little, while the reverse is not true.</p>
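<p>As a rough illustration of that last point, here is a sketch that recovers the outer-product result of <code>np.dot</code> from a broadcasting matrix multiply. It assumes a NumPy version that provides <code>np.matmul</code>, used here as a stand-in for the proposed semantics rather than the <code>matrix_multiply</code> gufunc mentioned above. The array names and sizes are made up for the example.</p>

```python
import numpy as np

# Example inputs with shapes (r, n, m) and (s, m, k); sizes are arbitrary.
a = np.arange(24.0).reshape(2, 3, 4)
b = np.arange(120.0).reshape(5, 4, 6)

# Current behaviour: np.dot pairs every matrix in a with every matrix in b.
# The result has shape (r, n, s, k).
old = np.dot(a, b)

# Broadcasting behaviour: insert a length-1 axis into the first argument so
# its leading axes (r, 1) broadcast against (s,) to give (r, s).
# The result has shape (r, s, n, k).
new = np.matmul(a[:, np.newaxis], b)

# Same numbers, up to the order of the two middle axes.
same = np.allclose(old, new.transpose(0, 2, 1, 3))
print(old.shape, new.shape, same)
```

<p>Going the other way would mean pulling the matching pairs out of the full outer product after computing all of it, which is the asymmetry being pointed out.</p>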


<p>Jaime</p>

<p>><br>

> --<br>

> Nathaniel J. Smith<br>

> Postdoctoral researcher - Informatics - University of Edinburgh<br>

> <a href="http://vorpus.org">http://vorpus.org</a><br>


</p>