Dot/inner products with broadcasting?
Hi!

How can I compute a dot product (or similar multiply-and-sum operations) efficiently so that broadcasting is utilized? For multidimensional arrays, NumPy's inner and dot functions do not match the leading axes and use broadcasting; instead, the result has first the leading axes of the first input array and then the leading axes of the second input array.

For instance, I would like to compute the following inner product:

    np.sum(A*B, axis=-1)

But numpy.inner gives:

    A = np.random.randn(2,3,4)
    B = np.random.randn(3,4)
    np.inner(A,B).shape
    # -> (2, 3, 3) instead of (2, 3)

Similarly, for the dot product I would like to compute, for instance:

    np.sum(A[...,:,:,np.newaxis]*B[...,np.newaxis,:,:], axis=-2)

But numpy.dot gives:

    In [12]: A = np.random.randn(2,3,4); B = np.random.randn(2,4,5)
    In [13]: np.dot(A,B).shape
    # -> (2, 3, 2, 5) instead of (2, 3, 5)

I could use einsum for these operations, but I'm not sure whether that is as efficient as using BLAS-supported(?) dot products.

I couldn't find any function that can perform this kind of operation. NumPy's functions seem to either flatten the input arrays (vdot, outer) or use the axes of the input arrays separately (dot, inner, tensordot).

Any help?

Best regards,
Jaakko
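For reference, both desired results can be written with einsum; a minimal sketch using the shapes from the examples above:

```python
import numpy as np

# Broadcasting inner product: sum over the shared last axis,
# broadcasting all leading axes.
A = np.random.randn(2, 3, 4)
B = np.random.randn(3, 4)
inner = np.einsum('...i,...i->...', A, B)
print(inner.shape)  # (2, 3), whereas np.inner(A, B).shape is (2, 3, 3)

# Broadcasting matrix product: contract the last axis of A with the
# second-to-last axis of B, broadcasting the leading axes.
A2 = np.random.randn(2, 3, 4)
B2 = np.random.randn(2, 4, 5)
prod = np.einsum('...ik,...kj->...ij', A2, B2)
print(prod.shape)  # (2, 3, 5), whereas np.dot(A2, B2).shape is (2, 3, 2, 5)
```

Whether this matches BLAS-backed dot in speed is exactly the question raised below.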
Answering to myself, this pull request seems to implement an inner product with broadcasting (inner1d) and many other useful functions:

https://github.com/numpy/numpy/pull/2954/

J

On 03/13/2013 04:21 PM, Jaakko Luttinen wrote:
In fact, there is already an inner1d implemented in numpy.core.umath_tests.inner1d:

    from numpy.core.umath_tests import inner1d

It should do the trick :)

On Thu, Mar 14, 2013 at 12:54 PM, Jaakko Luttinen <jaakko.luttinen@aalto.fi> wrote:
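inner1d sums the elementwise product over the last axis of both inputs while broadcasting the leading axes. A sketch with a fallback, since numpy.core.umath_tests was a private module and has been removed from recent NumPy releases (the einsum equivalent is my assumption of safe modern spelling):

```python
import numpy as np

try:
    # Private test module; present in 2013-era NumPy, removed later.
    from numpy.core.umath_tests import inner1d
except ImportError:
    # Equivalent fallback: broadcasted sum-product over the last axis.
    def inner1d(a, b):
        return np.einsum('...i,...i->...', a, b)

A = np.random.randn(2, 3, 4)
B = np.random.randn(3, 4)
print(inner1d(A, B).shape)  # (2, 3)
print(np.allclose(inner1d(A, B), np.sum(A * B, axis=-1)))  # True
```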
I tried using this inner1d as an alternative to dot because it uses broadcasting. However, I found something surprising: not only is inner1d much, much slower than dot, it is also slower than einsum, which is much more general:

    In [68]: import numpy as np
    In [69]: import numpy.core.gufuncs_linalg as gula
    In [70]: K = np.random.randn(1000,1000)

    In [71]: %timeit gula.inner1d(K[:,np.newaxis,:], np.swapaxes(K,-1,-2)[np.newaxis,:,:])
    1 loops, best of 3: 6.05 s per loop

    In [72]: %timeit np.dot(K,K)
    1 loops, best of 3: 392 ms per loop

    In [73]: %timeit np.einsum('ik,kj->ij', K, K)
    1 loops, best of 3: 1.24 s per loop

Why is it so? I thought that the performance of inner1d would be somewhere in between dot and einsum, probably closer to dot. Now I don't see any reason to use inner1d instead of einsum.

Jaakko

On 03/15/2013 04:22 PM, Oscar Villellas wrote:
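As a historical side note, later NumPy releases (1.10 and up) added np.matmul (the @ operator in Python 3.5+), which broadcasts leading axes exactly the way asked for in this thread; a sketch:

```python
import numpy as np

A = np.random.randn(2, 3, 4)
B = np.random.randn(2, 4, 5)

# np.matmul broadcasts the leading axes and contracts the last axis
# of A against the second-to-last axis of B.
C = np.matmul(A, B)  # also spelled A @ B
print(C.shape)  # (2, 3, 5)

# Same result as the explicit broadcast-multiply-and-sum from the
# original question.
expected = np.sum(A[..., :, :, np.newaxis] * B[..., np.newaxis, :, :],
                  axis=-2)
print(np.allclose(C, expected))  # True
```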
Well, thanks to seberg, I finally noticed that there is a dot product function in this new module numpy.core.gufuncs_linalg; it was just named differently (matrix_multiply instead of dot).

However, I may have found a bug in it:

    import numpy.core.gufuncs_linalg as gula
    A = np.arange(2*2).reshape((2,2))
    B = np.arange(2*1).reshape((2,1))
    gula.matrix_multiply(A, B)
    # -> ValueError: On entry to DGEMM parameter number 10 had an illegal value

Jaakko

On 03/20/2013 03:33 PM, Jaakko Luttinen wrote:
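For comparison, the same shapes work fine with plain np.dot, so the expected result is unambiguous; a sketch (that the error stems from how matrix_multiply sets up the DGEMM call is my assumption):

```python
import numpy as np

A = np.arange(2 * 2).reshape((2, 2))  # [[0, 1], [2, 3]]
B = np.arange(2 * 1).reshape((2, 1))  # [[0], [1]]

# A (2,2) times B (2,1) is a perfectly valid matrix product, and
# np.dot computes it without complaint, which suggests the ValueError
# above is a bug in matrix_multiply rather than an invalid input.
print(np.dot(A, B))
# [[1]
#  [3]]
```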
Reproduced it. I will take a look at it. That error comes directly from BLAS and shouldn't be happening. I will also look at why inner1d is not performing well. Note: inner1d is implemented with calls to BLAS (dot).

I will get back to you later :)

On Wed, Mar 20, 2013 at 4:10 PM, Jaakko Luttinen <jaakko.luttinen@aalto.fi> wrote:
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
participants (2):
- Jaakko Luttinen
- Oscar Villellas