[Numpy-discussion] dot product: large speed difference between seemingly identical operations

Sun Oct 18 00:38:04 EDT 2015

The functions dot, matmul and tensordot perform the same when an MxN matrix is multiplied by a length-N vector, but very differently when the matrix is replaced by a PxQxN array. Why?

In [3]: a = rand(1000000,3)

In [4]: a1 = a.reshape(1000,1000,3)

In [5]: w = rand(3)

In [6]: %timeit a.dot(w)
100 loops, best of 3: 3.47 ms per loop

In [7]: %timeit a1.dot(w)  # Very slow!
10 loops, best of 3: 25.5 ms per loop

In [8]: %timeit a @ w
100 loops, best of 3: 3.45 ms per loop

In [9]: %timeit a1 @ w
100 loops, best of 3: 6.77 ms per loop

In [10]: %timeit tensordot(a,w,1)
100 loops, best of 3: 3.44 ms per loop

In [11]: %timeit tensordot(a1,w,1)
100 loops, best of 3: 3.41 ms per loop

BTW, this is not a corner case, since PxQx3 arrays represent RGB images.
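
For anyone who wants to reproduce this, here is a minimal sketch that checks the calls above really compute the same thing on a PxQxN array. It also shows the obvious workaround suggested by the timings: flatten to 2-D, use the fast matrix-vector path, and reshape back. The helper name dot_last_axis is made up for illustration and is not a NumPy function, the array sizes are arbitrary, and whether the workaround actually helps will depend on the NumPy version and BLAS build.

```python
import numpy as np

def dot_last_axis(arr, vec):
    """Hypothetical helper: contract the last axis of arr with vec
    by going through a 2-D matrix-vector product."""
    # For a C-contiguous array this reshape is a view, not a copy.
    flat = arr.reshape(-1, arr.shape[-1])
    return flat.dot(vec).reshape(arr.shape[:-1])

a1 = np.random.rand(100, 120, 3)   # small stand-in for an RGB image
w = np.random.rand(3)

results = {
    "dot": a1.dot(w),
    "matmul": a1 @ w,
    "tensordot": np.tensordot(a1, w, 1),
    "reshape+dot": dot_last_axis(a1, w),
}

# Every variant should give a PxQ array with the same values.
for name, r in results.items():
    print(name, r.shape, np.allclose(r, results["dot"]))
```

Timing each entry with %timeit on the 1000x1000x3 array from the session above should show whether the gap is still present on a given installation.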

  Nadav