[Numpy-discussion] Proposal: Chaining np.dot with mdot helper function

Tue Feb 18 04:17:56 EST 2014

Just to give an idea of the performance implications, I timed the two
versions from Josef's example (quoted below) on my machine:

%timeit reduce(dotp, [x, v, x.T, y]).shape
1 loops, best of 3: 1.32 s per loop

%timeit reduce(dotTp, [x, v, x.T, y][::-1]).shape
1000 loops, best of 3: 394 µs per loop
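
A rough way to see where the gap comes from, without timing anything:
multiplying an (m, k) array by a (k, n) array takes about m*k*n scalar
multiplications. The sketch below applies that count to the shapes from
Josef's example. `chain_cost` is a hypothetical helper written for this
illustration only; it is not part of numpy, and it assumes all operands
are 2-D.

```python
def chain_cost(shapes, right_to_left=False):
    """Count scalar multiplications for a strictly sequential chain of 2-D products."""
    shapes = list(shapes)
    total = 0
    if right_to_left:
        # Start from the last operand and multiply from the left each step.
        acc = shapes[-1]
        for m, k in reversed(shapes[:-1]):
            if k != acc[0]:
                raise ValueError("shapes are not aligned")
            total += m * k * acc[1]
            acc = (m, acc[1])
    else:
        # Start from the first operand and multiply from the right each step.
        acc = shapes[0]
        for k, n in shapes[1:]:
            if acc[1] != k:
                raise ValueError("shapes are not aligned")
            total += acc[0] * k * n
            acc = (acc[0], n)
    return total


nobs = 10000
shapes = [(nobs, 4), (4, 4), (4, nobs), (nobs, 3)]  # x, v, x.T, y
left = chain_cost(shapes)
right = chain_cost(shapes, right_to_left=True)
print(left, right)
```

Under this simple model the left-to-right chain needs several thousand
times more multiplications than the right-to-left one, which is the same
order of magnitude as the ratio of the two timings above. The model
ignores memory traffic and BLAS details, so it is only an estimate.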

I was just interested in nicer formulas, but if the side effect is a
performance improvement I can live with that.
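
To make the idea concrete, here is a minimal sketch of what such a helper
could look like. It is an illustration under stated assumptions, not the
implementation proposed in the issue: `mdot` here is just a left-to-right
reduce over `np.dot`, and `mdot_ordered` is a hypothetical variant that
picks the parenthesization with the fewest scalar multiplications using
the textbook matrix-chain dynamic program. Both assume every argument is
a 2-D array with compatible shapes.

```python
from functools import reduce

import numpy as np


def mdot(*arrays):
    """Chain np.dot left to right over 2-D arrays (hypothetical helper)."""
    return reduce(np.dot, arrays)


def mdot_ordered(*arrays):
    """Like mdot, but evaluate in the order with the fewest multiplications."""
    n = len(arrays)
    # dims[i] and dims[i + 1] are the row and column counts of arrays[i].
    dims = [a.shape[0] for a in arrays] + [arrays[-1].shape[1]]
    # cost[i][j]: minimal multiplication count for arrays[i..j].
    # split[i][j]: index k where that optimal product is cut in two.
    cost = [[0] * n for _ in range(n)]
    split = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            cost[i][j], split[i][j] = min(
                (cost[i][k] + cost[k + 1][j]
                 + dims[i] * dims[k + 1] * dims[j + 1], k)
                for k in range(i, j)
            )

    def multiply(i, j):
        # Recursively evaluate arrays[i..j] following the stored cuts.
        if i == j:
            return arrays[i]
        k = split[i][j]
        return np.dot(multiply(i, k), multiply(k + 1, j))

    return multiply(0, n - 1)


rng = np.random.RandomState(0)
x = rng.randn(200, 4)
v = np.eye(4)
y = rng.randn(200, 3)
a = mdot(x, v, x.T, y)
b = mdot_ordered(x, v, x.T, y)
print(a.shape, np.allclose(a, b))
```

Both functions return the same product up to floating-point rounding; the
ordered variant simply avoids building the large (nobs, nobs)
intermediate when a cheaper evaluation order exists.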

Pauli Virtanen posted a link in the issue to an older discussion on the mailing list:
http://thread.gmane.org/gmane.comp.python.numeric.general/14288/

Best regards,
 Stefan

On Tue, Feb 18, 2014 at 12:52 AM,  <josef.pktd at gmail.com> wrote:
> On Mon, Feb 17, 2014 at 4:57 PM,  <josef.pktd at gmail.com> wrote:
>> On Mon, Feb 17, 2014 at 4:39 PM, Stefan Otte <stefan.otte at gmail.com> wrote:
>>> Hey guys,
>>>
>>> I wrote myself a little helper function `mdot` which chains np.dot for
>>> multiple arrays. So I can write
>>>
>>>     mdot(A, B, C, D, E)
>>>
>>> instead of these
>>>
>>>     A.dot(B).dot(C).dot(D).dot(E)
>>>     np.dot(np.dot(np.dot(np.dot(A, B), C), D), E)
>>>
>>> I know you can use `numpy.matrix` to get nicer formulas. However, most
>>> numpy/scipy functions return arrays instead of numpy.matrix. As a result,
>>> you sometimes end up doing elementwise array multiplication when you think
>>> you are doing matrix multiplication. `mdot` is a simple way to improve
>>> readability without using numpy.matrix.
>>>
>>> What do you think? Is this useful and worth integrating into numpy?
>>>
>>>
>>> I already created an issue for this:
>>> https://github.com/numpy/numpy/issues/4311
>>>
>>> jaimefrio also suggested choosing the order in which the products are
>>> evaluated so as to minimize computation:
>>> https://github.com/numpy/numpy/issues/4311#issuecomment-35295857
>>
>> statsmodels has a convenience function, chaindot, but most of the time I
>> don't like its usage, because of the missing brackets.
>>
>> Say you have a (10000, 10) array and you end up using an intermediate
>> (10000, 10000) array instead of a (10, 10) array.
>
>>>> nobs = 10000
>>>> v = np.diag(np.ones(4))
>>>> x = np.random.randn(nobs, 4)
>>>> y = np.random.randn(nobs, 3)
>>>> reduce(np.dot, [x, v, x.T, y]).shape
>
>
>>>> def dotp(x, y):
> ...     xy = np.dot(x, y)
> ...     print(xy.shape)  # show the shape of each intermediate product
> ...     return xy
>
>>>> reduce(dotp, [x, v, x.T, y]).shape
> (10000, 4)
> (10000, 10000)
> (10000, 3)
> (10000, 3)
>
>>>> def dotTp(x, y):
> ...     xy = np.dot(x.T, y.T)  # (y . x).T, so the reversed chain works on transposes
> ...     print(xy.shape)
> ...     return xy.T
>
>>>> reduce(dotTp, [x, v, x.T, y][::-1]).shape
> (3, 4)
> (3, 4)
> (3, 10000)
> (10000, 3)
>
> Josef
>
>>
>> IIRC, for reordering I looked at this
>> http://www.mathworks.com/matlabcentral/fileexchange/27950-mmtimes-matrix-chain-product
>>
>> Josef
>> (don't make it too easy for people to shoot themselves in ...)
>>
>>>
>>>
>>> Best,
>>>  Stefan
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion