[Numpy-discussion] Proposal: Chaining np.dot with mdot helper function
Stefan Otte
stefan.otte at gmail.com
Tue Feb 18 04:17:56 EST 2014
Just to give an idea of the performance implications, I timed the
operations on my machine:
%timeit reduce(dotp, [x, v, x.T, y]).shape
1 loops, best of 3: 1.32 s per loop
%timeit reduce(dotTp, [x, v, x.T, y][::-1]).shape
1000 loops, best of 3: 394 µs per loop
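A rough count of scalar multiplications is consistent with that gap. The sketch below assumes a plain (m, k) by (k, n) product costs m * k * n multiplications and uses the shapes from the quoted session further down; `chain_cost` is a hypothetical helper written for illustration, not part of numpy.

```python
def chain_cost(shapes):
    """Scalar multiplications for a strict left-to-right matrix product.

    Assumes an (m, k) by (k, n) product costs m * k * n multiplications.
    """
    rows, inner = shapes[0]
    total = 0
    for next_rows, next_cols in shapes[1:]:
        if inner != next_rows:
            raise ValueError("shapes are not aligned")
        total += rows * inner * next_cols
        inner = next_cols
    return total


nobs = 10000
shapes = [(nobs, 4), (4, 4), (4, nobs), (nobs, 3)]  # x, v, x.T, y

# Left to right, as in reduce(dotp, [x, v, x.T, y])
left_to_right = chain_cost(shapes)

# Right to left: reverse the chain and transpose each shape,
# as in reduce(dotTp, [x, v, x.T, y][::-1])
right_to_left = chain_cost([(n, m) for (m, n) in reversed(shapes)])

print(left_to_right)   # 700160000
print(right_to_left)   # 240048
```

The ratio is roughly 2900, the same order of magnitude as the measured 1.32 s versus 394 µs.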
I was just interested in nicer formulas, but if the side effect is a
performance improvement I can live with that.
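For reference, here is a minimal sketch of such a helper, assuming it simply folds np.dot over its arguments. `mdot_right` is a hypothetical variant added only to show that grouping from the right gives the same result, and the array sizes are kept small so that it runs quickly.

```python
from functools import reduce

import numpy as np


def mdot(*arrays):
    """Chain np.dot over the arguments, strictly left to right."""
    return reduce(np.dot, arrays)


def mdot_right(*arrays):
    """Same product, grouped right to left: A.dot(B.dot(C.dot(D)))."""
    return reduce(lambda acc, a: np.dot(a, acc), reversed(arrays))


rng = np.random.default_rng(0)
nobs = 200  # small on purpose; the session quoted below uses 10000
x = rng.standard_normal((nobs, 4))
v = np.eye(4)
y = rng.standard_normal((nobs, 3))

left = mdot(x, v, x.T, y)         # builds an intermediate (nobs, nobs) array
right = mdot_right(x, v, x.T, y)  # intermediates are at most (nobs, 3)

print(left.shape, np.allclose(left, right))
```

Both groupings compute the same product up to floating-point rounding; only the sizes of the intermediate arrays differ.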
Pauli Virtanen posted a link in the issue to an older discussion on the mailing list:
http://thread.gmane.org/gmane.comp.python.numeric.general/14288/
Best regards,
Stefan
On Tue, Feb 18, 2014 at 12:52 AM, <josef.pktd at gmail.com> wrote:
> On Mon, Feb 17, 2014 at 4:57 PM, <josef.pktd at gmail.com> wrote:
>> On Mon, Feb 17, 2014 at 4:39 PM, Stefan Otte <stefan.otte at gmail.com> wrote:
>>> Hey guys,
>>>
>>> I wrote myself a little helper function `mdot` which chains np.dot for
>>> multiple arrays. So I can write
>>>
>>> mdot(A, B, C, D, E)
>>>
>>> instead of these
>>>
>>> A.dot(B).dot(C).dot(D).dot(E)
>>> np.dot(np.dot(np.dot(np.dot(A, B), C), D), E)
>>>
>>> I know you can use `numpy.matrix` to get nicer formulas. However, most
>>> numpy/scipy functions return arrays instead of numpy.matrix. Therefore,
>>> you sometimes end up doing elementwise array multiplication when you
>>> think you are doing matrix multiplication. `mdot` is a simple way to
>>> improve readability without using numpy.matrix.
>>>
>>> What do you think? Is this useful and worth integrating into numpy?
>>>
>>>
>>> I already created an issue for this:
>>> https://github.com/numpy/numpy/issues/4311
>>>
>>> jaimefrio also suggested reordering the arrays to minimize
>>> computation:
>>> https://github.com/numpy/numpy/issues/4311#issuecomment-35295857
>>
>> statsmodels has a convenience chaindot, but most of the time I don't
>> like its usage, because of the missing brackets.
>>
>> Say you have a (10000, 10) array and you end up with an intermediate
>> (10000, 10000) array instead of a (10, 10) array.
>
> >>> nobs = 10000
> >>> v = np.diag(np.ones(4))
> >>> x = np.random.randn(nobs, 4)
> >>> y = np.random.randn(nobs, 3)
> >>> reduce(np.dot, [x, v, x.T, y]).shape
>
>
> >>> def dotp(x, y):
>         xy = np.dot(x,y)
>         print xy.shape
>         return xy
>
> >>> reduce(dotp, [x, v, x.T, y]).shape
> (10000, 4)
> (10000, 10000)
> (10000, 3)
> (10000, 3)
>
> >>> def dotTp(x, y):
>         xy = np.dot(x.T,y.T)
>         print xy.shape
>         return xy.T
>
> >>> reduce(dotTp, [x, v, x.T, y][::-1]).shape
> (3, 4)
> (3, 4)
> (3, 10000)
> (10000, 3)
>
> Josef
>
>>
>> IIRC, for reordering I looked at this
>> http://www.mathworks.com/matlabcentral/fileexchange/27950-mmtimes-matrix-chain-product
>>
>> Josef
>> (don't make it too easy for people to shoot themselves in ...)
>>
>>>
>>>
>>> Best,
>>> Stefan
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion