[Numpy-discussion] einsum

Wed Jan 26 21:41:02 EST 2011

Nice function, and wonderful that it speeds some tasks up.

some feedback: the following notation is a little counter intuitive to me:
    >>> np.einsum('i...->', a)
    array([50, 55, 60, 65, 70])
    >>> np.sum(a, axis=0)
    array([50, 55, 60, 65, 70])
 Since there is nothing after the ->, I expected a scalar not a vector. I
might suggest 'i...->...'

Just noticed also a typo in the doc:
     order : 'C', 'F', 'A', or 'K'
        Controls the memory layout of the output. 'C' means it should
        be Fortran contiguous. 'F' means it should be Fortran contiguous,
should be changed to
    order : 'C', 'F', 'A', or 'K'
        Controls the memory layout of the output. 'C' means it should
        be C contiguous. 'F' means it should be Fortran contiguous,

Hope this helps,
Jonathan

On Wed, Jan 26, 2011 at 2:27 PM, Mark Wiebe <mwwiebe at gmail.com> wrote:

> I wrote a new function, einsum, which implements Einstein summation
> notation, and I'd like comments/thoughts from people who might be interested
> in this kind of thing.
>
> In testing it, it is also faster than many of NumPy's built-in functions,
> except for dot and inner.  At the bottom of this email you can find the
> documentation blurb I wrote for it, and here are some timings:
>
> In [1]: import numpy as np
> In [2]: a = np.arange(25).reshape(5,5)
>
> In [3]: timeit np.einsum('ii', a)
> 100000 loops, best of 3: 3.45 us per loop
> In [4]: timeit np.trace(a)
> 100000 loops, best of 3: 9.8 us per loop
>
> In [5]: timeit np.einsum('ii->i', a)
> 1000000 loops, best of 3: 1.19 us per loop
> In [6]: timeit np.diag(a)
> 100000 loops, best of 3: 7 us per loop
>
> In [7]: b = np.arange(30).reshape(5,6)
>
> In [8]: timeit np.einsum('ij,jk', a, b)
> 10000 loops, best of 3: 11.4 us per loop
> In [9]: timeit np.dot(a, b)
> 100000 loops, best of 3: 2.8 us per loop
>
> In [10]: a = np.arange(10000.)
>
> In [11]: timeit np.einsum('i->', a)
> 10000 loops, best of 3: 22.1 us per loop
> In [12]: timeit np.sum(a)
> 10000 loops, best of 3: 25.5 us per loop
>
> -Mark
>
> The documentation:
>
>     einsum(subscripts, *operands, out=None, dtype=None, order='K',
> casting='safe')
>
>     Evaluates the Einstein summation convention on the operands.
>
>     Using the Einstein summation convention, many common multi-dimensional
>     array operations can be represented in a simple fashion.  This function
>     provides a way compute such summations.
>
>     The best way to understand this function is to try the examples below,
>     which show how many common NumPy functions can be implemented as
>     calls to einsum.
>
>     The subscripts string is a comma-separated list of subscript labels,
>     where each label refers to a dimension of the corresponding operand.
>     Repeated subscripts labels in one operand take the diagonal.  For
> example,
>     ``np.einsum('ii', a)`` is equivalent to ``np.trace(a)``.
>
>     Whenever a label is repeated, it is summed, so ``np.einsum('i,i', a,
> b)``
>     is equivalent to ``np.inner(a,b)``.  If a label appears only once,
>     it is not summed, so ``np.einsum('i', a)`` produces a view of ``a``
>     with no changes.
>
>     The order of labels in the output is by default alphabetical.  This
>     means that ``np.einsum('ij', a)`` doesn't affect a 2D array, while
>     ``np.einsum('ji', a)`` takes its transpose.
>
>     The output can be controlled by specifying output subscript labels
>     as well.  This specifies the label order, and allows summing to be
>     disallowed or forced when desired.  The call ``np.einsum('i->', a)``
>     is equivalent to ``np.sum(a, axis=-1)``, and
>     ``np.einsum('ii->i', a)`` is equivalent to ``np.diag(a)``.
>
>     It is also possible to control how broadcasting occurs using
>     an ellipsis.  To take the trace along the first and last axes,
>     you can do ``np.einsum('i...i', a)``, or to do a matrix-matrix
>     product with the left-most indices instead of rightmost, you can do
>     ``np.einsum('ij...,jk...->ik...', a, b)``.
>
>     When there is only one operand, no axes are summed, and no output
>     parameter is provided, a view into the operand is returned instead
>     of a new array.  Thus, taking the diagonal as ``np.einsum('ii->i', a)``
>     produces a view.
>
>     Parameters
>     ----------
>     subscripts : string
>         Specifies the subscripts for summation.
>     operands : list of array_like
>         These are the arrays for the operation.
>     out : None or array
>         If provided, the calculation is done into this array.
>     dtype : None or data type
>         If provided, forces the calculation to use the data type specified.
>         Note that you may have to also give a more liberal ``casting``
>         parameter to allow the conversions.
>     order : 'C', 'F', 'A', or 'K'
>         Controls the memory layout of the output. 'C' means it should
>         be Fortran contiguous. 'F' means it should be Fortran contiguous,
>         'A' means it should be 'F' if the inputs are all 'F', 'C'
> otherwise.
>         'K' means it should be as close to the layout as the inputs as
>         is possible, including arbitrarily permuted axes.
>     casting : 'no', 'equiv', 'safe', 'same_kind', 'unsafe'
>         Controls what kind of data casting may occur.  Setting this to
>         'unsafe' is not recommended, as it can adversely affect
> accumulations.
>         'no' means the data types should not be cast at all. 'equiv' means
>         only byte-order changes are allowed. 'safe' means only casts
>         which can preserve values are allowed. 'same_kind' means only
>         safe casts or casts within a kind, like float64 to float32, are
>         allowed.  'unsafe' means any data conversions may be done.
>
>     Returns
>     -------
>     output : ndarray
>         The calculation based on the Einstein summation convention.
>
>     See Also
>     --------
>     dot, inner, outer, tensordot
>
>
>     Examples
>     --------
>
>     >>> a = np.arange(25).reshape(5,5)
>     >>> b = np.arange(5)
>     >>> c = np.arange(6).reshape(2,3)
>
>     >>> np.einsum('ii', a)
>     60
>     >>> np.trace(a)
>     60
>
>     >>> np.einsum('ii->i', a)
>     array([ 0,  6, 12, 18, 24])
>     >>> np.diag(a)
>     array([ 0,  6, 12, 18, 24])
>
>     >>> np.einsum('ij,j', a, b)
>     array([ 30,  80, 130, 180, 230])
>     >>> np.dot(a, b)
>     array([ 30,  80, 130, 180, 230])
>
>     >>> np.einsum('ji', c)
>     array([[0, 3],
>            [1, 4],
>            [2, 5]])
>     >>> c.T
>     array([[0, 3],
>            [1, 4],
>            [2, 5]])
>
>     >>> np.einsum(',', 3, c)
>     array([[ 0,  3,  6],
>            [ 9, 12, 15]])
>     >>> np.multiply(3, c)
>     array([[ 0,  3,  6],
>            [ 9, 12, 15]])
>
>     >>> np.einsum('i,i', b, b)
>     30
>     >>> np.inner(b,b)
>     30
>
>     >>> np.einsum('i,j', np.arange(2)+1, b)
>     array([[0, 1, 2, 3, 4],
>            [0, 2, 4, 6, 8]])
>     >>> np.outer(np.arange(2)+1, b)
>     array([[0, 1, 2, 3, 4],
>            [0, 2, 4, 6, 8]])
>
>     >>> np.einsum('i...->', a)
>     array([50, 55, 60, 65, 70])
>     >>> np.sum(a, axis=0)
>     array([50, 55, 60, 65, 70])
>
>     >>> a = np.arange(60.).reshape(3,4,5)
>     >>> b = np.arange(24.).reshape(4,3,2)
>     >>> np.einsum('ijk,jil->kl', a, b)
>     array([[ 4400.,  4730.],
>            [ 4532.,  4874.],
>            [ 4664.,  5018.],
>            [ 4796.,  5162.],
>            [ 4928.,  5306.]])
>     >>> np.tensordot(a,b, axes=([1,0],[0,1]))
>     array([[ 4400.,  4730.],
>            [ 4532.,  4874.],
>            [ 4664.,  5018.],
>            [ 4796.,  5162.],
>            [ 4928.,  5306.]])
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>

-- 
Jonathan Rocher,
Enthought, Inc.
jrocher at enthought.com
1-512-536-1057
http://www.enthought.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20110126/0a03a976/attachment.html>