Nice function, and wonderful that it speeds some tasks up. <br><br>Some feedback: the following notation is a little counterintuitive to me:<br><div><div> >>> np.einsum('i...->', a)</div><div> array([50, 55, 60, 65, 70])</div>
<div> >>> np.sum(a, axis=0)</div>
<div> array([50, 55, 60, 65, 70])<br>
</div>
Since there is nothing after the ->, I expected a scalar, not a vector. I might suggest 'i...->...'.<div><br>I also noticed a typo in the doc:<br>
order : 'C', 'F', 'A', or 'K'<br>
<div> Controls the memory layout of the output. 'C' means it should</div>
be Fortran contiguous. 'F' means it should be Fortran contiguous,<br>should be changed to <br>
order : 'C', 'F', 'A', or 'K'
<div> Controls the memory layout of the output. 'C' means it should</div>
<div> be C contiguous. 'F' means it should be Fortran contiguous,<br>
</div>
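For what it's worth, here is a sketch of how the suggested explicit spelling would behave (keeping the ellipsis on the right-hand side keeps the broadcast axis in the output):

```python
import numpy as np

a = np.arange(25).reshape(5, 5)

# With '...' kept in the output, only the labeled axis i is summed,
# so the result is the per-column sum, matching np.sum(a, axis=0).
print(np.einsum('i...->...', a))  # [50 55 60 65 70]
print(np.sum(a, axis=0))          # [50 55 60 65 70]
```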
<br></div><br>Hope this helps,<br>Jonathan<br></div><br><div class="gmail_quote">On Wed, Jan 26, 2011 at 2:27 PM, Mark Wiebe <span dir="ltr"><<a href="mailto:mwwiebe@gmail.com" target="_blank">mwwiebe@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">I wrote a new function, einsum, which implements Einstein summation notation, and I'd like comments/thoughts from people who might be interested in this kind of thing.<div>
<br></div><div>In testing it, it is also faster than many of NumPy's built-in functions, except for dot and inner. At the bottom of this email you can find the documentation blurb I wrote for it, and here are some timings:</div>
<div><br></div><div><div>In [1]: import numpy as np</div><div>In [2]: a = np.arange(25).reshape(5,5)</div><div><br></div><div>In [3]: timeit np.einsum('ii', a)</div><div>100000 loops, best of 3: 3.45 us per loop</div>
<div>In [4]: timeit np.trace(a)</div><div>100000 loops, best of 3: 9.8 us per loop</div><div><br></div><div>In [5]: timeit np.einsum('ii->i', a)</div><div>1000000 loops, best of 3: 1.19 us per loop</div><div>In [6]: timeit np.diag(a)</div>
<div>100000 loops, best of 3: 7 us per loop</div><div><br></div><div>In [7]: b = np.arange(30).reshape(5,6)</div><div><br></div><div>In [8]: timeit np.einsum('ij,jk', a, b)</div><div>10000 loops, best of 3: 11.4 us per loop</div>
<div>In [9]: timeit np.dot(a, b)</div><div>100000 loops, best of 3: 2.8 us per loop</div><div><br></div><div>In [10]: a = np.arange(10000.)</div><div><div><br></div><div>In [11]: timeit np.einsum('i->', a)</div>
<div>10000 loops, best of 3: 22.1 us per loop</div><div>In [12]: timeit np.sum(a)</div><div>10000 loops, best of 3: 25.5 us per loop</div></div><div><br></div><div>-Mark</div><div><br></div><div>The documentation:</div><div>
<br></div><div><div> einsum(subscripts, *operands, out=None, dtype=None, order='K', casting='safe')</div><div><br></div><div> Evaluates the Einstein summation convention on the operands.</div><div><br>
</div><div> Using the Einstein summation convention, many common multi-dimensional</div><div> array operations can be represented in a simple fashion. This function</div><div> provides a way to compute such summations.</div>
<div><br></div><div> The best way to understand this function is to try the examples below,</div><div> which show how many common NumPy functions can be implemented as</div><div> calls to einsum.</div><div> </div>
<div> The subscripts string is a comma-separated list of subscript labels,</div><div> where each label refers to a dimension of the corresponding operand.</div><div> Repeated subscript labels in one operand take the diagonal. For example,</div>
<div> ``np.einsum('ii', a)`` is equivalent to ``np.trace(a)``.</div><div> </div><div> Whenever a label is repeated, it is summed, so ``np.einsum('i,i', a, b)``</div><div> is equivalent to ``np.inner(a,b)``. If a label appears only once,</div>
<div> it is not summed, so ``np.einsum('i', a)`` produces a view of ``a``</div><div> with no changes.</div><div><br></div><div> The order of labels in the output is by default alphabetical. This</div><div>
means that ``np.einsum('ij', a)`` doesn't affect a 2D array, while</div><div> ``np.einsum('ji', a)`` takes its transpose.</div><div><br></div><div> The output can be controlled by specifying output subscript labels</div>
<div> as well. This specifies the label order, and allows summing to be</div><div> disallowed or forced when desired. The call ``np.einsum('i->', a)``</div><div> is equivalent to ``np.sum(a, axis=-1)``, and</div>
<div> ``np.einsum('ii->i', a)`` is equivalent to ``np.diag(a)``.</div><div><br></div><div> It is also possible to control how broadcasting occurs using</div><div> an ellipsis. To take the trace along the first and last axes,</div>
<div> you can do ``np.einsum('i...i', a)``, or to do a matrix-matrix</div><div> product with the left-most indices instead of rightmost, you can do</div><div> ``np.einsum('ij...,jk...->ik...', a, b)``.</div>
<div><br></div><div> When there is only one operand, no axes are summed, and no output</div><div> parameter is provided, a view into the operand is returned instead</div><div> of a new array. Thus, taking the diagonal as ``np.einsum('ii->i', a)``</div>
<div> produces a view.</div><div><br></div><div> Parameters</div><div> ----------</div><div> subscripts : string</div><div> Specifies the subscripts for summation.</div><div> operands : list of array_like</div>
<div> These are the arrays for the operation.</div><div> out : None or array</div><div> If provided, the calculation is done into this array.</div><div> dtype : None or data type</div><div> If provided, forces the calculation to use the data type specified.</div>
<div> Note that you may have to also give a more liberal ``casting``</div><div> parameter to allow the conversions.</div><div> order : 'C', 'F', 'A', or 'K'</div><div> Controls the memory layout of the output. 'C' means it should</div>
<div> be Fortran contiguous. 'F' means it should be Fortran contiguous,</div><div> 'A' means it should be 'F' if the inputs are all 'F', 'C' otherwise.</div><div> 'K' means it should be as close to the layout of the inputs as</div>
<div> is possible, including arbitrarily permuted axes.</div><div> casting : 'no', 'equiv', 'safe', 'same_kind', 'unsafe'</div><div> Controls what kind of data casting may occur. Setting this to</div>
<div> 'unsafe' is not recommended, as it can adversely affect accumulations.</div><div> 'no' means the data types should not be cast at all. 'equiv' means</div><div> only byte-order changes are allowed. 'safe' means only casts</div>
<div> which can preserve values are allowed. 'same_kind' means only</div><div> safe casts or casts within a kind, like float64 to float32, are</div><div> allowed. 'unsafe' means any data conversions may be done.</div>
<div><br></div><div> Returns</div><div> -------</div><div> output : ndarray</div><div> The calculation based on the Einstein summation convention.</div><div><br></div><div> See Also</div><div> --------</div>
<div> dot, inner, outer, tensordot</div><div><br></div><div> </div><div> Examples</div><div> --------</div><div><br></div><div> >>> a = np.arange(25).reshape(5,5)</div><div> >>> b = np.arange(5)</div>
<div> >>> c = np.arange(6).reshape(2,3)</div><div><br></div><div> >>> np.einsum('ii', a)</div><div> 60</div><div> >>> np.trace(a)</div><div> 60</div><div><br></div><div> >>> np.einsum('ii->i', a)</div>
<div> array([ 0, 6, 12, 18, 24])</div><div> >>> np.diag(a)</div><div> array([ 0, 6, 12, 18, 24])</div><div><br></div><div> >>> np.einsum('ij,j', a, b)</div><div> array([ 30, 80, 130, 180, 230])</div>
<div> >>> np.dot(a, b)</div><div> array([ 30, 80, 130, 180, 230])</div><div><br></div><div> >>> np.einsum('ji', c)</div><div> array([[0, 3],</div><div> [1, 4],</div><div> [2, 5]])</div>
<div> >>> c.T</div><div> array([[0, 3],</div><div> [1, 4],</div><div> [2, 5]])</div><div><br></div><div> >>> np.einsum(',', 3, c)</div><div> array([[ 0, 3, 6],</div>
<div> [ 9, 12, 15]])</div><div> >>> np.multiply(3, c)</div><div> array([[ 0, 3, 6],</div><div> [ 9, 12, 15]])</div><div><br></div><div> >>> np.einsum('i,i', b, b)</div>
<div> 30</div><div> >>> np.inner(b,b)</div><div> 30</div><div><br></div><div> >>> np.einsum('i,j', np.arange(2)+1, b)</div><div> array([[0, 1, 2, 3, 4],</div><div> [0, 2, 4, 6, 8]])</div>
<div> >>> np.outer(np.arange(2)+1, b)</div><div> array([[0, 1, 2, 3, 4],</div><div> [0, 2, 4, 6, 8]])</div><div><br></div><div> >>> np.einsum('i...->', a)</div><div> array([50, 55, 60, 65, 70])</div>
<div> >>> np.sum(a, axis=0)</div><div> array([50, 55, 60, 65, 70])</div><div><br></div><div> >>> a = np.arange(60.).reshape(3,4,5)</div><div> >>> b = np.arange(24.).reshape(4,3,2)</div>
<div> >>> np.einsum('ijk,jil->kl', a, b)</div><div> array([[ 4400., 4730.],</div><div> [ 4532., 4874.],</div><div> [ 4664., 5018.],</div><div> [ 4796., 5162.],</div>
<div> [ 4928., 5306.]])</div><div> >>> np.tensordot(a,b, axes=([1,0],[0,1]))</div><div> array([[ 4400., 4730.],</div><div> [ 4532., 4874.],</div><div> [ 4664., 5018.],</div>
<div> [ 4796., 5162.],</div><div> [ 4928., 5306.]])</div></div><div><br></div></div>
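The view behavior described above can be checked directly; a minimal sketch (np.shares_memory is assumed to be available in your NumPy version):

```python
import numpy as np

a = np.arange(25).reshape(5, 5)
d = np.einsum('ii->i', a)  # one operand, no summation, no out= given

# No data was copied: d is a view into a's memory, like np.diagonal.
assert np.shares_memory(d, a)
```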
<br>_______________________________________________<br>
NumPy-Discussion mailing list<br>
<a href="mailto:NumPy-Discussion@scipy.org" target="_blank">NumPy-Discussion@scipy.org</a><br>
<a href="http://mail.scipy.org/mailman/listinfo/numpy-discussion" target="_blank">http://mail.scipy.org/mailman/listinfo/numpy-discussion</a><br>
<br></blockquote></div><br><br clear="all"><br>-- <br>Jonathan Rocher,<br><font color="#888888"><span>Enthought</span>, Inc.<br>
<a href="mailto:jrocher@enthought.com" target="_blank">jrocher@<span>enthought</span>.com</a><br>
1-512-536-1057<br>
<a href="http://www.enthought.com/" target="_blank">http://www.<span>enthought</span>.com</a></font><br><div style="display: inline;"></div><br>