I wrote a new function, einsum, which implements Einstein summation notation, and I'd like comments/thoughts from people who might be interested in this kind of thing. In testing it, it is also faster than many of NumPy's built-in functions, except for dot and inner. At the bottom of this email you can find the documentation blurb I wrote for it, and here are some timings: In [1]: import numpy as np In [2]: a = np.arange(25).reshape(5,5) In [3]: timeit np.einsum('ii', a) 100000 loops, best of 3: 3.45 us per loop In [4]: timeit np.trace(a) 100000 loops, best of 3: 9.8 us per loop In [5]: timeit np.einsum('ii->i', a) 1000000 loops, best of 3: 1.19 us per loop In [6]: timeit np.diag(a) 100000 loops, best of 3: 7 us per loop In [7]: b = np.arange(30).reshape(5,6) In [8]: timeit np.einsum('ij,jk', a, b) 10000 loops, best of 3: 11.4 us per loop In [9]: timeit np.dot(a, b) 100000 loops, best of 3: 2.8 us per loop In [10]: a = np.arange(10000.) In [11]: timeit np.einsum('i->', a) 10000 loops, best of 3: 22.1 us per loop In [12]: timeit np.sum(a) 10000 loops, best of 3: 25.5 us per loop -Mark The documentation: einsum(subscripts, *operands, out=None, dtype=None, order='K', casting='safe') Evaluates the Einstein summation convention on the operands. Using the Einstein summation convention, many common multi-dimensional array operations can be represented in a simple fashion. This function provides a way compute such summations. The best way to understand this function is to try the examples below, which show how many common NumPy functions can be implemented as calls to einsum. The subscripts string is a comma-separated list of subscript labels, where each label refers to a dimension of the corresponding operand. Repeated subscripts labels in one operand take the diagonal. For example, ``np.einsum('ii', a)`` is equivalent to ``np.trace(a)``. Whenever a label is repeated, it is summed, so ``np.einsum('i,i', a, b)`` is equivalent to ``np.inner(a,b)``. If a label appears only once, it is not summed, so ``np.einsum('i', a)`` produces a view of ``a`` with no changes. The order of labels in the output is by default alphabetical. This means that ``np.einsum('ij', a)`` doesn't affect a 2D array, while ``np.einsum('ji', a)`` takes its transpose. The output can be controlled by specifying output subscript labels as well. This specifies the label order, and allows summing to be disallowed or forced when desired. The call ``np.einsum('i->', a)`` is equivalent to ``np.sum(a, axis=-1)``, and ``np.einsum('ii->i', a)`` is equivalent to ``np.diag(a)``. It is also possible to control how broadcasting occurs using an ellipsis. To take the trace along the first and last axes, you can do ``np.einsum('i...i', a)``, or to do a matrix-matrix product with the left-most indices instead of rightmost, you can do ``np.einsum('ij...,jk...->ik...', a, b)``. When there is only one operand, no axes are summed, and no output parameter is provided, a view into the operand is returned instead of a new array. Thus, taking the diagonal as ``np.einsum('ii->i', a)`` produces a view. Parameters ---------- subscripts : string Specifies the subscripts for summation. operands : list of array_like These are the arrays for the operation. out : None or array If provided, the calculation is done into this array. dtype : None or data type If provided, forces the calculation to use the data type specified. Note that you may have to also give a more liberal ``casting`` parameter to allow the conversions. order : 'C', 'F', 'A', or 'K' Controls the memory layout of the output. 'C' means it should be Fortran contiguous. 'F' means it should be Fortran contiguous, 'A' means it should be 'F' if the inputs are all 'F', 'C' otherwise. 'K' means it should be as close to the layout as the inputs as is possible, including arbitrarily permuted axes. casting : 'no', 'equiv', 'safe', 'same_kind', 'unsafe' Controls what kind of data casting may occur. Setting this to 'unsafe' is not recommended, as it can adversely affect accumulations. 'no' means the data types should not be cast at all. 'equiv' means only byte-order changes are allowed. 'safe' means only casts which can preserve values are allowed. 'same_kind' means only safe casts or casts within a kind, like float64 to float32, are allowed. 'unsafe' means any data conversions may be done. Returns ------- output : ndarray The calculation based on the Einstein summation convention. See Also -------- dot, inner, outer, tensordot Examples -------- >>> a = np.arange(25).reshape(5,5) >>> b = np.arange(5) >>> c = np.arange(6).reshape(2,3) >>> np.einsum('ii', a) 60 >>> np.trace(a) 60 >>> np.einsum('ii->i', a) array([ 0, 6, 12, 18, 24]) >>> np.diag(a) array([ 0, 6, 12, 18, 24]) >>> np.einsum('ij,j', a, b) array([ 30, 80, 130, 180, 230]) >>> np.dot(a, b) array([ 30, 80, 130, 180, 230]) >>> np.einsum('ji', c) array([[0, 3], [1, 4], [2, 5]]) >>> c.T array([[0, 3], [1, 4], [2, 5]]) >>> np.einsum(',', 3, c) array([[ 0, 3, 6], [ 9, 12, 15]]) >>> np.multiply(3, c) array([[ 0, 3, 6], [ 9, 12, 15]]) >>> np.einsum('i,i', b, b) 30 >>> np.inner(b,b) 30 >>> np.einsum('i,j', np.arange(2)+1, b) array([[0, 1, 2, 3, 4], [0, 2, 4, 6, 8]]) >>> np.outer(np.arange(2)+1, b) array([[0, 1, 2, 3, 4], [0, 2, 4, 6, 8]]) >>> np.einsum('i...->', a) array([50, 55, 60, 65, 70]) >>> np.sum(a, axis=0) array([50, 55, 60, 65, 70]) >>> a = np.arange(60.).reshape(3,4,5) >>> b = np.arange(24.).reshape(4,3,2) >>> np.einsum('ijk,jil->kl', a, b) array([[ 4400., 4730.], [ 4532., 4874.], [ 4664., 5018.], [ 4796., 5162.], [ 4928., 5306.]]) >>> np.tensordot(a,b, axes=([1,0],[0,1])) array([[ 4400., 4730.], [ 4532., 4874.], [ 4664., 5018.], [ 4796., 5162.], [ 4928., 5306.]])