[Numpy-discussion] Proposed new feature for numpy.einsum: repeated output subscripts as diagonal

Pierre-Andre Noel noel.pierre.andre at gmail.com
Thu Aug 14 15:07:38 EDT 2014

(I created issue 4965 earlier today on this topic, and I have been 
advised to email to this mailing list to discuss whether it is a good 
idea or not. I include my original post as-is, followed by additional 

I think that the following new feature would make `numpy.einsum` even 
more powerful/useful/awesome than it already is. Moreover, the change 
should not interfere with existing code, it would preserve the 
"minimalistic" spirit of `numpy.einsum`, and the new functionality would 
integrate in a seamless/intuitive manner for the users.

In short, the new feature would allow for repeated subscripts to appear 
in the "output" part of the `subscripts` parameter (i.e., on the 
right-hand side of `->`). The corresponding dimensions in the resulting 
`ndarray` would only be filled along their diagonal, leaving the off 
diagonal entries to the default value for this `dtype` (typically zero). 
Note that the current behavior is to raise an exception when repeated 
output subscripts are being used.

This is simplest to describe with an example involving the dual behavior 
of `numpy.diag`.

# Extracting the diagonal of a 2-D array.
A = arange(16).reshape(4,4)
print(diag(A)) # Output: [ 0 5 10 15 ]
print(einsum('ii->i', A)) # Same as previous line (current behavior).

# Constructing a diagonal 2-D array.
v = arange(4)
print(diag(v)) # Output: [[0 0 0 0] [0 1 0 0] [0 0 2 0] [0 0 0 3]]
print(einsum('i->ii', v)) # New behavior would be same as previous line.
# The current behavior of the previous line is to raise an exception.

By opposition to `numpy.diag`, the approach generalizes to higher 
dimensions: `einsum('iii->i', A)` extracts the diagonal of a 3-D array, 
and `einsum('i->iii', v)` would build a diagonal 3-D array.

The proposed behavior really starts to shine in more intricate cases.

# Dummy values, these should be probabilities to make sense below.
P_w_ab = arange(24).reshape(3,2,4)
P_y_wxab = arange(144).reshape(3,3,2,2,4)

# With the proposed behavior, the following two lines should be equivalent.
P_xyz_ab = einsum('wab,xa,ywxab,zy->xyzab', P_w_ab, eye(2), P_y_wxab, 
also_P_xyz_ab = einsum('wab,ywaab->ayyab', P_w_ab, P_y_wxab)

If this is not convincing enough, replace `eye(2)` by 
`eye(P_w_ab.shape[1])` and replace `eye(3)` by `eye(P_y_wxab.shape[0])`, 
then imagine more dimensions and repeated indices... The new notation 
would allow for crisper codes and reduce the opportunities for dumb 

For those who wonder, the above computation amounts to 
$P(X=x,Y=y,Z=z|A=a,B=b) = \sum_w P(W=w|A=a,B=b) P(X=x|A=a) 
P(Y=y|W=w,X=x,A=a,B=b) P(Z=z|Y=y)$ with $P(X=x|A=a)=\delta_{xa}$ and 
$P(Z=z|Y=y)=\delta_{zy}$ (using LaTeX notation, and $\delta_{ij}$ is 
[Kronecker's delta](http://en.wikipedia.org/wiki/Kronecker_delta)).

(End of original post.)

I have been told by @jaimefrio that "The best way of getting a new 
feature into numpy is putting it in yourself." Hence, if discussions 
here do reveal that this is a good idea, then I may give a try at coding 
it myself. However, I currently know nothing of the inner workings of 
numpy/ndarray/einsum, and I have higher priorities right now. This means 
that it could take a long while before I contribute any code, if I ever 
do. Hence, if anyone feels like doing it, feel free to do so!

Also, I am aware that storing a lot of zeros in an `ndarray` may not, a 
priori, be a desirable avenue. However, there are times where you have 
to do it: think of `numpy.eye` as an example. In my case of application, 
I use such diagonal structures in the initialization of an `ndarray` 
which is later updated through an iterative process. After these 
iterations, most of the zeros will be gone. Do other people see a use 
for such capabilities?

Thank you for your time and have a nice day.


Pierre-André Noël

More information about the NumPy-Discussion mailing list