[Numpy-discussion] Proposed new feature for numpy.einsum: repeated output subscripts as diagonal

Wed Aug 20 09:26:09 EDT 2014

Thanks all for the feedback!

So there appears to be interest for this feature, and I think that I can 
implement it. However, it may take a while before I do so: I have other 
priorities right now.

In view of jaimefrio's comment on 
https://github.com/numpy/numpy/issues/4965 as well as Eelco 
Hoogendoorn's reply above, here is how I currently intend to implement 
the feature.

1. Implement a `diag_view` function that uses strides to make a view. 
The function would use subscripts in a way very similar to `einsum`, 
except that no commas are allowed and all indices appearing on one side 
of `->` must also appear on the other side. Like the current `einsum`, 
indices on the right-hand side of `->` cannot be repeated. For example, 
`B=diag_view('iij->ij',A)` returns a 2D view `B` of the 3D array `A` 
where the off-diagonal elements in the first two dimensions of `A` are 
inaccessible in `B`.

2. The edits to `einsum` itself should be minimal. For the purpose of 
the following, suppose that the indices have the form `lhs+'->'+rhs`, 
where `lhs` and `rhs` are character strings. To make sure that the 
current behavior of `einsum` is not slowed down nor broken by the new 
functionality, I intend to limit edits to the point where an error would 
be raised due to repeated indices in `rhs`. The following outlines what 
would replace the current error-raising.

     2.1 Extract from `rhs` the first occurrences of each indices; call 
that `rhs_first_oc`.

     2.2 If no `out` has been provided to `einsum`, allocate a zeroed 
out `ndarray` of appropriate size, including off-diagonal entries; call 
that `full_out`. If an `out` was provided to `einsum`, set `full_out=out`.

     2.3 Set `diag_out=diag_view(rhs+'->'+rhs_first_oc,full_out)`.

     2.4 Call `einsum(lhs+'->'+rhs_first_oc, [...], out=diag_out)`. This 
call is recursive, but the recursion should stop there.

     2.5 Return `full_out`.

Note that if an `out` is provided to `einsum`, the off-diagonal entries 
are not zeroed out. This should be a documented "feature" of `einsum`.

A disadvantage of this approach is that the indices are parsed 2-4 
times, depending how you count. However, for large `ndarray`, the 
bottleneck won't be there anyway.

Thanks again!

Pierre-André Noël

On 08/15/2014 03:09 PM, Eelco Hoogendoorn wrote:
> here is a snippet I extracted from a project with similar aims 
> (integrating the functionality of einsum and numexpr, actually)
>
> Not much to it, but in case someone needs a reminder on how to use 
> striding tricks:
> http://pastebin.com/kQNySjcj
>
>
> On Fri, Aug 15, 2014 at 5:20 PM, Eelco Hoogendoorn 
> <hoogendoorn.eelco at gmail.com <mailto:hoogendoorn.eelco at gmail.com>> wrote:
>
>     Well, there is the numpy-API C level, and then there is the arcane
>     macro C level. The two might as well be a completely different
>     language.
>
>     Indeed, it should be doing something similar for the inputs.
>     Actually, I think I wrote a wrapper around einsum/numexpr once
>     that performed this generalized indexing once... ill see if I can
>     dig that up.
>
>
>     On Fri, Aug 15, 2014 at 5:01 PM, Sebastian Berg
>     <sebastian at sipsolutions.net <mailto:sebastian at sipsolutions.net>>
>     wrote:
>
>         On Fr, 2014-08-15 at 16:42 +0200, Eelco Hoogendoorn wrote:
>         > Agreed; this addition occurred to me as well. Note that the
>         > implemenatation should be straightforward: just allocate an
>         enlarged
>         > array, use some striding logic to construct the relevant
>         view, and let
>         > einsums internals act on the view. hopefully, you wont even
>         have to
>         > touch the guts of einsum at the C level, because id say that
>         isn't for
>         > the faint of heart...
>         >
>
>         I am not sure if einsum isn't pure C :). But even if, it
>         should be doing
>         something identical already for duplicate indices on the inputs...
>
>         - Sebastian
>
>         >
>         > On Fri, Aug 15, 2014 at 3:53 PM, Sebastian Berg
>         > <sebastian at sipsolutions.net
>         <mailto:sebastian at sipsolutions.net>> wrote:
>         >         On Do, 2014-08-14 at 12:42 -0700, Stephan Hoyer wrote:
>         >         > I think this would be very nice addition.
>         >         >
>         >         >
>         >         > On Thu, Aug 14, 2014 at 12:21 PM, Benjamin Root
>         >         <ben.root at ou.edu <mailto:ben.root at ou.edu>>
>         >         > wrote:
>         >         >         You had me at Kronecker delta... :-)  +1
>         >         >
>         >
>         >
>         >         Sounds good to me. I don't see a reason for not
>         relaxing the
>         >         restriction, unless there is some technical issue,
>         but I doubt
>         >         that.
>         >
>         >         - Sebastian
>         >
>         >         >
>         >         >
>         >         >         On Thu, Aug 14, 2014 at 3:07 PM,
>         Pierre-Andre Noel
>         >         >         <noel.pierre.andre at gmail.com
>         <mailto:noel.pierre.andre at gmail.com>> wrote:
>         >         >                 (I created issue 4965 earlier
>         today on this
>         >         topic, and
>         >         >                 I have been
>         >         >                 advised to email to this mailing
>         list to
>         >         discuss
>         >         >                 whether it is a good
>         >         >                 idea or not. I include my original
>         post
>         >         as-is,
>         >         >                 followed by additional
>         >         >                 comments.)
>         >         >
>         >         >                 I think that the following new
>         feature would
>         >         make
>         >         >  `numpy.einsum` even
>         >         >                 more powerful/useful/awesome than
>         it already
>         >         is.
>         >         >                 Moreover, the change
>         >         >                 should not interfere with existing
>         code, it
>         >         would
>         >         >                 preserve the
>         >         >  "minimalistic" spirit of `numpy.einsum`, and
>         >         the new
>         >         >  functionality would
>         >         >                 integrate in a seamless/intuitive
>         manner for
>         >         the
>         >         >                 users.
>         >         >
>         >         >                 In short, the new feature would
>         allow for
>         >         repeated
>         >         >                 subscripts to appear
>         >         >                 in the "output" part of the
>         `subscripts`
>         >         parameter
>         >         >                 (i.e., on the
>         >         >                 right-hand side of `->`). The
>         corresponding
>         >         dimensions
>         >         >                 in the resulting
>         >         >                 `ndarray` would only be filled
>         along their
>         >         diagonal,
>         >         >                 leaving the off
>         >         >                 diagonal entries to the default
>         value for
>         >         this `dtype`
>         >         >                 (typically zero).
>         >         >                 Note that the current behavior is
>         to raise
>         >         an
>         >         >                 exception when repeated
>         >         >                 output subscripts are being used.
>         >         >
>         >         >                 This is simplest to describe with
>         an example
>         >         involving
>         >         >                 the dual behavior
>         >         >                 of `numpy.diag`.
>         >         >
>         >         >                 ```python
>         >         >                 # Extracting the diagonal of a 2-D
>         array.
>         >         >                 A = arange(16).reshape(4,4)
>         >         >  print(diag(A)) # Output: [ 0 5 10 15 ]
>         >         >  print(einsum('ii->i', A)) # Same as previous
>         >         line
>         >         >                 (current behavior).
>         >         >
>         >         >                 # Constructing a diagonal 2-D array.
>         >         >                 v = arange(4)
>         >         >  print(diag(v)) # Output: [[0 0 0 0] [0 1 0
>         >         0] [0 0 2
>         >         >                 0] [0 0 0 3]]
>         >         >  print(einsum('i->ii', v)) # New behavior
>         >         would be same
>         >         >                 as previous line.
>         >         >                 # The current behavior of the
>         previous line
>         >         is to
>         >         >                 raise an exception.
>         >         >                 ```
>         >         >
>         >         >                 By opposition to `numpy.diag`, the
>         approach
>         >         >                 generalizes to higher
>         >         >                 dimensions: `einsum('iii->i', A)`
>         extracts
>         >         the
>         >         >                 diagonal of a 3-D array,
>         >         >                 and `einsum('i->iii', v)` would
>         build a
>         >         diagonal 3-D
>         >         >                 array.
>         >         >
>         >         >                 The proposed behavior really
>         starts to shine
>         >         in more
>         >         >                 intricate cases.
>         >         >
>         >         >                 ```python
>         >         >                 # Dummy values, these should be
>         >         probabilities to make
>         >         >                 sense below.
>         >         >                 P_w_ab = arange(24).reshape(3,2,4)
>         >         >                 P_y_wxab =
>         arange(144).reshape(3,3,2,2,4)
>         >         >
>         >         >                 # With the proposed behavior, the
>         following
>         >         two lines
>         >         >                 should be equivalent.
>         >         >                 P_xyz_ab =
>         einsum('wab,xa,ywxab,zy->xyzab',
>         >         P_w_ab,
>         >         >                 eye(2), P_y_wxab,
>         >         >                 eye(3))
>         >         >  also_P_xyz_ab = einsum('wab,ywaab->ayyab',
>         >         P_w_ab,
>         >         >                 P_y_wxab)
>         >         >                 ```
>         >         >
>         >         >                 If this is not convincing enough,
>         replace
>         >         `eye(2)` by
>         >         >  `eye(P_w_ab.shape[1])` and replace `eye(3)`
>         >         by
>         >         >  `eye(P_y_wxab.shape[0])`,
>         >         >                 then imagine more dimensions and
>         repeated
>         >         indices...
>         >         >                 The new notation
>         >         >                 would allow for crisper codes and
>         reduce the
>         >         >  opportunities for dumb
>         >         >                 mistakes.
>         >         >
>         >         >                 For those who wonder, the above
>         computation
>         >         amounts to
>         >         >  $P(X=x,Y=y,Z=z|A=a,B=b) = \sum_w P(W=w|
>         >         A=a,B=b) P(X=x|
>         >         >                 A=a)
>         >         >  P(Y=y|W=w,X=x,A=a,B=b) P(Z=z|Y=y)$ with
>         >         $P(X=x|A=a)=
>         >         >                 \delta_{xa}$ and
>         >         >  $P(Z=z|Y=y)=\delta_{zy}$ (using LaTeX
>         >         notation, and
>         >         >  $\delta_{ij}$ is
>         >         >                 [Kronecker's
>         >         >
>         >         
>         delta](http://en.wikipedia.org/wiki/Kronecker_delta)
>         <http://en.wikipedia.org/wiki/Kronecker_delta%29>).
>         >         >
>         >         >                 (End of original post.)
>         >         >
>         >         >                 I have been told by @jaimefrio
>         that "The
>         >         best way of
>         >         >                 getting a new
>         >         >                 feature into numpy is putting it in
>         >         yourself." Hence,
>         >         >                 if discussions
>         >         >                 here do reveal that this is a good
>         idea,
>         >         then I may
>         >         >                 give a try at coding
>         >         >                 it myself. However, I currently
>         know nothing
>         >         of the
>         >         >                 inner workings of
>         >         >  numpy/ndarray/einsum, and I have higher
>         >         priorities
>         >         >                 right now. This means
>         >         >                 that it could take a long while
>         before I
>         >         contribute
>         >         >                 any code, if I ever
>         >         >                 do. Hence, if anyone feels like
>         doing it,
>         >         feel free to
>         >         >                 do so!
>         >         >
>         >         >                 Also, I am aware that storing a
>         lot of zeros
>         >         in an
>         >         >                 `ndarray` may not, a
>         >         >                 priori, be a desirable avenue.
>         However,
>         >         there are
>         >         >                 times where you have
>         >         >                 to do it: think of `numpy.eye` as an
>         >         example. In my
>         >         >                 case of application,
>         >         >                 I use such diagonal structures in the
>         >         initialization
>         >         >                 of an `ndarray`
>         >         >                 which is later updated through an
>         iterative
>         >         process.
>         >         >                 After these
>         >         >                 iterations, most of the zeros will
>         be gone.
>         >         Do other
>         >         >                 people see a use
>         >         >                 for such capabilities?
>         >         >
>         >         >                 Thank you for your time and have a
>         nice day.
>         >         >
>         >         >                 Sincerely,
>         >         >
>         >         >                 Pierre-André Noël
>         >         >
>         > _______________________________________________
>         >         >  NumPy-Discussion mailing list
>         >         > NumPy-Discussion at scipy.org
>         <mailto:NumPy-Discussion at scipy.org>
>         >         >
>         > http://mail.scipy.org/mailman/listinfo/numpy-discussion
>         >         >
>         >         >
>         >         >
>         >         >  _______________________________________________
>         >         >         NumPy-Discussion mailing list
>         >         > NumPy-Discussion at scipy.org
>         <mailto:NumPy-Discussion at scipy.org>
>         >         >
>         > http://mail.scipy.org/mailman/listinfo/numpy-discussion
>         >         >
>         >         >
>         >         >
>         >         > _______________________________________________
>         >         > NumPy-Discussion mailing list
>         >         > NumPy-Discussion at scipy.org
>         <mailto:NumPy-Discussion at scipy.org>
>         >         >
>         http://mail.scipy.org/mailman/listinfo/numpy-discussion
>         >
>         >
>         >  _______________________________________________
>         >         NumPy-Discussion mailing list
>         > NumPy-Discussion at scipy.org <mailto:NumPy-Discussion at scipy.org>
>         > http://mail.scipy.org/mailman/listinfo/numpy-discussion
>         >
>         >
>         >
>         > _______________________________________________
>         > NumPy-Discussion mailing list
>         > NumPy-Discussion at scipy.org <mailto:NumPy-Discussion at scipy.org>
>         > http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
>         _______________________________________________
>         NumPy-Discussion mailing list
>         NumPy-Discussion at scipy.org <mailto:NumPy-Discussion at scipy.org>
>         http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20140820/931891a9/attachment.html>