> Hi,
> I often need to compute the equivalent of
> np.diag(np.dot(A, B)).
> Computing np.dot(A, B) is highly inefficient if you only need the diagonal
> entries. Two more efficient ways of computing the same thing are
> np.sum(A * B.T, axis=1)
> and
> np.einsum("ij,ji->i", A, B).
> The first can allocate quite a lot of temporary memory.
> The second can be quite cryptic for someone not familiar with einsum.
> I assume that einsum does not compute np.dot(A, B), but I haven't verified.
> Since this is quite a recurrent pattern, I was wondering if it would be
> worth adding a dedicated function to NumPy and SciPy's sparse module. A
> possible name would be "diagdot". The best performance would be obtained
> when A is C-style and B fortran-style.

Does your implementation use BLAS, or is it just a wrapper around einsum?
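
For reference, here is a minimal sketch comparing the three formulations
quoted above on small random matrices. The "diagdot" function below is
hypothetical: it only borrows the proposed name and is a thin einsum
wrapper, not an existing NumPy or SciPy function. The shapes and the
random seed are arbitrary assumptions for illustration.

```python
import numpy as np

def diagdot(A, B):
    """Hypothetical helper: diagonal of A @ B without forming the full product.

    This is just a thin wrapper around einsum, not an existing NumPy function.
    """
    return np.einsum("ij,ji->i", A, B)

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))  # shape (n, k)
B = rng.standard_normal((3, 4))  # shape (k, n)

# Reference: build the full (n, n) product, then keep only its diagonal.
d_full = np.diag(np.dot(A, B))

# Elementwise product with B transposed, summed over k.
# A * B.T materializes a temporary (n, k) array.
d_sum = np.sum(A * B.T, axis=1)

# einsum-based version via the hypothetical wrapper.
d_einsum = diagdot(A, B)

# All three should agree up to floating-point rounding.
print(np.allclose(d_full, d_sum), np.allclose(d_full, d_einsum))
```

Timing these on realistically sized inputs (e.g. with timeit) would show
how much is actually gained over the full np.dot.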

