[Numpy-discussion] Consider improving numpy.outer's behavior with zero-dimensional vectors

Neil Girdhar mistersheik at gmail.com
Fri Apr 17 12:16:01 EDT 2015


On Fri, Apr 17, 2015 at 12:09 PM, <josef.pktd at gmail.com> wrote:

> On Fri, Apr 17, 2015 at 11:22 AM, Neil Girdhar <mistersheik at gmail.com>
> wrote:
> >
> >
> > On Fri, Apr 17, 2015 at 10:47 AM, <josef.pktd at gmail.com> wrote:
> >>
> >> On Fri, Apr 17, 2015 at 10:07 AM, Sebastian Berg
> >> <sebastian at sipsolutions.net> wrote:
> >> > On Do, 2015-04-16 at 15:28 -0700, Matthew Brett wrote:
> >> >> Hi,
> >> >>
> >> > <snip>
> >> >>
> >> >> So, how about a slight modification of your proposal?
> >> >>
> >> >> 1) Raise a deprecation warning for np.outer on non-1-D arrays for a few
> >> >> versions, with deprecation in favor of np.multiply.outer, then
> >> >> 2) Raise an error for np.outer on non-1-D arrays
> >> >>
> >> >
> >> > I think that was Neil's proposal a bit earlier, too. +1 for it in any
> >> > case, since at least for the moment I doubt outer is used a lot for
> >> > non-1-d arrays. Possible step 3) make it work on higher dims after a
> >> > long period.
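
For concreteness, step 1 above could be a thin wrapper along these lines
(just a sketch of the idea, not an actual patch):

import warnings
import numpy as np

def outer(a, b):
    # Sketch of the proposed step 1: warn when either input is not
    # 1-D, since the implicit flattening would eventually become an
    # error in step 2.
    a = np.asarray(a)
    b = np.asarray(b)
    if a.ndim != 1 or b.ndim != 1:
        warnings.warn("np.outer on non-1-D inputs is deprecated; "
                      "use np.multiply.outer instead",
                      DeprecationWarning, stacklevel=2)
    # Current behavior: flatten both inputs first.
    return a.ravel()[:, np.newaxis] * b.ravel()[np.newaxis, :]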
> >>
> >> sounds ok to me
> >>
> >> Some random comments of what I remember or guess in terms of usage
> >>
> >> I think there are at most very few np.outer usages with 2d or higher
> >> dimension.
> >> (statsmodels has two models that switch between 2d and 1d
> >> parameterization where we don't use outer but it has similar
> >> characteristics. However, we need to control the ravel order, which
> >> IIRC is Fortran)
> >>
> >> The current behavior with 0-d scalars in the initial post might be
> >> useful if a numpy function returns a scalar instead of a 1-D array
> >> when the size is 1. np.diag, which is a common case, doesn't return
> >> a scalar (in my version of numpy).
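
(To illustrate the current flattening behavior: np.outer(3.0,
np.array([1.0, 2.0])) returns array([[3., 6.]]), because the 0-d
input is raveled to shape (1,).)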
> >>
> >> I don't know any use case where I would ever want to have the 2d
> >> behavior of np.multiply.outer.
> >
>
> I only understand part of your example, but it looks similar to what
> we are doing in statsmodels.
>
> >
> > My use case is pretty simple.  Given an input vector x, a weight
> > matrix W, and a model y = Wx, I calculate the gradient of the loss L
> > with respect to W.  It is the outer product of x with the vector of
> > gradients dL/dy.  So the code is simply:
> >
> > W -= outer(x, dL_by_dy)
>
> if you sum/subtract over all the values, isn't this the same as
> np.dot(x, dL_by_dy)
>
>
What?  Matrix subtraction is element-wise:

In [1]: x = np.array([2,3,4])

In [2]: dL_by_dy = np.array([7,9])

In [5]: W = np.zeros((3, 2))

In [6]: W -= np.outer(x, dL_by_dy)

In [7]: W
Out[7]:
array([[-14., -18.],
       [-21., -27.],
       [-28., -36.]])
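
(For 1-D inputs np.multiply.outer gives the identical result, so the
same update could equally be written as W -= np.multiply.outer(x, dL_by_dy).)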

>
> > Sometimes, I have some x_indices and y_indices.  Now I want to do:
> >
> > W[x_indices, y_indices] -= outer(x[x_indices], dL_by_dy[y_indices])
> >
> > Unfortunately, if x_indices or y_indices are "int" or slice in some
> > way that removes a dimension, the left side will have fewer dimensions
> > than the right.  np.multiply.outer does the right thing without the
> > ugly cases:
> >
> > if isinstance(x_indices, int): … # ugly hacks follow.
>
> My usual hacks are either to use np.atleast_1d or np.squeeze if there
> is a shape mismatch in some cases.
>

Yes, but in this case, the left side is the problem, which has too few
dimensions.  So atleast_1d doesn't work.  I was conditionally squeezing,
but that is extremely ugly.  Especially if you're conditionally squeezing
based on both x_indices and y_indices.
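
To make the mismatch concrete, here is a minimal sketch (toy shapes and
hypothetical index values):

import numpy as np

W = np.zeros((3, 2))
x = np.array([2.0, 3.0, 4.0])
dL_by_dy = np.array([7.0, 9.0])

# With a slice, both sides are 2-D and np.outer works:
W[0:2, :] -= np.outer(x[0:2], dL_by_dy)

# With an int index the left side is 1-D, but np.outer always returns
# a 2-D array, here of shape (1, 2), so this would raise a ValueError:
#   W[0, :] -= np.outer(x[0], dL_by_dy)

# np.multiply.outer returns shape x_part.shape + y_part.shape, which
# matches the left side in both cases:
W[0, :] -= np.multiply.outer(x[0], dL_by_dy)  # shape () + (2,) == (2,)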


>
> >
> >> I guess we will or would have applications for outer along an axis,
> >> for example if x.shape = (100, 10), then we have
> >> x[:,None, :] * x[:, :, None]     (I guess)
> >> Something like this shows up reasonably often in econometrics as
> >> "Outer Product". However in most cases we can avoid constructing this
> >> matrix and get the final results in a more memory efficient or faster
> >> way.
> >> (example an array of covariance matrices)
> >
> >
> > Not sure I see this.  outer(a, b) should return something that has shape:
> > (a.shape + b.shape).  If you're doing it "along an axis", you mean you're
> > reshuffling the resulting shape vector?
>
> No I'm not reshaping the full tensor product.
>
> It's a vectorized version of looping over independent outer products
>
> np.array([outer(xi, yi) for xi,yi in zip(x, y)])
> (which I would never use with outer)
>
> but I have code that works similar for a reduce (or reduce_at) loop over
> this.
>
> Josef
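
(For reference, the loop and the broadcasting version agree: with
x.shape == (100, 10) and y.shape == (100, 10), both

np.array([np.outer(xi, yi) for xi, yi in zip(x, y)])
x[:, :, None] * y[:, None, :]

produce the same (100, 10, 10) array of row-wise outer products.)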
>
>
> >>
> >>
> >> Josef
> >>
> >> >
> >> > - Sebastian
> >> >
> >> >
> >> >> Best,
> >> >>
> >> >> Matthew