[Numpy-discussion] Is this a bug?

Jaime Fernández del Río jaime.frio at gmail.com
Wed Sep 17 01:51:10 EDT 2014


On Tue, Sep 16, 2014 at 4:32 PM, Nathaniel Smith <njs at pobox.com> wrote:

> On Tue, Sep 16, 2014 at 6:56 PM, Jaime Fernández del Río
> <jaime.frio at gmail.com> wrote:
> > On Tue, Sep 16, 2014 at 3:26 PM, Charles R Harris
> > <charlesr.harris at gmail.com> wrote:
> >>
> >> On Tue, Sep 16, 2014 at 2:51 PM, Nathaniel Smith <njs at pobox.com> wrote:
> >>>
> >>> On Tue, Sep 16, 2014 at 4:31 PM, Jaime Fernández del Río
> >>> <jaime.frio at gmail.com> wrote:
> >>> > If it is a bug, it is an extended one, because it is the same
> >>> > behavior of einsum:
> >>> >
> >>> >>>> np.einsum('i,i', [1,1,1], [1])
> >>> > 3
> >>> >>>> np.einsum('i,i', [1,1,1], [1,1])
> >>> > Traceback (most recent call last):
> >>> >   File "<stdin>", line 1, in <module>
> >>> > ValueError: operands could not be broadcast together with remapped
> >>> > shapes [original->remapped]: (3,)->(3,) (2,)->(2,)
> >>> >
> >>> > And I think it is a conscious design decision; there is a comment
> >>> > about broadcasting missing core dimensions here:
> >>> >
> >>> > https://github.com/numpy/numpy/blob/master/numpy/core/src/umath/ufunc_object.c#L1940
> >>>
> >>> "intentional" and "sensible" are not always the same thing :-). That
> >>> said, it isn't totally obvious to me what the correct behaviour for
> >>> einsum is in this case.
> >>>
> >>> > and the code makes it very explicit that input argument dimensions
> >>> > with the same label are broadcast to a common shape; see here:
> >>> >
> >>> > https://github.com/numpy/numpy/blob/master/numpy/core/src/umath/ufunc_object.c#L1956
> >>> >
> >>> > I kind of expect numpy to broadcast whenever possible, so this
> >>> > doesn't feel wrong to me.
> >>>
> >>> The case Chuck is talking about is like if we allowed matrix
> >>> multiplication between an array with shape (n, 1) and an array with
> >>> shape (k, m), because (n, 1) can be broadcast to (n, k). This feels
> >>> VERY wrong to me, will certainly hide many bugs, and is definitely not
> >>> how it works right now (for np.dot, anyway; apparently it does work
> >>> that way for the brand-new gufunc np.linalg.matrix_multiply, but this
> >>> must be an accident).
> >>>
> >>> > That said, it is hard to come up with convincing examples of how
> >>> > this behavior would be useful in any practical context. But changing
> >>> > something that has been working like that for so long seems like a
> >>> > risky thing. And I cannot come up with a convincing example of why
> >>> > it would be harmful either.
> >>>
> >>> gufuncs are very new.
> >>>
> >>
> >> Or at least newly used. They've been sitting around for years with
> >> little use and less testing. This is probably (easily?) fixable, as
> >> the shape of the operands is available.
> >>
> >> In [22]: [d.shape for d in nditer([[1,1,1], [[1,1,1]]*3]).operands]
> >> Out[22]: [(3,), (3, 3)]
> >>
> >> In [23]: [d.shape for d in nditer([[[1,1,1]], [[1,1,1]]*3]).operands]
> >> Out[23]: [(1, 3), (3, 3)]
> >>
> >
> > If we agree that it is broken, which I am still not fully sure of, then
> > yes, it is very easy to fix. I have been looking into that code quite a
> > bit lately, so I could patch something up pretty quickly.
> >
> > Are we OK with the appending of size-1 dimensions to complete the core
> > dimensions? That is, should matrix_multiply([1,1,1], [[1],[1],[1]])
> > work, or should it complain about the first argument having fewer
> > dimensions than the core dimensions in the signature?
>
> I think that by default, gufuncs should definitely *not* allow this.
>
> Example case 1: qr can be applied equally well to a (1, n) array or an
> (n, 1) array, but with different results. If the user passes in an
> (n,) array, then how do we know which one they wanted?
>
> Example case 2: matrix multiplication, as you know :-), is a case
> where I do think we should allow for a bit more cleverness with the
> core dimensions... but the appropriate cleverness is much more subtle
> than just "prepend size 1 dimensions until things fit". Instead, for
> the first argument you need to prepend, for the second argument you
> need to append, and then you need to remove the corresponding
> dimensions from the output. Specific cases:
>
> # Your version gives:
> matmul([1, 1, 1], [[1], [1], [1]]).shape == (1, 1)
> # But this should be (1,) (try it with np.dot)
>
> # Your version gives:
> matmul([[1, 1, 1]], [1, 1, 1]) -> error, (1, 3) and (1, 3) are not
> conformable
> # But this should work (the second argument should be treated as (3, 1),
> # not (1, 3))
>
> So the default should be to be strict about core dimensions, unless
> explicitly requested otherwise by the person defining the gufunc.
>

#5057 now implements this behavior, which I agree is a sensible thing to
do. And it doesn't seem very likely that anyone (numpy tests aside!) is
expecting the old behavior to hold.
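As a quick check of the strict behavior: np.matmul (which postdates this thread, but is implemented as a gufunc with matrix-multiplication core dimensions in modern NumPy) refuses to broadcast a size-1 core dimension against a larger one:

```python
import numpy as np

# Strict core dimensions: a size-1 core dimension is NOT broadcast
# against a larger one, so mismatched core shapes raise.
try:
    np.matmul(np.ones(3), np.ones(1))   # core dims 3 vs 1
    raised = False
except ValueError:
    raised = True
print("mismatched core dims raise:", raised)

# Matching core dimensions work as usual.
print(np.matmul(np.ones(3), np.ones(3)))   # 3.0
```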


>
> > Lastly, there is an interesting side effect of the way this
> > broadcasting is handled: if a gufunc specifies a core dimension in an
> > output argument only, and an `out` kwarg is not passed in, then the
> > output array will have that core dimension set to size 1. E.g., if the
> > signature of `f` is '(),()->(a)', then f(1, 2).shape is (1,). This has
> > always felt funny to me, and I think that an unspecified dimension in
> > an output array should either be specified by a passed `out` array, or
> > raise an error about an unspecified core dimension. Does this sound
> > right?
>
> Does this have any use cases? My vote is that we simply disallow this
> until we have concrete uses and can decide how to do it properly. That
> way there won't be any backcompat concerns to deal with later.
>

The obvious example is "compute all pairwise distances." You can define a
gufunc with signature (n,d)->(m) that does that, and wrap it in a Python
function that makes sure it always gets called with an out array whose last
dimension is m = n * (n - 1). If you do not specify the out array, it will
be created with m = 1, or, with the implementation in #5057, m = -1, which
I suppose will fail badly with a cryptic, apparently unrelated error. I
don't think either of these has any practical use, and I think there are
only three sensible options here:

1. Raise an error if a core dimension wasn't specified by the passed arrays
(in the face of ambiguity...)
2. Do not allow dimensions in the output arrays that are not also in one of
the input arrays.
3. Provide a method for the gufunc to decide on its own what m should be
based on the other core dimensions.

I like the idea of 3, but it needs a lot of discussion on whether it really
is a good idea, and on what the best mechanism to implement it would be.
I think 2 would be a mistake for being too restrictive, unless something
like 3 were in place. And I see 1 as the quickest short-term fix for
something that is broken, or would be if gufuncs were more extensively
used out there.
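To make the pairwise-distance example concrete, here is a Python-level sketch using np.vectorize's gufunc-style signature support (which was added to NumPy after this thread; the function names here are made up for illustration). It sidesteps the output-dimension problem because vectorize infers m from the function's actual output, which is roughly what option 3 would let a real C gufunc do:

```python
import numpy as np

def _pairwise_flat(x):
    """x: (n, d) points -> flat distances for all ordered pairs i != j,
    length m = n * (n - 1)."""
    diff = x[:, None, :] - x[None, :, :]        # (n, n, d) differences
    dist = np.sqrt((diff ** 2).sum(axis=-1))    # (n, n) distance matrix
    mask = ~np.eye(x.shape[0], dtype=bool)      # drop the zero diagonal
    return dist[mask]

# gufunc-style wrapper: core signature (n,d)->(m), loops over any
# leading (stack) dimensions automatically.
pairwise = np.vectorize(_pairwise_flat, signature='(n,d)->(m)')

pts = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 4.0]])
print(pairwise(pts).shape)                    # (6,) == 3 * (3 - 1)
print(pairwise(np.stack([pts, pts])).shape)   # (2, 6): broadcasts the stack
```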

Jaime