[Numpy-discussion] broadcasting behavior for 1.6 (was: Numpy 1.6 schedule)

Wes McKinney wesmckinn at gmail.com
Fri Mar 11 10:06:27 EST 2011


On Fri, Mar 11, 2011 at 9:57 AM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
> On Fri, Mar 11, 2011 at 7:42 AM, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
>>
>>
>> On Fri, Mar 11, 2011 at 2:01 AM, Ralf Gommers
>> <ralf.gommers at googlemail.com> wrote:
>>>
>>> I'm just going through the very long 1.6 schedule thread to see what
>>> is still on the TODO list before a 1.6.x branch can be made. So I'll
>>> send a few separate mails, one for each topic.
>>>
>>> On Mon, Mar 7, 2011 at 8:30 PM, Francesc Alted <faltet at pytables.org>
>>> wrote:
>>> > On Sunday 06 March 2011 06:47:34 Mark Wiebe wrote:
>>> >> I think it's ok to revert this behavior for backwards compatibility,
>>> >> but believe it's an inconsistent and unintuitive choice. In
>>> >> broadcasting, there are two operations, growing a dimension 1 -> n,
>>> >> and prepending a new 1 dimension on the left. The behaviour under
>>> >> discussion in assignment is different from normal broadcasting in
>>> >> that only the second one is permitted. It is broadcasting the output
>>> >> to the input, rather than broadcasting the input to the output.
>>> >>
>>> >> Suppose a has shape (20,), b has shape (1,20), and c has shape
>>> >> (20,1). Then a+b has shape (1,20), a+c has shape (20,20), and b+c
>>> >> has shape (20,20).
>>> >>
>>> >> If we do "b[...] = a", a will be broadcast to match b by adding a 1
>>> >> dimension to the left. This is reasonable and consistent with
>>> >> addition.
>>> >>
>>> >> If we do "a[...]=b", under 1.5 rules, a will once again be broadcast
>>> >> to match b by adding a 1 dimension to the left.
>>> >>
>>> >> If we do "a[...]=c", we could broadcast both a and c together to the
>>> >> shape (20,20). This results in multiple assignments to each element
>>> >> of a, which is inconsistent. This is not analogous to a+c, but
>>> >> rather to np.add(c, c, out=a).
>>> >>
>>> >> The distinction is subtle, but the inconsistent behavior is harmless
>>> >> enough for assignment that keeping backwards compatibility seems
>>> >> reasonable.
>>> >
>>> > For what it's worth, I also like the behaviour that Mark proposes, and
>>> > have updated the tables test suite to adapt to this.  But I'm fine if it is
>>> > decided to revert to the previous behaviour.
>>>
>>> The conclusion on this topic, as I read the discussion, is that we
>>> need to keep backwards compatible behavior (even though the proposed
>>> change is more intuitive). Has backwards compatibility been fixed
>>> already?
>>>
>>
>> I don't think an official conclusion was reached, at least insofar as
>> numpy has an official anything ;) But this change does show up as an error
>> in one of the pandas tests, so it is likely to affect other folks as well.
>> Probably the route of least compatibility hassle is to revert to the old
>> behavior and maybe switch to the new behavior, which I prefer, for 2.0.
>>
>
> That said, apart from pandas and pytables, and the latter has been fixed,
> the new behavior doesn't seem to have much fallout. I think it actually
> exposes unnoticed assumptions in code that slipped by because there was no
> consequence.
>
> Chuck
>

I've fixed the pandas issue. I'll put out a bugfix release whenever
NumPy 1.6 final is out. I don't suspect it will cause very many
problems (and those problems will--hopefully--be easy to fix).
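
For anyone following along, here is a minimal sketch of the shapes Mark
describes in the quoted text above. The array names a, b and c follow his
example; the helper try_assign is hypothetical and only there for
illustration. Whether the two assignments into a are accepted depends on
the NumPy version, which is the point under discussion, so the script just
reports what it sees rather than assuming an outcome.

```python
import numpy as np

a = np.zeros(20)         # shape (20,)
b = np.zeros((1, 20))    # shape (1, 20)
c = np.zeros((20, 1))    # shape (20, 1)

# Ordinary broadcasting in arithmetic: both operands grow to a common shape.
sum_shapes = {
    "a+b": (a + b).shape,
    "a+c": (a + c).shape,
    "b+c": (b + c).shape,
}
print(sum_shapes)

# Assignment where the input is broadcast to the output: a (20,) source
# gets a new length-1 dimension on the left to match b's (1, 20).
b[...] = np.arange(20)


def try_assign(dst, src):
    """Hypothetical helper: report whether dst[...] = src is accepted."""
    try:
        dst[...] = src
    except ValueError:
        return False
    return True


# The cases under discussion; the result is version dependent, so it is
# printed instead of being assumed.
print("a[...] = b accepted:", try_assign(a, b))
print("a[...] = c accepted:", try_assign(a, c))
```

The first three shapes and the b[...] = a style assignment are the
uncontroversial part; only the last two lines exercise the behavior that
changed.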


