[Numpy-discussion] On responding to dubious ideas (was: Re: Advanced indexing: "fancy" vs. orthogonal)

josef.pktd at gmail.com josef.pktd at gmail.com
Wed Apr 8 14:24:26 EDT 2015

On Wed, Apr 8, 2015 at 1:38 PM, Robert Kern <robert.kern at gmail.com> wrote:
> On Wed, Apr 8, 2015 at 2:06 AM, Nathaniel Smith <njs at pobox.com> wrote:
>> On Apr 5, 2015 7:04 AM, "Robert Kern" <robert.kern at gmail.com> wrote:
>> >
>> > On Sat, Apr 4, 2015 at 10:38 PM, Nathaniel Smith <njs at pobox.com> wrote:
>> > >
>> > > On Apr 4, 2015 4:12 AM, "Todd" <toddrjen at gmail.com> wrote:
>> > > >
>> > > > There was no break as large as this. In fact I would say this is
>> > > > even a larger change than any individual change we saw in the python 2 to 3
>> > > > switch.  The basic mechanics of indexing are just too fundamental and touch
>> > > > on too many things to make this sort of change feasible.
>> > >
>> > > I'm afraid I'm not clever enough to know how large or feasible a
>> > > change is without even seeing the proposed change.
>> >
>> > It doesn't take any cleverness. The change in question was to make
>> > orthogonal indexing the default indexing semantics. No matter the details of
>> > the ultimate proposal to achieve that end, it has known minimum
>> > consequences, at least in the broad outline. Current documentation and books
>> > become obsolete for a fundamental operation. Current code must be modified
>> > by some step to continue working. These are consequences inherent in the
>> > end, not just the means to the end; we don't need a concrete proposal in
>> > front of us to know what they are. There are ways to mitigate these
>> > consequences, but there are no silver bullets that eliminate them. And we
>> > can compare those consequences to approaches like Jaime's that achieve a
>> > majority of the benefits of such a change without any of the negative
>> > consequences. That comparison does not bode well for any proposal.
>> Ok, let me try to make my point another way.
>> I don't actually care at this stage in the discussion whether the change
>> is ultimately viable. And I don't think you should either. (For values of
>> "you" that includes everyone in the discussion, not picking on Robert in
>> particular :-).)
>> My point is that rational, effective discussion requires giving ideas room
>> to breathe. Sometimes ideas turn out to be not as bad as they looked.
>> Sometimes it turns out that they are, but there's some clever tweak that
>> gives you 95% of the benefits for 5% of the cost. Sometimes you generate a
>> better understanding of the tradeoffs that subsequently informs later design
>> decisions. Sometimes working through the details makes both sides realize
>> that there's a third option that solves both their problems. Sometimes you
>> merely get a very specific understanding of why the whole approach is
>> unreasonable that you can then, say, take to the pandas and netcdf
>> developers as evidence that you made a good faith effort and ask them to
>> meet you half way. And all these things require understanding the specifics
>> of what *exactly* works or doesn't work about an idea. IMHO, making any
>> assertion about whether Jaime's approach gives the "majority of benefits
>> of such a change" is extremely misleading at this stage: not because it's
>> wrong, but because it totally short-circuits the discussion about what
>> benefits we care about.
>> Jaime's patch certainly has merits, but does it help us make numpy and
>> pandas/netcdf more compatible? Does it make it easier for Eric to teach?
>> Those are some specific merits that we might care about a lot, and for which
>> Jaime's patch may or may not help much. But that kind of nuance gets lost
>> when we jump straight to debating thumbs-up versus thumbs-down.
> And we can get all of that discussion from discussing Jaime's proposal. I
> would argue that we will get better, more focused discussion from it since
> it is actually a concrete proposal and not just a wish that numpy's indexing
> semantics were something else. I think that a full airing and elaboration of
> Jaime's proposal (as the final PR should look quite different than the
> initial one to incorporate what is found in the discussion) will give us
> a satisficing solution. I certainly think that that is *more likely* to
> arrive at a satisficing solution than an attempt to change the default
> indexing semantics. I can name specific improvements that would specifically
> address the concerns you named if you would like. Maybe it won't be *quite*
> as good (for some parties) as if Numeric had chosen orthogonal indexing from
> the get-go, but it will likely be much better for everyone than if numpy
> broke backward compatibility on this feature now.
>> I cross-my-heart promise that under the current regime, no PR breaking
>> fancy indexing would ever get anywhere *near* numpy master without
>> *extensive* discussion and warnings on the list. The core devs just spent
>> weeks quibbling about whether a PR that adds a new field to the end of the
>> dtype struct would break ABI backcompat (we're now pretty sure it doesn't),
>> and the current standard we enforce is that every PR that touches public API
>> needs a list discussion, even minor extensions with no compatibility issues
>> at all. No one is going to sneak anything by anyone.
> That is not the issue. Ralf asked you not to invite such PRs in the first
> place. No one thinks that such a PR would get "snuck" in. That's not
> anyone's concern.
>> Plus, I dunno, our current approach to discussions just seems to make
>> things hostile and shouty and unpleasant. If a grad student or junior
>> colleague comes to you with an idea where you see some potentially critical
>> flaw, do you yell THAT WILL NEVER WORK and kick them out of your office? Or,
>> do you maybe ask a few leading questions and see where they go?
>> I think things will work better if the next time something like this comes
>> up, *one* person just says "hmm, interesting idea, but the backcompat issues
>> seem pretty severe; do you have any ideas about how to mitigate that?", and
>> then we let that point be taken as having been made and see where the
>> discussion goes. Maybe we can all give it a try?
> You do remember that I said we should be "politely considering [...]
> proposals people send us uninvited", right? The "politely" was a key part of
> that. Prospectively inviting backwards-incompatible proposals for a full
> airing goes beyond this.

If a suggestion like changing the default indexing behavior and
dropping fancy indexing has an ex ante chance of succeeding of less
than 0.1%, then we should say so.
Adding improved additional features is then a useful alternative,
and a better way to spend our or your time.

A while ago we had a request on the mailing list to make numpy
broadcasting behavior optional; the discussion "died" pretty fast.

Fancy indexing and similar is a great feature (even if not many said
so in the thread) and as far as I can tell it is heavily entrenched in
the existing usage of numpy.

You can always discuss proposals, as long as it is clear that these
are low probability events.
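
For readers who have not followed the whole thread, here is a minimal
sketch of the difference between the current "fancy" indexing and the
orthogonal selection under discussion. The array and the index lists are
made up for illustration; np.ix_ is only one existing way to get
orthogonal behavior and is not necessarily what any proposal in the
thread would use.

```python
import numpy as np

# Hypothetical 3x4 example array, values 0..11.
a = np.arange(12).reshape(3, 4)
rows = [0, 2]
cols = [1, 3]

# Current default ("fancy") indexing: the index lists are broadcast
# against each other, so this picks the elements at (0, 1) and (2, 3).
fancy = a[rows, cols]

# Orthogonal (outer) selection with an existing tool: np.ix_ builds an
# open mesh, so this picks rows 0 and 2 crossed with columns 1 and 3.
orthogonal = a[np.ix_(rows, cols)]

print(fancy.shape)       # (2,)
print(orthogonal.shape)  # (2, 2)
```

Existing code that relies on the first form would change meaning if the
second became the default, which is the backward compatibility concern
raised above.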


> --
> Robert Kern
