[Numpy-discussion] On responding to dubious ideas (was: Re: Advanced indexing: "fancy" vs. orthogonal)

Tue Apr 7 21:06:04 EDT 2015

On Apr 5, 2015 7:04 AM, "Robert Kern" <robert.kern at gmail.com> wrote:
>
> On Sat, Apr 4, 2015 at 10:38 PM, Nathaniel Smith <njs at pobox.com> wrote:
> >
> > On Apr 4, 2015 4:12 AM, "Todd" <toddrjen at gmail.com> wrote:
> > >
> > > There was no break as large as this. In fact I would say this is even
> > > a larger change than any individual change we saw in the python 2 to 3
> > > switch.  The basic mechanics of indexing are just too fundamental and touch
> > > on too many things to make this sort of change feasible.
> >
> > I'm afraid I'm not clever enough to know how large or feasible a change
> > is without even seeing the proposed change.
>
> It doesn't take any cleverness. The change in question was to make the
> default indexing semantics to orthogonal indexing. No matter the details of
> the ultimate proposal to achieve that end, it has known minimum
> consequences, at least in the broad outline. Current documentation and
> books become obsolete for a fundamental operation. Current code must be
> modified by some step to continue working. These are consequences inherent
> in the end, not just the means to the end; we don't need a concrete
> proposal in front of us to know what they are. There are ways to mitigate
> these consequences, but there are no silver bullets that eliminate them.
> And we can compare those consequences to approaches like Jaime's that
> achieve a majority of the benefits of such a change without any of the
> negative consequences. That comparison does not bode well for any proposal.
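
(For anyone skimming the thread: a minimal sketch of the two indexing
semantics being contrasted above. The array and index values are made up
for illustration, and np.ix_ is just one existing way to spell orthogonal
selection today, not any particular proposal from this thread.)

```python
import numpy as np

# Illustrative data only: a 3x4 array holding 0..11.
a = np.arange(12).reshape(3, 4)
rows = [0, 2]
cols = [1, 3]

# Current default ("fancy") indexing: the index lists are broadcast
# against each other, so this picks the elements at (0, 1) and (2, 3).
fancy = a[rows, cols]

# Orthogonal ("outer") indexing: each index list selects along its own
# axis, giving the 2x2 block of rows {0, 2} crossed with columns {1, 3}.
orthogonal = a[np.ix_(rows, cols)]

print(fancy.shape)       # (2,)
print(orthogonal.shape)  # (2, 2)
```

The same `a[rows, cols]` expression would return a differently shaped
result under the two semantics, which is why existing code and
documentation are affected by any change of default.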

Ok, let me try to make my point another way.

I don't actually care at this stage in the discussion whether the change is
ultimately viable. And I don't think you should either. (For values of
"you" that include everyone in the discussion, not picking on Robert in
particular :-).)

My point is that rational, effective discussion requires giving ideas room
to breathe. Sometimes ideas turn out to be not as bad as they looked.
Sometimes it turns out that they are, but there's some clever tweak that
gives you 95% of the benefits for 5% of the cost. Sometimes you generate a
better understanding of the tradeoffs that subsequently informs later
design decisions. Sometimes working through the details makes both sides
realize that there's a third option that solves both their problems.
Sometimes you merely get a very specific understanding of why the whole
approach is unreasonable that you can then, say, take to the pandas and
netcdf developers as evidence that you made a good faith effort and ask
them to meet you halfway. And all these things require understanding the
specifics of what *exactly* works or doesn't work about an idea. IMHO,
it's extremely misleading at this stage to make any assertion about whether
Jaime's approach gives the "majority of benefits of such a change": not
because it's wrong, but because it totally short-circuits the discussion
about what benefits we care about. Jaime's patch certainly has merits, but
does it help us make numpy and pandas/netcdf more compatible? Does it make
it easier for Eric to teach?
Those are some specific merits that we might care about a lot, and for
which Jaime's patch may or may not help much. But that kind of nuance gets
lost when we jump straight to debating thumbs-up versus thumbs-down.

I cross-my-heart promise that under the current regime, no PR breaking
fancy indexing would ever get anywhere *near* numpy master without
*extensive* discussion and warnings on the list. The core devs just spent
weeks quibbling about whether a PR that adds a new field to the end of the
dtype struct would break ABI backcompat (we're now pretty sure it doesn't),
and the current standard we enforce is that every PR that touches public
API needs a list discussion, even minor extensions with no compatibility
issues at all. No one is going to sneak anything by anyone.

Plus, I dunno, our current approach to discussions just seems to make
things hostile and shouty and unpleasant. If a grad student or junior
colleague comes to you with an idea where you see some potentially critical
flaw, do you yell THAT WILL NEVER WORK and kick them out of your office?
Or, do you maybe ask a few leading questions and see where they go?

I think things will work better if the next time something like this comes
up, *one* person just says "hmm, interesting idea, but the backcompat
issues seem pretty severe; do you have any ideas about how to mitigate
that?", and then we let that point be taken as having been made and see
where the discussion goes. Maybe we can all give it a try?

-n