On Apr 4, 2015 10:54 AM, "Nathaniel Smith" <njs@pobox.com> wrote:
>
> On Sat, Apr 4, 2015 at 12:17 AM, Ralf Gommers <ralf.gommers@gmail.com> wrote:
> >
> >
> > On Sat, Apr 4, 2015 at 1:54 AM, Nathaniel Smith <njs@pobox.com> wrote:
> >>
> >>
> >> But, the real problem here is that we have two different array duck
> >> types that force everyone to write their code twice. This is a
> >> terrible state of affairs! (And exactly analogous to the problems
> >> caused by np.ndarray disagreeing with np.matrix & scipy.sparse about
> >> the proper definition of *, which PEP 465 may eventually
> >> alleviate.) IMO we should be solving this indexing problem directly,
> >> not applying bandaids to its symptoms, and the way to do that is to
> >> come up with some common duck type that everyone can agree on.
> >>
> >> Unfortunately, AFAICT this means our only options here are to have
> >> some kind of backcompat break in numpy, some kind of backcompat break
> >> in pandas, or to do nothing and continue indefinitely with the status
> >> quo where the same indexing operation might silently return different
> >> results depending on the types passed in. All of these options have
> >> real costs for users, and it isn't at all clear to me what the
> >> relative costs will be when we dig into the details of our various
> >> options.
> >
> >
> > I doubt that there is a reasonable way to quantify those costs, especially
> > those of breaking backwards compatibility. If someone has a good method, I'd
> > be interested though.
>
> I'm a little nervous about how easily this argument might turn into
> "either A or B is better but we can't be 100% *certain* which it is so
> instead of doing our best using the data available we should just
> choose B." Being a maintainer means accepting uncertainty and doing
> our best anyway.

I think the burden of proof needs to be on the side proposing a change, and the more invasive the change, the higher that burden needs to be.

When faced with a situation like this, where the proposed change would fundamentally alter the most basic, high-level operation of numpy, and where there is an alternative approach with no backwards-compatibility issues, I think that burden of proof is necessarily going to be nearly impossible to meet.
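
To be concrete about the kind of divergence being discussed -- this is
only an illustrative sketch, using numpy's own np.ix_ to stand in for
the "other" indexing convention, whatever duck type one has in mind:

    import numpy as np

    # ndarray and np.matrix already disagree about what * means
    # (the analogy drawn above, which PEP 465's @ is meant to ease):
    a = np.arange(4).reshape(2, 2)
    m = np.matrix(a)
    a * a  # elementwise:    [[0, 1], [4, 9]]
    m * m  # matrix product: [[2, 3], [6, 11]]

    # Indexing has the same shape of problem: "fancy" indexing pairs
    # the index arrays elementwise, while an orthogonal reading (what
    # np.ix_ spells out explicitly) takes the cross product instead.
    b = np.arange(9).reshape(3, 3)
    b[[0, 2], [0, 2]]          # fancy:      array([0, 8])
    b[np.ix_([0, 2], [0, 2])]  # orthogonal: array([[0, 2], [6, 8]])

Code written against one convention can run without error under the
other and silently return something different, which is exactly why I
don't think we can touch this.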

> But that said I'm still totally on board with erring on the side of
> caution (in particular, you can never go back and *un*break
> backcompat). An obvious challenge to anyone trying to take this
> forward (in any direction!) would definitely be to gather the most
> useful data possible. And it's not obviously impossible -- maybe one
> could do something useful by scanning ASTs of lots of packages (I have
> a copy of pypi if anyone wants it, that I downloaded with the idea of
> making some similar arguments for why core python should slightly
> break backcompat to allow overloading of a < b < c syntax), or adding
> instrumentation to numpy, or running small-scale usability tests, or
> surveying people, or ...
>
> (I was pretty surprised by some of the data gathered during the PEP
> 465 process, e.g. on how common dot() calls are relative to existing
> built-in operators, and on its associativity in practice.)

Surveys like this have the problems of small sample size and selection bias. Usability studies can't measure the effect of the compatibility break itself, not to mention the effect on numpy's reputation. Indexing is also considerably harder to scan existing projects for than .dot, because its meaning depends on the type being passed (which may not even be defined in the same project). And I am not sure I like the idea of numpy "phoning home" by default, while an opt-in has the same issues as a survey.
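
Roughly what scanning for .dot (vs. indexing) looks like in practice --
a throwaway sketch; "some_module.py" is just a placeholder path:

    import ast

    class Counter(ast.NodeVisitor):
        """Count .dot() calls and subscript expressions in one file."""
        def __init__(self):
            self.dot_calls = 0
            self.subscripts = 0

        def visit_Call(self, node):
            # a.dot(b) is identifiable from the syntax tree alone ...
            if isinstance(node.func, ast.Attribute) and node.func.attr == "dot":
                self.dot_calls += 1
            self.generic_visit(node)

        def visit_Subscript(self, node):
            # ... but x[idx] says nothing about whether x is an ndarray
            # or some other duck type, so we can't tell which indexing
            # semantics the author was relying on.
            self.subscripts += 1
            self.generic_visit(node)

    counter = Counter()
    with open("some_module.py") as f:  # placeholder path
        counter.visit(ast.parse(f.read()))
    print(counter.dot_calls, counter.subscripts)

The .dot count falls out of the syntax alone; the subscript count is
nearly meaningless without knowing the types involved.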

So to make a long story short, in this sort of situation I have a hard time imagining ways to get enough reliable, representative data to justify a backwards-compatibility break of this magnitude.

> Core python broke backcompat on a regular basis throughout the python
> 2 series, and almost certainly will again -- the bar to doing so is
> *very* high, and they use elaborate mechanisms to ease the way
> (__future__, etc.), but they do it. A few months ago there was even
> some serious consideration given to changing py3 bytestring indexing
> to return bytestrings instead of integers. (Consensus was
> unsurprisingly that this was a bad idea, but there were core devs
> seriously exploring it, and no-one complained about the optics.)

None of those breaks was as large as this one; in fact, I would say this is a larger change than any individual change we saw in the python 2 to 3 switch. The basic mechanics of indexing are simply too fundamental, and touch too many things, for this sort of change to be feasible. It would be better to start a new language, or in this case a new project.

> It's true that numpy has something of a bad reputation in this area,
> and I think it's because until ~1.7 or so, we randomly broke stuff by
> accident on a pretty regular basis, even in "bug fix" releases. I
> think the way to rebuild that trust is to honestly say to our users
> that when we do break backcompat, we will never do it by accident, and
> we will do it only rarely, after careful consideration, with the
> smoothest transition possible, only in situations where we are
> convinced that it is the net best possible solution for our users, and
> only after public discussion and getting buy-in from stakeholders
> (e.g. major projects affected). And then follow through on that to the
> best of our ability. We've certainly gotten a lot better at this over
> the last few years.
>
> If we say we'll *never* break backcompat then we'll inevitably end up
> convincing some people that we're liars, just because one person's
> bugfix is another's backcompat break. (And they're right, it is a
> backcompat break; it's just one where the benefits of the fix
> obviously outweigh the cost of the break.) Or we could actually avoid
> breaking backcompat by descending into Knuth-style stasis... but even
> there notice that none of us are actually using Knuth's TeX, we all
> use forks like XeTeX that have further changes added, which goes to
> show how futile this would be.

I think it is fair to say that some things are so fundamental to what makes numpy numpy that they are off-limits, and that people will always be able to count on them working.