Re: [Numpy-discussion] backwards compatibility and deprecation policy NEP

22 Jul 2018

      On Sat, Jul 21, 2018 at 7:15 PM, Nathaniel Smith <njs@pobox.com> wrote:
...
On Sat, Jul 21, 2018 at 4:48 PM, Ralf Gommers <ralf.gommers@gmail.com>
wrote:
...
Hi all,
Here is a first draft of a NEP on backwards compatibility and deprecation
policy. This I think mostly formalized what we've done for the last
couple
of years, however I'm sure opinions and wish lists will differ here.
Oh *awesome*, thanks for putting this together.
I think this is a great start, but I'd structure it a bit differently.
So let me just make a few high-level comments first and see what you
think.
Regarding the "general principles" and then "policy": to me these feel
like more a brainstorming list, that hasn't been fully distilled down
into principles yet. I would try to structure it to start with the
overarching principles (changes need to benefit users more than they
harm them, numpy is widely used so breaking changes should by default
be assumed to be fairly harmful, decisions should be based on data and
actual effects on users rather than e.g. appealing to the docs or
abstract aesthetic principles, silently getting wrong answer is much
worse than a loud error), then talk about some of the ways this plays
out (if people are currently silently getting the wrong answer --
which is the definition of a bug, but also shows up in the
index-by-float case -- then that's really bad; some of our tools for
collecting data about how bad a breakage is include testing prominent
downstreams ourselves, adding warnings or making .0 releases and
seeing how people react, etc.), and then examples.
Thanks, I'll try and rework the general principles, you have some excellent
points in here.
...
Speaking of examples: I hate to say this because in general I think
using examples is a great idea. But... I think you should delete most
of these examples. The problem is scope creep: the goal for this NEP
(IMO) should be to lay out the principles we use to think about these
issues in general, but right now it comes across as trying to lay down
a final resolution on lots of specific issues (including several where
there are ongoing conversations). It ends up like trying to squish
multiple NEPs into one, which makes it hard to discuss, and also
distracts from the core purpose.
I'm not sure this is the best thing to do. I can remove a couple, but
aiming to be "totally uncontroversial" is almost impossible given the topic
of the NEP. The diag view example is important I think, it's the second
most discussed backwards compatibility issue next to histogram. I'm happy
to remove the statement on what should happen with it going forward though.

Then, I think it's not unreasonable to draw a couple of hard lines. For
example, removing complete submodules like linalg or random has ended up on
some draft brainstorm roadmap list because someone (no idea who) put it
there after a single meeting. Clearly the cost-benefit of that is such that
there's no point even discussing that more, so I'd rather draw that line
here than every time someone open an issue. Very recent example:
https://github.com/numpy/numpy/issues/11457 (remove auto-import of
numpy.testing).
...
My suggestion: keep just two examples, histogram and
indexing-with-floats. These are safely done and dusted, totally
uncontroversial (AFAIK), and the first is a good illustration of how
one can try to be careful and do the right thing but still get it all
wrong, while the second is a good example of (a) how we gathered data
and decided that an actually pretty disruptive change was nonetheless
worth it, and (b) how we had to manage it through multiple false
starts.
Regarding the actual policy: One alteration to current practice jumped
out at me. This policy categorically rules out all changes that could
cause currently working code to silently start doing something wrong,
regardless of the specific circumstances. That's not how we actually
do things right now. Instead, our policy in recent years has been that
such changes are permitted in theory, but (a) the starting presumption
is that this is super harmful to users so we need a *very* good reason
to do it, and (b) if we do go ahead with it, then during the
deprecation period we use a highly-visible FutureWarning (instead of
the invisible-by-default DeprecationWarning).
Personally I think the current policy strikes a better balance. You
can see some examples of where we've used this by running 'git log -S
FUTUREWARNING -S FutureWarning' -- it's things like a bad default for
the rcond argument in lstsq, an obscure and error-prone corner case in
indexing (0addc016ba), strange semantics for NaT
(https://mail.scipy.org/pipermail/numpy-discussion/
2015-October/073968.html),
... we could quibble about individual cases, but I think that taking
these on a case-by-case basis is better than ruling them out
categorically. And in any case, that is what we do now, so if you want
to change this, it's something we should discuss and probably write
down some rationale and such :-).
You're right here. Thanks for the examples. I'll update this according to
your suggestion, and propose to use one of the examples (rcond probably) to
illustrate.
...
Regarding the major version number thing: ugh do we really want to
talk about this more. I'd probably leave it out of the NEP entirely.
If it stays in, I think it needs a clearer description of what counts
as a "major" change.
I think it has value to keep it, and that it's not really possible to come
up with a very clear description of "major". In particular, I'd like every
deprecation message to say "this deprecated feature will be removed by
release X.Y.0". At the moment we don't do that, so if users see a message
they don't know if a removal will happen next year, in the far future
(2.0), or never. The major version thing is quite useful to signal our
intent. Doesn't mean we need to exhaustively discuss when to do a 2.0
though, I agree that that's not a very useful discussion right now.

Happy to remove this though if people don't like it. Other opinions?

Cheers,
Ralf
...
There are some examples of things that do "sound"
major, but... the rest of our policy is all about measuring disruption
based on effects on users, and by that metric, the index-by-float
removal was pretty major. My guess is that by the time we finally
remove np.matrix, the actual disruption will be less than it was for
removing index-by-float. (As it should be, since keeping
index-by-float around was actively causing bugs in even
well-maintained downstreams, in a way that np.matrix doesn't.)