[scikit-image] Re: Request for comments: plans for breaking changes in scikit-image 1.0

19 Jul 2021

Hi,

On Mon, Jul 19, 2021 at 1:31 PM Matthew Brett <matthew.brett@gmail.com> wrote:
...
Hi,
On Mon, Jul 19, 2021 at 5:34 AM Juan Nunez-Iglesias <jni@fastmail.com> wrote:
...
Dear skimagers,
We are aiming to release scikit-image 1.0 near the end of the year. We are, however, planning to make a number of breaking changes in the API that will affect downstream libraries. We have published a proposal for how we plan to do this at https://bit.ly/skip-3. The gist of it is:
- we'll release 0.19 in the coming weeks.
- we'll release 0.20 immediately after, which will be exactly the same but with a warning to pin scikit-image to `<0.20` (for those that want to stay in 0.x land indefinitely) or `!=0.20.*` (for those that want to be "on the ball" when 1.0 is released and update their code as soon as possible).
- we'll publish a transition guide along with 1.0rc0, and maintain 0.19.x with bug fixes for another year to give users time to transition.
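
For concreteness, here is a rough sketch of which releases each of the two suggested pins would admit. It is an illustration only: the helper names are made up for this example, and it handles plain release strings such as `0.19.2`, ignoring pre-releases and the other details that real version specifiers deal with.

```python
def release_tuple(version):
    """Turn a plain release string such as '0.19.2' into (0, 19, 2)."""
    return tuple(int(part) for part in version.split("."))


def allowed_by_lt_0_20(version):
    # Simplified reading of the pin `<0.20`: stay on 0.19.x and earlier.
    return release_tuple(version) < (0, 20)


def allowed_by_ne_0_20_star(version):
    # Simplified reading of the pin `!=0.20.*`: skip only the 0.20 series,
    # so 1.0 is picked up as soon as it is released.
    return release_tuple(version)[:2] != (0, 20)


if __name__ == "__main__":
    for v in ["0.18.3", "0.19.0", "0.20.0", "1.0.0"]:
        print(v, allowed_by_lt_0_20(v), allowed_by_ne_0_20_star(v))
```

The first pin keeps an environment in 0.x land indefinitely; the second skips only the warning release and so lets 1.0 in.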
Please do give the appropriate weight to my remarks, given my tiny
contributions to scikit-image, but I was rather scared by reading this
suggestion.
I can see that you do need to change the API - and that the two
realistic options are:
* Make a breaking 1.0 release
* Make a new package e.g. skimage2 or similar.
I'm afraid I wasn't completely sure whether the 1.0 option would
result in breaking what I call the Konrad Hinsen rule for scientific
software:
"""
Under (virtually) no circumstances should new versions of a scientific
package silently give substantially different results for the same
function / method call from a previous version of the package.
"""
The idea there is that lots of scientific software takes the form of
packages or scripts that are not well maintained, but that do often get
picked up and re-used, if only to replicate results.  They will very
rarely specify exact package versions.  It is a very serious problem
if a later version of a package actually does something substantially
different with the same function or method call than it did when the
script was written.
Fixing clear bugs and changes in algorithm implementation are fine -
the results need not be absolutely identical, only compatible.   It is
also fine to raise an error, for example for expired deprecations.  In
this case the person using the script has a warning, and can
investigate.  The disaster is if they don't know that the result has
changed.
So - do the changes all raise errors for the previous, now expired
API?  Or do they break the Hinsen rule?
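
To make the distinction concrete, here is a toy sketch of the two outcomes. None of these functions exist in scikit-image; `count_above_old`, `count_above_silent` and `count_above_loud` are hypothetical stand-ins for an old API, a new API that silently changes a default, and a new API that forces the caller to choose.

```python
def count_above_old(values, cutoff, inclusive=True):
    """Hypothetical old API: values equal to the cutoff are counted."""
    return sum((v >= cutoff) if inclusive else (v > cutoff) for v in values)


def count_above_silent(values, cutoff, inclusive=False):
    """Hypothetical new API that silently flips the default."""
    return sum((v >= cutoff) if inclusive else (v > cutoff) for v in values)


def count_above_loud(values, cutoff, *, inclusive):
    """Hypothetical new API that makes the old call an error."""
    return sum((v >= cutoff) if inclusive else (v > cutoff) for v in values)


if __name__ == "__main__":
    data = [1, 2, 2, 3]
    # The same call, as it might appear in an unmaintained script.
    print(count_above_old(data, 2))      # 3
    print(count_above_silent(data, 2))   # 1, with no warning at all
    try:
        count_above_loud(data, 2)
    except TypeError as err:
        print("error:", err)             # the user knows to investigate
```

The second variant is the case the rule is meant to forbid; the third is the acceptable one, because the old call fails loudly instead of returning a different number.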
The other thing that occurred to me is the same thing that I am sure
occurred to y'all - that is the experience of the Python 2 / Python 3
transition.

I think the basic lesson was that you can make big changes like that,
but you have to support your users in maintaining code that works
with both old and new versions, for a fairly long time - otherwise you
risk leaving a lot of developers on the old version, waiting until the
new version has been around for long enough that it has completely
replaced the old.

How practical will that be - supporting both scikit-image versions
<=0.18 and >=1.0, within the same library?
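
As a rough sketch of what that might look like for a downstream library, assuming it branches on the installed version string (in real code this would presumably come from `skimage.__version__`): `parse_release` and `pick_implementation` are hypothetical helpers written for this example, and the two implementations are placeholders rather than real scikit-image calls.

```python
def parse_release(version):
    """Leading numeric components of a version string.

    '0.18.3' -> (0, 18, 3); '1.0rc0' -> (1, 0).  Pre-release and
    development suffixes are simply dropped in this simplified sketch.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
        if digits != piece:
            # A suffix such as 'rc0' ends the numeric release part.
            break
    return tuple(parts)


def pick_implementation(installed_version, old_impl, new_impl):
    """Return the code path matching the installed API generation."""
    if parse_release(installed_version) >= (1, 0):
        return new_impl
    return old_impl


if __name__ == "__main__":
    def old_way():
        return "written against the 0.x API"

    def new_way():
        return "written against the 1.0 API"

    for installed in ["0.18.3", "0.19.0", "1.0rc0", "1.0.0"]:
        print(installed, "->", pick_implementation(installed, old_way, new_way)())
```

Every call whose behaviour differs between the two APIs would need a branch of this kind, which is what makes the question of practicality a real one.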

Cheers,

Matthew