[scikit-image] Re: Request for comments: plans for breaking changes in scikit-image 1.0

27 Jul 2021

      ...
And sorry for the automatic French grammar corrections :(
On Wednesday, 28 July 2021, Riadh Fezzani <rfezzani@gmail.com> wrote:
...
Hello and sorry for the short answer from my phone,
As you may already know, I prefer the skimage.v0 option to skimage2.
Concerning the preserve_range problem, what about making it keyword-only
in v1? No silent error in this case...
Riadh
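A minimal sketch of the keyword-only idea above, using a hypothetical `to_float` helper (not a scikit-image function) and an assumed uint8 maximum of 255. Because `preserve_range` has no default, forgetting it fails loudly instead of silently picking a behavior:

```python
UINT8_MAX = 255  # assumption: inputs are 8-bit unsigned pixel values


def to_float(pixels, *, preserve_range):
    """Hypothetical converter; `preserve_range` is keyword-only with no default."""
    if preserve_range:
        # Keep the original values, only change the type.
        return [float(p) for p in pixels]
    # Otherwise map the uint8 range onto [0, 1].
    return [p / UINT8_MAX for p in pixels]


def call_fails(*args, **kwargs):
    """Return True if calling to_float with these arguments raises TypeError."""
    try:
        to_float(*args, **kwargs)
    except TypeError:
        return True
    return False


print(to_float([0, 255], preserve_range=True))
print(to_float([0, 255], preserve_range=False))
print(call_fails([0, 255]))        # argument omitted
print(call_fails([0, 255], True))  # argument passed positionally
```

Both failing calls raise `TypeError` at the call site, so old code that relied on an implicit default cannot keep running with changed semantics.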
On Monday, 26 July 2021, Gregory Lee <grlee77@gmail.com> wrote:
...
On Mon, Jul 26, 2021 at 1:57 AM Juan Nunez-Iglesias <jni@fastmail.com>
wrote:
...
...
Hey, on cue! Anyone care to answer this?
https://stackoverflow.com/questions/68487902/why-does-the-variance-of-laplac...
...
;)
Thanks Michael for chiming in. User (even long-ago user) feedback is
*the most valuable* in this situation, as we maintainers can become quite
detached from “majority” “real-world” use cases. =)
Stéfan, my point is that the rescaling is only one of several issues,
all of which require similar acrobatics to achieve. We have limited
developer time to do full deprecation cycles, so making all of them
together *and in one go* (rather than a 2-4 version deprecation dance) is a
...
...
...
...
Regarding consensus on *how* to make a clean break, I think that you
are correct that my maybe-too-clever-and-untested force-everyone-to-pin
approach is dead in the water. But I also think that you understate the
amount of consensus on the skimage2 approach: most people seem pretty much
on board with it including early detractors like Alex, *and* it has
successful models in the community (eg. bs4 and cv2).
I also think that “raise an error on anything other than floats in
0-1” is an approach that will annoy many and benefit few. In other words,
in my opinion: not rescaling but accepting all dtypes has usability
benefits, in addition to hopefully reducing maintainer load, *but* raising
errors on all inputs other than floats in [0, 1] will also presumably
reduce maintainer load in the long term, but at the cost of (probably
significant) user annoyance.
I am more in favor of a skimage2 (or similar) approach than the pinning
approach in the SKIP, particularly as the discussion here has progressed.
Regarding automatically scaling to [0, 1], I am definitely not in favor
of going back to that for floating point data! We changed `img_as_float` to
...
...
...
...
Josh, we do have such a documentation page:
https://scikit-image.org/docs/dev/user_guide/data_types.html
Unfortunately, it is not trivial to discover. Even with a big fat link
on the front page, I suspect most users won’t find it before asking for
help, because navigating documentation is hard. FAQs/documentation links
are very good and necessary when complexity is *unavoidable*, but when it
is avoidable, they are a bandaid.
...
We have fortunately been able to get quite a few users to visit that
...
...
...
...
Again, I’d prefer to point people to documentation explaining
fundamentals rather than “this is just the skimage way.”
...
...
My proposal going forward is to reject SKIP-3 and create a SKIP-4
...
...
...
...
Juan.
PS: Tom, I know you expressed preference for skimage.v0/skimage.v1,
but the main advantage you stated there (depending on both, migrating
gradually) is also present with skimage2.
On 26 Jul 2021, at 8:30 am, K.-Michael Aye <kmichael.aye@gmail.com>
wrote:
Hi all,
as a scientific image user I have been reading along this difficult
...
...
...
...
Let me first pay my respect that these difficult and, by nature,
opinionated (which is good!) discussions are being performed in such a
civil manner!
As someone who is a member of a technical committee for Python
software myself, I know how hard this can be.
Now to the issue at hand, I was wondering if this could be tackled as
it's done in Space/Tech engineering, with a requirements document that all
should agree on, from which maybe the one and only obvious solution will
emerge?
I wanted to mention my personal requirements for working with an image
...
...
...
...
Please forgive me if all of this already happens in skimage, but it's
been a while since I was using it:
First, and for me the most important:
Pixel values are sacred and shall never be changed without letting the
user know.
I'm almost sure that this is the case with skimage now, but in the
early days I remember I was highly surprised, annoyed even, when some
routine simply insisted that the input data needs to be so and so and the
result will be this format, no matter what came in. It simply resulted in
being less useful for me (no complaint, I know I could have done some PRs
;) ).
I will admit that us instrumentalists are completely ignorant of
certain standards in proper image formats, we simply use them as co-located
data containers.
This means they can be ANY format:
* Integers (both signed and unsigned)
- with counts as high as the digitized signal required for determining
...
...
...
...
* Floats, often representing physical values after the integer format
version was calibrated, but with absolutely no sensible/reasonable way to
force them into some kind of range. The pixel values represent physical
values; they don't care that a float image shouldn't be larger than 1.0.
The fact is, ALL of these pixel values are measurements with a
meaning and they absolutely need to be preserved.
This statement needs to be qualified though with "within reason", as
obviously some "wanted" operation like a median filter to remove noise will
change pixel values, but is indeed range preserving and the meaning of the
data isn't lost.
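A small sketch of why a median filter counts as range preserving: with an odd window, every output value is one of the input values, so nothing leaves the original range. The helper name, the 1-D signal, and the sample values are illustrative assumptions, not code from any library:

```python
from statistics import median


def median_filter_1d(signal, size=3):
    """Sliding-window median; edges are handled by repeating the end samples."""
    half = size // 2
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    return [median(padded[i:i + size]) for i in range(len(signal))]


# Calibrated values in physical units, with one noise spike.
radiance = [412.5, 413.0, 9000.0, 411.8, 412.1]
smoothed = median_filter_1d(radiance)

print(smoothed)
print(min(radiance) <= min(smoothed), max(smoothed) <= max(radiance))
```

The spike is removed, and the remaining values are still the measured ones, in their original units.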
I understand that certain algorithms require the incoming image to be
in a certain format and range, and if no "standard" wrapper can be
identified that could transform and back-transform into the same range,
...
...
...
...
I for myself am lucky that I do not have a lot of code that I would
need to change, so I wouldn't really mind any import name changes, so I
...
...
...
...
I just wanted to emphasize how important the pixel values can be for
us, as they literally represent the bearer of the truth from outer space,
so to speak, and any change of their values shall be done only under full
consideration of the consequences.
My 2 opinionated cents.
As always, thanks so much for everybody's effort for this project, we
soon will have a technical-committee-reviewed package of many of my
...
...
...
...
Best regards,
Michael
On Sun, Jul 25, 2021 at 11:40 AM Josh Warner <
silvertrumpet999@gmail.com> wrote:
...
I'll be brief as my internet is currently down, replying from mobile.
Of these examples and similar, I would characterize them in a couple
categories
...
1. Data range user errors - the user used (almost always an overly
large) type for their actual data and they end up with an image which looks
all black/gray/etc.
2. Signed data of course needs to include the symmetric range [-1, 1]
as a generalization of unsigned workflow, which happens naturally since
float64 is signed.
3. Overshoots/undershoots due to expected computational effects, as
mentioned elsewhere in this thread; user may or may not want to retain
...
...
...
...
...
These do represent a low-level support burden, but since the story
is predictable, it presents the opportunity to guide users toward a FAQ or
similar before filing a new issue.  That would certainly be less
disruptive than the solutions proposed!
I would assert anyone working in this space NEEDS to understand their
data and its representation or they will have serious problems.  It is so
foundational that insulating them from the concept doesn't do them favors.
That said the workings and logic of dtype.py are somewhat opaque.
Could a featured, direct, high-yield document informing users about our
conversion behavior and a FAQ serve users just as well as the heroic
efforts suggested?
Josh
On Sat, Jul 24, 2021, 19:59 Juan Nunez-Iglesias <jni@fastmail.com>
wrote:
...
I'm very glad to hear from you, Josh 😊, but I'm 100% convinced that
removing the automatic rescaling is the right path forward. Stéfan, "floats
between [0, 1]" is easy enough to explain, except when it isn't (signed
filters), or when we automatically rescale int32s in [0, 255] to floats in
[0, 2**(-23)], or uint16s in [0, 4095] to floats in [0, 2**(-4)], etc. I
can't count the number of times I've had to point users to "Image data
types and what they mean". Floats in [0, 1] is certainly not simpler to
explain than "we use floats internally for computation, period." Yes, there
is a chance that we'll now get users confused about uint8
overflow/underflow, but at least then we can teach them about fundamental
computer science principles, rather than about how skimage does things
"just so".
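The uint16 figure above can be checked with plain arithmetic. This sketch assumes the conversion simply divides by the largest value the container dtype can hold; `rescale_unsigned` is a hypothetical helper, not scikit-image code:

```python
import math


def rescale_unsigned(value, bits):
    """Divide by the largest value the unsigned container can hold."""
    return value / (2**bits - 1)


# 12-bit sensor data (0..4095) stored in a 16-bit unsigned container.
brightest = rescale_unsigned(4095, bits=16)
print(brightest)             # far below 1.0, so the image displays as nearly black
print(math.log2(brightest))  # close to -4

# Rescaling by the range the data actually uses is a choice only the user can make.
print(4095 / 4095)
```

The container type says nothing about how much of its range the data uses, which is why dtype-based rescaling surprises people.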
...
As Matthew pointed out, the user is best placed to know how to
manage their data scales. When we do it automagically, we often mess up.
And Stéfan, to steal from your approach, we can look to our values to guide
our decision-making: "we don't do magic." Let's remove the last few places
where we do.
Matthew, apologies for sounding callous to users — that is
absolutely not my intent! Hence this email thread. The question when aiming
for a new API is how to move the community forward without fracturing it.
My suggestion of "upgrade pressure" was aimed at doing this, with the
implicit assumption that *limited* short term pain would result in higher
long-term gain — for all our users.
I'm certainly starting to be persuaded that skimage2 is indeed the
best path forward, mainly so that we don't invalidate old Q&As and
tutorials. We can perhaps do a combination, though:
* skimage 0.19 is the last "real" release with the old API;
* skimage2 2.0 is the next real release;
* when skimage 2.0 is released, we release skimage 0.20, which is 0.19
  with a warning that scikit-image is deprecated and no longer maintained,
  and point to the migration guide, and if you want to keep using the
  deprecated API, pin to 0.19 explicitly.
That probably satisfies my "migration pressure" requirement.
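As a rough illustration of the warning-only 0.20 release described above, the package could emit a notice on import. The message wording and the `warn_unmaintained` name are assumptions made for this sketch; only the standard-library `warnings` module is real here:

```python
import warnings

MESSAGE = (
    "This package is deprecated and no longer maintained. "
    "See the migration guide, or pin to 0.19 to keep using the old API."
)


def warn_unmaintained():
    """Emit the notice a final, warning-only release could show on import."""
    warnings.warn(MESSAGE, FutureWarning, stacklevel=2)


# Capture the warning so the behavior can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_unmaintained()

print(caught[0].category.__name__, caught[0].message)
```

`FutureWarning` is used here because Python shows it to end users by default, whereas `DeprecationWarning` is hidden outside `__main__` unless warnings are enabled.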
Juan.
On Fri, 23 Jul 2021, at 8:29 PM, Stefan van der Walt wrote:
Hi Tom,
On Fri, Jul 23, 2021, at 17:57, Thomas Caswell wrote:
See around
https://github.com/matplotlib/matplotlib/blob/88f53b12e1443a9ae046ee55d1f1d6...
 https://github.com/matplotlib/matplotlib/pull/17636,
https://github.com/matplotlib/matplotlib/pull/10613,
https://github.com/matplotlib/matplotlib/pull/10133
Where the issues tend to show up is if you have enough dynamic range
...
...
...
...
...
...
In [5]: 1e16 == (1e16 + 1)
Out[5]: True
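A short sketch of why that comparison is true, assuming Python floats are IEEE 754 float64 (the case on common platforms). `float64_spacing` is a helper written for this illustration:

```python
import math
import sys


def float64_spacing(x):
    """Gap between x and the next larger float64, for positive normal x."""
    # x == mantissa * 2**exponent with 0.5 <= mantissa < 1
    _, exponent = math.frexp(x)
    # float64 carries 53 significant bits
    return math.ldexp(1.0, exponent - 53)


print(float64_spacing(1.0))   # equals sys.float_info.epsilon
print(float64_spacing(1e16))  # 2.0, so adding 1 cannot change the value
print(sys.float_info.dig)     # 15 decimal digits are always preserved
```

Near 1e16 adjacent floats are 2 apart, so `1e16 + 1` rounds back to `1e16`.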
This issue would crop up if you had, e.g., uint64 images utilizing
...
...
...
...
...
...
In some cases the scaling / unscaling does not work out the way you
wish it would.  While it is possible that the issues we are having are
related to what we are doing with the results, forcing to [0, 1] restricts
you to ~15 orders of magnitude on the whole image which seems not ideal.
While it may not be common, that Matplotlib got those bug reports says we
do have users with such extreme dynamic range in the community!
...
15 orders of magnitude is enormous!  Note that all our floating
And I meant making preserve_range a required argument x)

On Wednesday, 28 July 2021, Riadh Fezzani <rfezzani@gmail.com> wrote:
preferable approach.
preserve range on float inputs quite a while ago at this point and it would
be pretty annoying to users to switch it back again. I can see some
argument for keeping the current scaling for the integer cases, but I still
have a feeling it is likely better to not force rescaling there either.
Having data unexpectedly rescaled was probably the most annoying aspect to
me as a user in medical imaging applications. An additional point in favor
of not rescaling is consistency with scipy.ndimage.
page (>20k visitors in the past 6 months according to our web metrics). It
is the most-visited of the user guide pages, but not as visited as the
installation page or some of the example and API pages.
proposing the skimage2 package.
thread.
library.
the dynamic range, sometimes negative because some weird amplifier randomly
would suck off electrons, who knows what the engineers are cooking with ...
;)
then the user should be pointed to workarounds, but not left alone simply
with the error message that the format doesn't match the algorithm.
think it might be much more of a discussion for the maintainers which way
minimizes a convolution of maintainer_effort with user_pain, but honestly,
knowing how hard it is to find extra time for a passion volunteer-effort
project, I'd almost always go for "least effort", because I think the
community will come around (as can be seen with cv2 and other examples).
planetary science tools for data retrieval and data reading coming out, so
I'm kinda feeling now how much damned work it is to design tools for the
"community"...
these, and are uncommon.
that the small end is less than difference between adjacent representable
numbers at the high end e.g.
the full range.  We don't support uint64 images, and uint32 is OK still on
this front if you use `float64` for calculations.
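The uint32 versus uint64 point can be checked directly, again assuming Python floats are IEEE 754 float64. `survives_float64` is a helper written for this illustration:

```python
def survives_float64(n):
    """True if the integer n converts to float64 and back without change."""
    return int(float(n)) == n


print(survives_float64(2**32 - 1))  # largest uint32 value
print(survives_float64(2**53))      # last integer before gaps appear
print(survives_float64(2**53 + 1))  # first integer that is lost
print(survives_float64(2**64 - 1))  # largest uint64 value
```

Every uint32 value fits in float64's 53 significant bits; the top of the uint64 range does not.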
point operations internally currently happen with float64 anyway—and this
is pretty much the best you can do here.
...
...
...
...
...
...
The other issue you mention is due to interpolation that sometimes
goes outside the desired range; but this is an expected artifact of
interpolation (which we typically have the `clip` flag for).
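A minimal sketch of the overshoot effect, using a Catmull-Rom cubic as a stand-in for whatever higher-order interpolation a given function applies (an assumption; this is not scikit-image's implementation). The `clip` helper only mimics the idea behind the flag mentioned above:

```python
def catmull_rom_midpoint(p0, p1, p2, p3):
    """Cubic (Catmull-Rom) estimate halfway between samples p1 and p2."""
    return (-p0 + 9 * p1 + 9 * p2 - p3) / 16


def clip(value, low=0.0, high=1.0):
    """Clamp value to [low, high]."""
    return max(low, min(high, value))


# Samples next to a sharp edge, all within [0, 1].
overshoot = catmull_rom_midpoint(0.0, 1.0, 1.0, 1.0)
undershoot = catmull_rom_midpoint(1.0, 0.0, 0.0, 0.0)

print(overshoot, undershoot)              # outside [0, 1]
print(clip(overshoot), clip(undershoot))  # pulled back to the input range
```

The inputs never leave [0, 1], yet the interpolated values do; clipping restores the range at the cost of flattening the overshoot.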
...
To be clear, I'm trying to get at the underlying issues here and
identify them; not to dismiss your concerns!
...
Best regards,
Stéfan
_______________________________________________
scikit-image mailing list -- scikit-image@python.org
To unsubscribe send an email to scikit-image-leave@python.org
https://mail.python.org/mailman3/lists/scikit-image.python.org/
Member address: jni@fastmail.com