I'm very glad to hear from you, Josh 😊, but I'm 100% convinced that removing the automatic rescaling is the right path forward. Stéfan, "floats between [0, 1]" is easy enough to explain, except when it isn't (signed filters), or when we automatically rescale int32s in [0, 255] to floats in [0, 2**(-23)], or uint16s in [0, 4095] to floats in [0, 2**(-4)], etc. I can't count the number of times I've had to point users to "Image data types and what they mean". Floats in [0, 1] is certainly not simpler to explain than "we use floats internally for computation, period." Yes, there is a chance that we'll now get users confused about uint8 overflow/underflow, but at least then we can teach them about fundamental computer science principles, rather than about how skimage does things "just so".
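To make those conversions concrete, here is a minimal sketch of the behaviour I mean, assuming the usual img_as_float rules (signed integer ranges map to [-1, 1], unsigned to [0, 1]; exact values depend on the skimage version):

import numpy as np
from skimage.util import img_as_float

# 12-bit data stored as uint16: 4095 out of a 65535 range,
# so the rescaled floats top out around 4095 / 65535 ~= 2**(-4).
twelve_bit = np.array([0, 2048, 4095], dtype=np.uint16)
print(img_as_float(twelve_bit))          # ~[0.0, 0.03125, 0.0625]

# 8-bit-range data stored as int32: 255 out of a 2**31 range,
# so the rescaled floats top out around 255 / 2**31 ~= 2**(-23).
eight_bit_as_int32 = np.array([0, 128, 255], dtype=np.int32)
print(img_as_float(eight_bit_as_int32))  # ~[0.0, 6.0e-08, 1.2e-07]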
As Matthew pointed out, the user is best placed to know how to manage their data scales. When we do it automagically, we often mess up. And Stéfan, to steal from your approach, we can look to our values to guide our decision-making: "we don't do magic." Let's remove the last few places where we do.
Matthew, apologies for sounding callous to users — that is absolutely not my intent! Hence this email thread. The question when aiming for a new API is how to move the community forward without fracturing it. My suggestion of "upgrade pressure" was aimed at doing this, with the implicit assumption that *limited* short term pain would result in higher long-term gain — for all our users.
I'm certainly starting to be persuaded that skimage2 is indeed the best path forward, mainly so that we don't invalidate old Q&As and tutorials. We can perhaps do a combination, though:
- skimage 0.19 is the last "real" release with the old API
- skimage2 2.0 is the next real release
- when skimage2 2.0 is released, we also release skimage 0.20, which is 0.19 plus a warning (sketched below) that scikit-image is deprecated and no longer maintained, pointing to the migration guide; anyone who wants to keep using the deprecated API can pin to 0.19 explicitly.
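For concreteness, that warning could be as simple as something emitted at import time; the sketch below is purely hypothetical (the wording, warning class, and placement are placeholders, not a worked-out proposal):

import warnings

# Hypothetical contents added to skimage/__init__.py in a 0.20 release:
warnings.warn(
    "scikit-image (the `skimage` package) is deprecated and no longer "
    "maintained; development has moved to `skimage2`. See the migration "
    "guide, or pin scikit-image==0.19 to keep using the old API.",
    FutureWarning,
    stacklevel=2,
)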
That probably satisfies my "migration pressure" requirement.
Juan.
On Fri, 23 Jul 2021, at 8:29 PM, Stefan van der Walt wrote:
Hi Tom,
On Fri, Jul 23, 2021, at 17:57, Thomas Caswell wrote:
Where the issues tend to show up is if you have enough dynamic range that the small end is less than the difference between adjacent representable numbers at the high end, e.g.
In [5]: 1e16 == (1e16 + 1)
Out[5]: True
This issue would crop up if you had, e.g., uint64 images utilizing the full range. We don't support uint64 images, and uint32 is OK still on this front if you use `float64` for calculations.
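To put numbers on that (my own illustration, not part of the original exchange): float64 has a 53-bit significand, so every uint32 value is exactly representable, while the top of the uint64 range is not:

# float64 carries 53 bits of significand, so integers up to 2**53 are exact.
print(float(2**32 - 1) == 2**32 - 1)     # True:  uint32 max is exactly representable
print(float(2**64 - 1) == 2**64 - 1)     # False: uint64 max rounds to the nearest float64
print(float(2**53) + 1 == float(2**53))  # True:  the gap between float64 values at 2**53 is already 2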
In some cases the scaling / unscaling does not work out the way you wish it would. While it is possible that the issues we are having are related to what we are doing with the results, forcing to [0, 1] restricts you to ~15 orders of magnitude on the whole image, which seems less than ideal. While it may not be common, the fact that Matplotlib got those bug reports says we do have users with such extreme dynamic range in the community!
15 orders of magnitude is enormous! Note that all our floating point operations internally currently happen with float64 anyway—and this is pretty much the best you can do here.
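(To quantify that: float64 carries roughly 15-16 significant decimal digits everywhere, which is exactly the spacing effect in Tom's In [5] example above. A quick NumPy check:)

import numpy as np

print(np.finfo(np.float64).eps)  # ~2.22e-16, i.e. ~15-16 significant decimal digits
print(np.spacing(1e16))          # 2.0: why 1e16 == 1e16 + 1 above
print(np.spacing(1.0))           # ~2.22e-16: the same relative precision near 1.0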
The other issue you mention is due to interpolation that sometimes goes outside the desired range; but this is an expected artifact of interpolation (which we typically have the `clip` flag for).
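As a small sketch of that artifact (my own example, assuming skimage.transform.resize, whose clip flag defaults to True): cubic spline interpolation of a sharp edge rings slightly past the input range, and clip pulls it back:

import numpy as np
from skimage.transform import resize

# A hard step edge; order=3 (cubic spline) interpolation rings around it.
step = np.zeros((20, 20))
step[:, 10:] = 1.0

up = resize(step, (80, 80), order=3, clip=False, anti_aliasing=False)
print(up.min(), up.max())                  # slightly below 0 and slightly above 1

up_clipped = resize(step, (80, 80), order=3, clip=True, anti_aliasing=False)
print(up_clipped.min(), up_clipped.max())  # clipped back into [0, 1]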
To be clear, I'm trying to get at the underlying issues here and identify them, not to dismiss your concerns!
Best regards,
Stéfan