[Numpy-discussion] np.{bool,float,int} deprecation
Andras Deak
deak.andris at gmail.com
Sat Dec 5 22:10:28 EST 2020
On Sun, Dec 6, 2020 at 12:31 AM Juan Nunez-Iglesias <jni at fastmail.com> wrote:
>
> Hi all,
>
> At the prodding [1] of Sebastian, I’m starting a discussion on the decision to deprecate np.{bool,float,int}. This deprecation broke our prerelease testing in scikit-image (which, hooray for rcs!), and resulted in a large amount of code churn to fix [2].
>
> To be honest, I do think *some* sort of deprecation is needed, because for the longest time I thought that np.float was what np.float_ actually is. I think it would be worthwhile to move to *that*, though it’s an even more invasive deprecation than the currently proposed one. Writing `x = np.zeros(5, dtype=int)` is somewhat magical, because someone with a strict typing mindset (there’s an increasing number!) might expect that this is an array of pointers to Python ints. This is why I’ve always preferred to write `dtype=np.int`, resulting in the current code churn.
>
> I don’t know what the best answer is, just sparking the discussion Sebastian wants to see. ;) For skimage we’ve already merged a fix (even if it is one of dubious quality, as Stéfan points out [3] ;), so I don’t have too much stake in the outcome.
Hi Juan,
Let me start with a disclaimer that I'm an end user, and as such it's
very easy for me to be bold when it comes to deprecations :)
But I experienced the same thing that you describe in
https://github.com/scikit-image/scikit-image/pull/5103#issuecomment-739429373:
> [I]t was very surprising to me when I found out that np.float is float. For the longest time I thought that np.float was equivalent to "whatever the default float value is on my platform", and considered it best practice to use that instead of plain float. 😅 I think that is a common misconception.
And I'm pretty sure the vast majority of end users run into this. The
sized types such as np.float32 are intuitive enough that people
don't go out of their way to read the documentation in detail, and
it's highly unexpected that some `np.*` names are mere aliases for the
Python builtins.
Now, this is probably not a problem as long as people only
pass these aliases as `dtype` keyword arguments, because that works
as expected (albeit based on the wrong premise). But once you
extrapolate from the `dtype=np.int` behaviour to "`np.int` must be my
native numpy int type" you can easily get subtle bugs. For instance,
you might expect `isinstance(item, np.int)` to give you True if `item`
is an element of an array with `dtype=np.int`.
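A minimal sketch of that pitfall, assuming a recent NumPy in which the
`np.int` alias no longer exists, so the builtin `int` it used to point
to is written out directly. The names `arr` and `item` are only
illustrative:

```python
import numpy as np

# np.int was just another name for the builtin int, so dtype=np.int
# meant exactly the same thing as dtype=int.
arr = np.zeros(3, dtype=int)
item = arr[0]

# The array holds NumPy integer scalars (the exact width depends on
# the platform), not Python int objects.
print(arr.dtype, type(item))

# The check one might naively write against the alias fails ...
print(isinstance(item, int))         # False
# ... while the abstract NumPy scalar type matches.
print(isinstance(item, np.integer))  # True
```

Checking against `np.integer` (or comparing `arr.dtype` with
`np.dtype(int)`) is one way to express the intended test without
relying on the alias.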
To be fair I'm not sure that I've ever been bitten by this
personally... but once you're aware of the pitfall it seems really
ominous. I guess one helpful question is this: among all the code
churn needed to fix the breakage did you find any bugs that were
revealed by the deprecation? If that's the case (in scikit-image or
any other large downstream library) then that would be a good argument
for going forward with the deprecation.
Cheers,
András
> Juan.
>
> [1]: https://github.com/scikit-image/scikit-image/pull/5103#issuecomment-739334463
> [2]: https://github.com/scikit-image/scikit-image/pull/5103
> [3]: https://github.com/scikit-image/scikit-image/pull/5103#issuecomment-739368765