[Numpy-discussion] np.{bool,float,int} deprecation

Thu Dec 10 13:19:43 EST 2020

On Wed, 2020-12-09 at 13:37 -0800, Stephan Hoyer wrote:
> On Wed, Dec 9, 2020 at 1:07 PM Aaron Meurer <asmeurer at gmail.com> wrote:
> 
> > On Wed, Dec 9, 2020 at 9:41 AM Sebastian Berg <sebastian at sipsolutions.net> wrote:
> > > 
> > > On Mon, 2020-12-07 at 14:18 -0700, Aaron Meurer wrote:
> > > > Regarding np.bool specifically, if you want to deprecate this, you
> > > > might want to discuss this with us at the array API standard
> > > > https://github.com/data-apis/array-api (which is currently in RFC
> > > > stage). The spec uses bool as the name for the boolean dtype.
> > > > 
> > > > Would it make sense for NumPy to change np.bool to just be the boolean
> > > > dtype object? Unlike int and float, there is no ambiguity with bool,
> > > > and NumPy clearly doesn't have any issues with shadowing builtin names
> > > > in its namespace.
> > > 
> > > We could keep the Python alias around (which for `dtype=` is the same
> > > as `np.bool_`).
> > > 
> > > I am not sure I like the idea of immediately shadowing the builtin.
> > > That is a switch we can avoid flipping (without warning); `np.bool_`
> > > and `bool` are fairly different beasts? [1]
> > 
> > NumPy already shadows a lot of builtins, in many cases in ways that
> > are incompatible with existing ones. It's not something I would have
> > done personally, but it's been this way for a long time.
> > 
> 
> It may be defensible to keep np.bool as an alias for Python's bool even
> when we remove the other aliases.

That is true. `int` is probably the most confusing, since it is not at
all compatible with a Python integer, but is rather the "default" integer
(which currently happens to be the same as C `long`).
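
A minimal sketch of that difference, assuming only that NumPy is
installed. The variable names are illustrative, and the width it reports
depends on the platform and NumPy version:

```python
import numpy as np

# Passing the builtin `int` as a dtype selects NumPy's default integer,
# the same dtype that `np.int_` names.
default_int = np.dtype(int)
assert default_int == np.dtype(np.int_)

# Unlike a Python int, the default integer has a fixed width
# (platform dependent, typically 32 or 64 bits).
width_bits = default_int.itemsize * 8
print("default integer:", default_int.name, "width in bits:", width_bits)

# A Python int grows as needed ...
big = 2 ** 100

# ... while a value that large does not fit in the fixed-width dtype.
try:
    np.array([big], dtype=int)
    overflowed = False
except OverflowError:
    overflowed = True

print("2**100 overflowed the default integer:", overflowed)
```

So `dtype=int` asks for a fixed-width machine integer, not for an array
of Python ints.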

So we could focus on `np.int` and `np.long`.  I am a bit unsure whether
you would prefer that or are mainly pointing out the possibility?

Right now, my main take-away from the discussion is that it would be
good to clarify the release notes a bit more.

Using `float` for a dtype seems fine to me, but I prefer mentioning
`np.float64` over `np.float_`.
For integers, I wonder if we should also suggest `np.int64`, even though
(or precisely because) the default integer on many systems is currently
`np.int_`?
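
As a rough illustration of what such release notes could point out (a
sketch only, assuming NumPy is installed; `default_is_64bit` is an
illustrative name): the builtin `float` and `bool` map to one dtype
each, while the integer case depends on the platform.

```python
import numpy as np

# `float` as a dtype always means a 64-bit float, so `np.float64`
# is an exact, explicit spelling of the same thing.
assert np.dtype(float) == np.dtype(np.float64)
assert np.zeros(3, dtype=float).dtype == np.float64

# `bool` as a dtype is NumPy's single boolean dtype.
assert np.dtype(bool) == np.dtype(np.bool_)

# The default integer is platform dependent, so `np.int64` matches
# `dtype=int` only where the default integer is 64 bits wide.
default_is_64bit = np.dtype(int) == np.dtype(np.int64)
print("default integer:", np.dtype(int).name)
print("same as np.int64:", default_is_64bit)
```

Where the last line prints `False`, replacing `np.int` with `np.int64`
would change the dtype, whereas replacing it with `int` or `np.int_`
would not.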

Cheers,

Sebastian

> 
> np.int_ and np.float_ have fixed precision, which makes them somewhat
> different from the builtin types. NumPy has a whole bunch of different
> precisions for integers and floats, so this distinction matters.
> 
> In contrast, there is only one boolean dtype in NumPy, which matches
> Python's bool. So we wouldn't have to worry, for example, about whether a
> user has requested a specific precision explicitly. This comes up in issues
> like type-promotion where libraries like JAX and PyTorch have special case
> logic for most Python types vs NumPy dtypes (but booleans are the same for
> both):
> https://jax.readthedocs.io/en/latest/type_promotion.html
> 
> 
> 
> > 
> > Aaron Meurer
> > 
> > > OTOH, if someone wants to entertain switching... It could be
> > > interesting to see how (unfixed) downstream projects react to it.
> > > 
> > > One approach would be:
> > > 
> > > * Go ahead for now (deprecate)
> > > * Add a FutureWarning at some point that we _will_ start to export
> > >   `np.bool` again (but `from numpy import *` is a problem?)
> > > * Aim to make `np.bool is np.bool_` at some point in the (far) future.
> > > 
> > > It is multi-step (and I recall opinions that multi-step is bad).
> > > Although, I think the main argument against it was to not force users
> > > to modify code more than once.  And I do not think that happens here.
> > > 
> > > Of course we could use the `FutureWarning` right away, but I don't mind
> > > taking it slow.
> > > 
> > > Cheers,
> > > 
> > > Sebastian
> > > 
> > > 
> > > 
> > > [1] I admit, probably almost nobody would notice. And usually using a
> > > Python `bool` is better...
> > > 
> > > 
> > > > 
> > > > Aaron Meurer
> > > > 
> > > > On Sat, Dec 5, 2020 at 4:31 PM Juan Nunez-Iglesias <jni at fastmail.com> wrote:
> > > > > Hi all,
> > > > > 
> > > > > At the prodding [1] of Sebastian, I’m starting a discussion on the
> > > > > decision to deprecate np.{bool,float,int}. This deprecation broke
> > > > > our prerelease testing in scikit-image (which, hooray for rcs!),
> > > > > and resulted in a large amount of code churn to fix [2].
> > > > > 
> > > > > To be honest, I do think *some* sort of deprecation is needed,
> > > > > because for the longest time I thought that np.float was what
> > > > > np.float_ actually is. I think it would be worthwhile to move to
> > > > > *that*, though it’s an even more invasive deprecation than the
> > > > > currently proposed one. Writing `x = np.zeros(5, dtype=int)` is
> > > > > somewhat magical, because someone with a strict typing mindset
> > > > > (there’s an increasing number!) might expect that this is an array
> > > > > of pointers to Python ints. This is why I’ve always preferred to
> > > > > write `dtype=np.int`, resulting in the current code churn.
> > > > > 
> > > > > I don’t know what the best answer is, just sparking the discussion
> > > > > Sebastian wants to see. ;) For skimage we’ve already merged a fix
> > > > > (even if it is one of dubious quality, as Stéfan points out [3] ;),
> > > > > so I don’t have too much stake in the outcome.
> > > > > 
> > > > > Juan.
> > > > > 
> > > > > [1]: https://github.com/scikit-image/scikit-image/pull/5103#issuecomment-739334463
> > > > > [2]: https://github.com/scikit-image/scikit-image/pull/5103
> > > > > [3]: https://github.com/scikit-image/scikit-image/pull/5103#issuecomment-739368765
> > > > > _______________________________________________
> > > > > NumPy-Discussion mailing list
> > > > > NumPy-Discussion at python.org
> > > > > https://mail.python.org/mailman/listinfo/numpy-discussion
