[Numpy-discussion] np.{bool,float,int} deprecation

Thu Dec 10 14:38:56 EST 2020

On Thu, Dec 10, 2020 at 7:25 PM Sebastian Berg <sebastian at sipsolutions.net>
wrote:

> On Wed, 2020-12-09 at 13:37 -0800, Stephan Hoyer wrote:
> > On Wed, Dec 9, 2020 at 1:07 PM Aaron Meurer <asmeurer at gmail.com>
> > wrote:
> >
> > > On Wed, Dec 9, 2020 at 9:41 AM Sebastian Berg
> > > <sebastian at sipsolutions.net> wrote:
> > > >
> > > > On Mon, 2020-12-07 at 14:18 -0700, Aaron Meurer wrote:
> > > > > Regarding np.bool specifically, if you want to deprecate this,
> > > > > you
> > > > > might want to discuss this with us at the array API standard
> > > > > https://github.com/data-apis/array-api (which is currently in
> > > > > RFC
> > > > > stage). The spec uses bool as the name for the boolean dtype.
> > > > >
> > > > > Would it make sense for NumPy to change np.bool to just be the
> > > > > boolean
> > > > > dtype object? Unlike int and float, there is no ambiguity with
> > > > > bool,
> > > > > and NumPy clearly doesn't have any issues with shadowing
> > > > > builtin
> > > > > names
> > > > > in its namespace.
> > > >
> > > > We could keep the Python alias around (which for `dtype=` is the
> > > > same
> > > > as `np.bool_`).
> > > >
> > > > I am not sure I like the idea of immediately shadowing the
> > > > builtin.
> > > > That is a switch we can avoid flipping (without warning);
> > > > `np.bool_`
> > > > and `bool` are fairly different beasts? [1]
> > >
> > > NumPy already shadows a lot of builtins, in many cases, in ways
> > > that
> > > are incompatible with existing ones. It's not something I would
> > > have
> > > done personally, but it's been this way for a long time.
> > >
> >
> > It may be defensible to keep np.bool as an alias for Python's bool
> > even when we remove the other aliases.
>

I'd agree with that.

> That is true, `int` is probably the most confusing, since it is not at
> all compatible with a Python integer, but rather the "default" integer
> (which happens to be the same as C `long` currently).
>
> So we could focus on `np.int`, `np.long`.  I am a bit unsure whether
> you would prefer that or are mainly pointing out the possibility?
>

Not sure what you mean by "focus": focus on describing this in the release
notes? Deprecating `np.int` seems like the most beneficial part of this
whole exercise.

> Right now, my main take-away from the discussion is that it would be
> good to clarify the release notes a bit more.
>
> Using `float` for a dtype seems fine to me, but I prefer mentioning
> `np.float64` over `np.float_`.
> For integers, I wonder if we should also suggest `np.int64`, even – or
> because – if the default integer on many systems is currently
> `np.int_`?
>

I agree. I think we should recommend sane, descriptive names that do the
right thing. So ideally we'd have people spell their dtype specifiers as
  dtype=bool  # or np.bool
  dtype=np.float64
  dtype=np.int64
  dtype=np.complex128
The names with underscores at the end make little sense from a UX
perspective. The C equivalents (single/double/etc.) made sense 15 years
ago, but for today's user base, the majority of whom do not know C
fluently or at all, they don't make much sense either.
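
A minimal sketch of how these spellings resolve, assuming a reasonably
recent NumPy is installed; the variable names are only for illustration.
It checks that the builtin `bool`, `float` and `complex` map to the same
dtypes as the explicit, sized names:

```python
import numpy as np

# Builtin Python types passed as dtype= resolve to fixed NumPy dtypes.
bool_dtype = np.dtype(bool)
float_dtype = np.dtype(float)
complex_dtype = np.dtype(complex)

# Each comparison is expected to print True.
print(bool_dtype == np.dtype(np.bool_))
print(float_dtype == np.dtype(np.float64))
print(complex_dtype == np.dtype(np.complex128))

# The C-style name is just another spelling of the sized dtype.
print(np.dtype(np.double) == np.dtype(np.float64))
```

The sized names say what you get without the reader needing to know C
or the meaning of the trailing underscore.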

The `dtype=int` or `dtype=np.int_` behaviour of flipping between 32 and 64
bits depending on the platform is likely to be a pitfall much more often
than it is what the user actually needs, so it shouldn't be recommended
and probably deserves a warning in the docs.
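
To make the pitfall concrete, here is a small sketch (illustrative names,
assuming NumPy is installed). The width of the default integer is
whatever the platform and NumPy version give you, while `np.int64` is
always 64 bits:

```python
import numpy as np

# The default integer: 32 or 64 bits depending on platform/NumPy version.
default_int = np.dtype(int)
print("default integer width:", default_int.itemsize * 8)

# dtype=int and dtype=np.int_ are the same (platform-dependent) dtype.
print(default_int == np.dtype(np.int_))

# An explicit dtype is the same everywhere.
explicit = np.zeros(3, dtype=np.int64)
print("explicit width:", explicit.dtype.itemsize * 8)

# With 64 bits, a value just past the int32 range is representable;
# with a 32-bit default integer the same computation would overflow.
a = np.array([2**31 - 1], dtype=np.int64)
b = a + 1
print(int(b[0]))
```

Code that relies on the default integer can behave differently on
another machine; spelling out `np.int64` avoids that.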

Cheers,
Ralf

>
> >
> > np.int_ and np.float_ have fixed precision, which makes them somewhat
> > different from the builtin types. NumPy has a whole bunch of
> > different
> > precisions for integer and floats, so this distinction matters.
> >
> > In contrast, there is only one boolean dtype in NumPy, which matches
> > Python's bool. So we wouldn't have to worry, for example, about
> > whether a
> > user has requested a specific precision explicitly. This comes up in
> > issues
> > like type-promotion where libraries like JAX and PyTorch have special
> > case
> > logic for most Python types vs NumPy dtypes (but booleans are the
> > same for
> > both):
> > https://jax.readthedocs.io/en/latest/type_promotion.html
>
>