[Numpy-discussion] np.{bool,float,int} deprecation
Sebastian Berg
sebastian at sipsolutions.net
Sat Dec 12 14:25:26 EST 2020
On Sat, 2020-12-12 at 12:34 +1100, Juan Nunez-Iglesias wrote:
>
> > I agree. I think we should recommend sane, descriptive names that
> > do the right thing. So ideally we'd have people spell their dtype
> > specifiers as
> > dtype=bool # or np.bool
> > dtype=np.float64
> > dtype=np.int64
> > dtype=np.complex128
> > The names with underscores at the end make little sense from a UX
> > perspective. And the C equivalents (single/double/etc) made sense
> > 15 years ago, but with the user base of today - the majority of
> > whom will not know C fluently or at all - also don't make too much
> > sense.
> >
> > The `dtype=int` or `dtype=np.int_` behaviour flopping between 32
> > and 64 bits is likely to be a pitfall much more often than it is
> > what the user actually needs, so shouldn't be recommended and
> > probably deserves a warning in the docs.
>
> I kinda disagree with this. I want to have a way to say, give me an
> array of the same type as the default NumPy type (for either ints or
> floats). This will prevent casting back and forth as different arrays
> are combined. In other words, as long as NumPy itself flips back and
> forth (depending on locale), I think users will in many cases want to
> flip back and forth with it?
But "default" in NumPy really doesn't mean a whole lot? I can think of
three places where "defaults" exist:
1. `np.array([1])` will default to a C-long (as will `np.uint8(1) + 1`)
2. Sum and product upcast to C-long (and pretty much only those):
np.sum(np.arange(10, dtype=np.int8))
np.prod(np.arange(10, dtype=np.int8))
3. NumPy uses `np.intp` for all indexing operations internally and
for many functions which return integers related to indexing
(e.g. `np.nonzero()`). [1]
The first two points have no logic at all besides this: Windows treats
`long` as always 32-bit, while other platforms treat `long` as 64-bit on
64-bit systems. The last point does have some logic.
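As a rough illustration, the three "defaults" above can be inspected on
any given machine with something like the sketch below. The variable
names are made up for this example, and the printed dtypes depend on the
platform and the NumPy version, so no expected output is shown.

```python
import numpy as np

# 1. Default integer dtype chosen when creating an array from Python ints.
default_int = np.array([1]).dtype

# 2. Sum and product upcast small integer inputs to the default integer.
#    (np.prod is the standard spelling of the product reduction.)
sum_dtype = np.sum(np.arange(10, dtype=np.int8)).dtype
prod_dtype = np.prod(np.arange(1, 5, dtype=np.int8)).dtype

# 3. Indexing-related results use np.intp.
nonzero_dtype = np.nonzero(np.array([0, 1, 0, 2]))[0].dtype

print("np.array([1]) dtype:   ", default_int)
print("sum of int8 dtype:     ", sum_dtype)
print("prod of int8 dtype:    ", prod_dtype)
print("np.nonzero index dtype:", nonzero_dtype)
print("np.intp itemsize (bytes):", np.dtype(np.intp).itemsize)
```

If the first two lines print a 32-bit dtype while the indexing dtype is
64-bit, that is the platform mismatch described above.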
Generally, the only reason to stick to a certain type would be that
mixing types can be slower (using a non `intp` to index or doing math
with a mix of 32bit and 64bit integers).
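A small sketch of what "mixing types" looks like in practice, assuming
nothing beyond plain NumPy; the array names are only illustrative. It
shows the promotion that happens when 32-bit and 64-bit integers meet,
and that a non-`intp` index array is accepted but has to be converted
internally. It does not measure the speed difference.

```python
import numpy as np

a32 = np.arange(5, dtype=np.int32)
a64 = np.arange(5, dtype=np.int64)

# Mixing 32-bit and 64-bit integers promotes to 64-bit, which implies a
# cast of the 32-bit operand somewhere along the way.
mixed = a32 + a64
promoted = np.result_type(np.int32, np.int64)

# Indexing with a non-intp integer array works; NumPy converts the index
# to intp internally.
data = np.arange(10) * 10
idx = np.array([1, 3, 5], dtype=np.int16)
picked = data[idx]

print(mixed.dtype, promoted, picked)
```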
From a library perspective, I wonder how often you actually expect a
"default integer" input, as opposed to 32bit or 64bit depending on the
whims of the user; or `intp` because it is "indexing related".
It would be interesting to see if we can change the default at some
point. It might also be tricky: There may be quite a bit of code
expecting `long` (e.g. Cython extensions or `scipy.special` may or may
not notice such a change).
Cheers,
Sebastian
[1] intp is technically intptr_t in C, while indexing only requires an
ssize_t I think. That probably matters on no currently supported
systems, but systems where it matters do exist (OpenVMS is one that just
came up, and we may support it in the future).
>
> Juan.