[Numpy-discussion] Type resolver related deprecations and changes
Sebastian Berg
sebastian at sipsolutions.net
Fri Apr 2 16:46:33 EDT 2021
Hi all,
I have to do some changes to the type resolution, and I started these
here: https://github.com/numpy/numpy/pull/18718
There are four changes:
* Deprecate `signature="l"` and `signature=("l",)`. These are confusing
since the signature should include all inputs and outputs. To only
provide the output use `dtype="l"`.
* Using `dtype=` for comparisons (e.g. `np.equal`) used to be weird:
np.equal(1, 2, dtype=object) -> returns boolean
np.equal(None, 2, dtype=object) -> returns object array
The first one will now give a FutureWarning. Comparisons that provide
a dtype other than object or bool give a DeprecationWarning (or
fail).
I hope the warning can be preserved when more refactoring happens.
* NumPy *almost* always ignores any metadata, byte-order, or time unit
information from the `dtype` or `signature` arguments to ufuncs.
Practically, the dtypes passed actually denote the DType types
rather than the specific instance (which could be byte swapped).
NumPy will now do this always and give a warning if byte-order or
time unit is ignored!
* It is THEORETICALLY possible to call `ufunc->type_resolver` in the C
API (as opposed to providing it, which is somewhat OK).
If someone does that they have to normalize the type tuple now. I
don't really see a reason for keeping support, when NumPy will stop
calling it itself almost always and anyone using it would probably be
in trouble soon.
To be clear: I have NOT found a single instance of such code in a
code search. Even *providing* it – which is much more reasonable –
is probably only done by astropy/pyerfa.
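To illustrate the third point, here is a minimal sketch (not code from the PR) of the difference between a dtype instance and its DType class. It only uses `np.dtype` and Python's `type()`; the variable names are made up for illustration.

```python
import numpy as np

# Two dtype instances that differ only in byte order.
little = np.dtype("<i4")
big = np.dtype(">i4")

# The instances compare as different ...
print(little == big)              # False
# ... but they share one DType class, which is what `dtype=` effectively
# denotes according to the description above.
print(type(little) is type(big))  # True

# The same holds for time units: "s" and "ns" are instance-level details.
seconds = np.dtype("m8[s]")
nanoseconds = np.dtype("m8[ns]")
print(seconds == nanoseconds)              # False
print(type(seconds) is type(nanoseconds))  # True
```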
** Long example for the "time unit" dropping change **
The third point has, in theory, the largest impact. Neither pandas
nor astropy notices it (I also grepped scipy, it's clean).
These are the biggest changes:
# The following will now warn on most systems (unchanged result):
np.add(3, 5, dtype=">i4")
# The biggest impact is for timedelta or datetimes:
arr = np.arange(10, dtype="m8[s]")
# These examples always ignored the time unit "ns" (using the
# unit of `arr`). They now issue a warning:
np.add(arr, arr, dtype="m8[ns]")
np.maximum.reduce(arr, dtype="m8[ns]")
# The following issue a warning but previously did return
# a "ns" result.
np.add(3, 5, dtype="m8[ns]") # Now returns generic time units
np.maximum(arr, arr, dtype="m8[ns]") # Now returns "s" (from `arr`)
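If a specific unit is needed in the result, one way to avoid relying on `dtype=` is to cast explicitly. This is only a sketch of a possible workaround, not something the PR prescribes; `arr_ns`, `res` and `res2` are illustrative names.

```python
import numpy as np

arr = np.arange(10, dtype="m8[s]")

# Cast the inputs first, so the unit comes from the operands themselves.
arr_ns = arr.astype("m8[ns]")
res = np.add(arr_ns, arr_ns)
print(res.dtype)  # timedelta64[ns]

# Or compute in the unit of `arr` and convert the result afterwards.
res2 = np.maximum(arr, arr).astype("m8[ns]")
print(res2.dtype)  # timedelta64[ns]
```

Both spellings behave the same regardless of how `dtype=` is interpreted, since no time unit is passed to the ufunc at all.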
I doubt there is a good way to just keep the old behaviour. It is
hopelessly inconsistent. (The result even depends on how you pass
things, as the paths I deprecate align with `dtype=` but not with the
`signature=(None, None, dtype)` equivalent call.)
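For reference, a small sketch of the spellings mentioned above, using `np.add` with C long ("l"). It assumes a NumPy version that already includes these changes, where `None` entries in the signature tuple leave a slot unspecified; the variable names are illustrative.

```python
import numpy as np

# Full signature: all inputs and outputs are spelled out.
full_str = np.add(1, 2, signature="ll->l")
full_tuple = np.add(1, 2, signature=("l", "l", "l"))

# Only the output is given; the inputs are left to type resolution.
out_only = np.add(1, 2, dtype="l")
out_only_sig = np.add(1, 2, signature=(None, None, "l"))

for result in (full_str, full_tuple, out_only, out_only_sig):
    print(result, result.dtype == np.dtype("l"))
```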
One thought I had is just raising a hard error instead of a UserWarning
right now.
If you ask why not have `dtype=...` always be honored as the
correct output dtype, I don't disagree. But it seems to me we would
probably have to wade through a FutureWarning first, so a warning is the
"right direction". (Or just add an `output_dtypes` keyword argument.)
Cheers,
Sebastian