Re: [Numpy-discussion] Value based promotion and user DTypes

Jan. 27, 2021


On Wed, Jan 27, 2021 at 5:44 PM Sebastian Berg <sebastian@sipsolutions.net> wrote:
...
On Wed, 2021-01-27 at 10:33 +0100, Ralf Gommers wrote:
...
On Tue, Jan 26, 2021 at 10:21 PM Sebastian Berg <sebastian@sipsolutions.net> wrote:
<snip>
Thanks for all the other comments, they are helpful. I am considering
writing a (hopefully short) NEP, to define the direction of thinking
here (and clarify what user DTypes can expect).  I don't like doing
that, but the issue turns out to have a lot of traps and confusing
points. (Our current logic alone is confusing enough...)
Sounds good, thanks.
...
The other tricky example I had was:
The following becomes problematic (order does not matter):
          uint24 +      int16  +           uint32  -> int64
     <==      (uint24 + int16) + (uint24 + uint32) -> int64
     <==                int32  +           uint32  -> int64
With the addition that `uint24 + int32 -> int48` is defined, the first
could be expected to return `int48`, but actually getting there is
tricky (and my current code will not).
If the promotion result of a user DType with a builtin one can itself
be a builtin one, then "amending" the promotion with things like
`uint24 + int32 -> int48` can lead to slightly surprising promotion
results.
This happens if the result of a promotion with another "category"
(builtin) can be either a larger category or a lower one.
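
A minimal sketch of the example above, assuming a plain lookup table for pairwise promotion. The strings are labels only (`uint24` and `int48` are not real NumPy dtypes), the `uint24 + int64 -> int64` entry is an assumption added to make the table complete, and none of this is NumPy's actual promotion code. It reduces the three dtypes pairwise in every order and records which rules were consulted:

```python
from functools import reduce
from itertools import permutations

# Hypothetical pairwise promotion table, keyed by unordered pairs.
PROMOTE = {
    frozenset({"uint24", "int16"}): "int32",    # from the example above
    frozenset({"uint24", "uint32"}): "uint32",  # from the example above
    frozenset({"int32", "uint32"}): "int64",    # builtin rule
    frozenset({"int16", "uint32"}): "int64",    # builtin rule
    frozenset({"uint24", "int64"}): "int64",    # assumed for this sketch
    frozenset({"uint24", "int32"}): "int48",    # the "amended" rule
}

used = set()  # rules that were actually looked up


def promote(a, b):
    """Binary promotion: identical dtypes promote to themselves."""
    if a == b:
        return a
    key = frozenset({a, b})
    used.add(key)
    return PROMOTE[key]


dtypes = ["uint24", "int16", "uint32"]
results = {reduce(promote, order) for order in permutations(dtypes)}

print(results)                                 # {'int64'}
print(frozenset({"uint24", "int32"}) in used)  # False
```

Every ordering ends at `int64`, and the `uint24 + int32 -> int48` rule is never reached, because by the time an `int32` appears the `uint24` has already been consumed.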
I'm not sure I follow this. If uint24 and int48 both come from the
same third-party package, is there still a problem here?
Yes, at least unless you ask `uint24` to take over all of the work
(i.e. pass in all DTypes at once).
So with a binary operator design it is "problematic" (in the sense that
you have to live with the above result). Of course, a binary operator
base probably does not preclude a more complex design.
I like a binary operator (it seems much easier to reason about and is a
common design pattern).  But it would be plausible to have an n-ary
design where you pass all dtypes to each and ask them to handle it
(similar to `__array_ufunc__`).
We could even have both (the binary version for most things, but the
ability to hook into the n-ary "reduction").
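
A rough sketch of how the two designs could coexist, under a toy model where a dtype is just a signedness flag and a bit width. All names here (`binary_promote`, `nary_promote`, `result_type`) are hypothetical and not NumPy API, and the toy model derives promotions from value ranges rather than reproducing the exact rules quoted earlier in the thread. The n-ary hook sees all operands at once and may decline by returning `NotImplemented`, in which case the pairwise reduction is used:

```python
from functools import reduce

# Toy model: label -> (is_signed, bits). "uint24" and "int48" stand in for
# third-party DTypes.
BUILTIN = {"int16": (True, 16), "int32": (True, 32),
           "uint32": (False, 32), "int64": (True, 64)}
USER = {"uint24": (False, 24), "int48": (True, 48)}
KNOWN = {**BUILTIN, **USER}


def holds(target, source):
    """True if every value of `source` is representable in `target`."""
    (t_signed, t_bits), (s_signed, s_bits) = KNOWN[target], KNOWN[source]
    if t_signed == s_signed:
        return t_bits >= s_bits
    return t_signed and t_bits > s_bits  # signed needs one extra bit


def smallest(names, candidates):
    """Smallest candidate (by bit width) that holds every dtype in `names`."""
    fits = [c for c in candidates if all(holds(c, n) for n in names)]
    return min(fits, key=lambda c: KNOWN[c][1])


def binary_promote(a, b):
    """Pairwise rule: user DTypes are candidates only if an operand is one."""
    candidates = KNOWN if (a in USER or b in USER) else BUILTIN
    return smallest([a, b], candidates)


def nary_promote(names):
    """n-ary hook: declines unless a user DType is among the operands."""
    if not any(n in USER for n in names):
        return NotImplemented
    return smallest(names, KNOWN)


def result_type(*names):
    """Try the n-ary hook first, then fall back to pairwise reduction."""
    out = nary_promote(names)
    return reduce(binary_promote, names) if out is NotImplemented else out


print(reduce(binary_promote, ["uint24", "int16", "uint32"]))  # int64
print(result_type("uint24", "int16", "uint32"))               # int48
print(result_type("int16", "uint32"))                         # int64
```

In this sketch the pairwise path loses track of `uint24` after the first step, so the builtin-only second step cannot pick `int48`; the n-ary hook avoids that at the cost of a more complex protocol.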
I'd say just document it and recommend that if more than one custom dtype
is used, then the user should (if they really care about the issue you
bring up) determine the output dtype they want via some use of result_type
and then explicitly cast.
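
As an illustration of that recommendation, here is a small sketch using `np.result_type` and `astype`. Builtin dtypes stand in for the custom ones, since `uint24` and `int48` do not exist in NumPy; with real user DTypes the chosen output dtype could equally be picked by hand rather than computed:

```python
import numpy as np

a = np.array([1, 2, 3], dtype=np.int16)
b = np.array([4, 5, 6], dtype=np.uint32)

# Decide the output dtype up front instead of relying on implicit promotion.
out_dtype = np.result_type(a.dtype, b.dtype)

# Cast explicitly, so the operation itself no longer needs to promote.
total = a.astype(out_dtype) + b.astype(out_dtype)

print(out_dtype, total.dtype)
```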

Cheers,
Ralf