
On Wed, 2021-01-27 at 18:16 +0100, Ralf Gommers wrote:
On Wed, Jan 27, 2021 at 5:44 PM Sebastian Berg <sebastian@sipsolutions.net> wrote:
On Wed, 2021-01-27 at 10:33 +0100, Ralf Gommers wrote:
On Tue, Jan 26, 2021 at 10:21 PM Sebastian Berg <sebastian@sipsolutions.net> wrote:
<snip>
Thanks for all the other comments, they are helpful. I am considering writing a (hopefully short) NEP, to define the direction of thinking here (and clarify what user DTypes can expect). I don't like doing that, but the issue turns out to have a lot of traps and confusing points. (Our current logic alone is confusing enough...)
Sounds good, thanks.
The other tricky example I have was:
The following becomes problematic (order does not matter):

    uint24 + int16 + uint32 -> int64
    <== (uint24 + int16) + uint32 -> int64
    <== int32 + uint32 -> int64
With the additional rule `uint24 + int32 -> int48` defined, the first expression could be expected to return `int48`, but actually getting there is tricky (and my current code will not).
If the promotion result of a user DType with a builtin one can itself be a builtin one, then "amending" the promotion table with rules like `uint24 + int32 -> int48` can lead to slightly surprising promotion results. This happens when the result of promotion with another "category" (here the builtins) can be either a larger or a lower category.
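A toy model of the chain above may make it concrete (all names here, including `uint24` and `int48`, are hypothetical stand-ins, not real NumPy dtypes; the table simply encodes the rules stated in this thread):

```python
# Toy model: a user package "amends" the builtin promotion rules only for
# pairs involving its own dtypes (uint24, int48); the builtin pairs know
# nothing about int48 and promote among themselves as NumPy does today.
from functools import reduce
from itertools import permutations

TABLE = {
    frozenset({"uint24", "int16"}): "int32",
    frozenset({"uint24", "int32"}): "int48",   # the amended rule
    frozenset({"uint24", "uint32"}): "uint32",
    frozenset({"uint24", "int64"}): "int64",
    frozenset({"uint24", "int48"}): "int48",
    frozenset({"int16", "int32"}): "int32",
    frozenset({"int16", "uint32"}): "int64",   # builtin rule: int48 unknown
    frozenset({"int16", "int64"}): "int64",
    frozenset({"int16", "int48"}): "int48",
    frozenset({"int32", "uint32"}): "int64",   # builtin rule: int48 unknown
    frozenset({"int32", "int64"}): "int64",
    frozenset({"int32", "int48"}): "int48",
    frozenset({"uint32", "int64"}): "int64",
    frozenset({"uint32", "int48"}): "int48",
    frozenset({"int48", "int64"}): "int64",
}

def promote(a, b):
    """Binary promotion via the (symmetric) table."""
    return a if a == b else TABLE[frozenset({a, b})]

# Pairwise (binary) reduction gives int64 for every operand order...
for order in permutations(["uint24", "int16", "uint32"]):
    assert reduce(promote, order) == "int64"

# ...even though int48 can hold all three inputs, so an n-ary scheme
# that sees every dtype at once could have returned int48 instead.
```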
I'm not sure I follow this. If uint24 and int48 both come from the same third-party package, there is still a problem here?
Yes, at least unless you ask `uint24` to take over all of the work (i.e. pass in all DTypes at once). So with a binary-operator design it is "problematic" (in the sense that you have to live with the above result). Of course, a binary-operator base probably does not preclude a more complex design. I like a binary operator (it seems much easier to reason about and is a common design pattern). But it would be plausible to have an n-ary design where you pass all DTypes to each and ask them to handle it (similar to `__array_ufunc__`). We could even have both (the binary version for most things, plus the ability to hook into the n-ary "reduction").
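The binary-operator design might look roughly like the following sketch, which mirrors Python's binary-operator protocol with a `NotImplemented` fallback (the method name `__common_dtype__` and the classes here are illustrative, not a fixed API):

```python
# Sketch of a binary common-dtype protocol: ask the first operand, and
# if it defers (NotImplemented), ask the second, like Python's __add__/__radd__.
class DType:
    def __common_dtype__(self, other):
        return NotImplemented  # defer to the other operand by default

class Int32(DType):
    pass

class Int16(DType):
    pass

class UInt24(DType):
    # A user dtype that knows how to promote with a builtin category.
    def __common_dtype__(self, other):
        if isinstance(other, Int16):
            return Int32()  # uint24 + int16 -> int32
        return NotImplemented

def common_dtype(a, b):
    res = a.__common_dtype__(b)
    if res is NotImplemented:
        res = b.__common_dtype__(a)
    if res is NotImplemented:
        raise TypeError(f"no common dtype for {a!r} and {b!r}")
    return res

# Works regardless of which side the user dtype is on:
assert isinstance(common_dtype(Int16(), UInt24()), Int32)
assert isinstance(common_dtype(UInt24(), Int16()), Int32)
```

An n-ary hook would instead hand the full tuple of DTypes to one implementation (as `__array_ufunc__` does for operands), at the cost of each dtype having to reason about arbitrary mixtures.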
I'd say just document it and recommend that if >1 custom dtypes are used, then users who really care about the issue you bring up should determine the output dtype they want via some use of `result_type` and then cast explicitly.
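With builtin dtypes standing in for the custom ones (no real `uint24` exists to demonstrate with), that workaround might look like:

```python
import numpy as np

a = np.array([1, 2], dtype=np.int16)
b = np.array([3, 4], dtype=np.uint32)

# result_type is n-ary: it sees all inputs at once, so the outcome does
# not depend on the order in which a binary reduction would pair them up.
dt = np.result_type(a, b)          # int16 + uint32 -> int64
res = a.astype(dt) + b.astype(dt)  # cast explicitly, then operate
assert res.dtype == np.int64
```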
Right, this is a problem that keeps giving... Maybe it is a sign of how tricky units are, but similar things will also apply to other "families" of dtypes.

If you have units (which can be based on any other NumPy numerical type), you can break my scheme for working around the associativity issue in the same way: `Unit[int16] + uint16 + float16` has no clear hierarchy (Unit is the highest category, but `float16` dictates the precision).

So probably we just shouldn't care too much about this (for now), but if we want the above to return `Unit[float16]`, we need additional logic (beyond a binary operation) to do this reasonably...

I agree that these are all "insignificant" issues in many ways, since most users will never even notice the subtleties. So in some ways my meandering towards binary-op only is because it feels at least small enough in complexity that it hopefully doesn't make solutions for the above much more complicated.

Cheers,

Sebastian
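The two-level Unit promotion described above can be sketched as follows (the `Unit` wrapper and its rule are hypothetical; only `np.promote_types` is real NumPy API):

```python
# Hypothetical two-level rule: the Unit "category" always wins, while the
# numeric precision inside it follows NumPy's builtin binary rules.
from dataclasses import dataclass
import numpy as np

@dataclass(frozen=True)
class Unit:
    base: np.dtype  # the numerical dtype the unit is based on

def promote(a, b):
    if isinstance(a, Unit) and isinstance(b, Unit):
        return Unit(np.promote_types(a.base, b.base))
    if isinstance(a, Unit):
        return Unit(np.promote_types(a.base, b))
    if isinstance(b, Unit):
        return Unit(np.promote_types(a, b.base))
    return np.promote_types(a, b)

# Pairwise reduction: Unit[int16] + uint16 -> Unit[int32],
# then Unit[int32] + float16 -> Unit[float64].
step1 = promote(Unit(np.dtype(np.int16)), np.dtype(np.uint16))
step2 = promote(step1, np.dtype(np.float16))
assert step2 == Unit(np.dtype(np.float64))
# Returning Unit[float16] instead would need logic that sees all three
# dtypes at once (an n-ary hook), not just a chain of binary steps.
```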
Cheers,
Ralf

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion