
On Wed, Jan 27, 2021 at 5:44 PM Sebastian Berg <sebastian@sipsolutions.net> wrote:
On Wed, 2021-01-27 at 10:33 +0100, Ralf Gommers wrote:
On Tue, Jan 26, 2021 at 10:21 PM Sebastian Berg <sebastian@sipsolutions.net> wrote:
<snip>
Thanks for all the other comments, they are helpful. I am considering writing a (hopefully short) NEP, to define the direction of thinking here (and clarify what user DTypes can expect). I don't like doing that, but the issue turns out to have a lot of traps and confusing points. (Our current logic alone is confusing enough...)
Sounds good, thanks.
The other tricky example I had was:
The following becomes problematic (order does not matter):

    uint24 + int16 + uint32 -> int64
    <== (uint24 + int16) + (uint24 + uint32) -> int64
    <== int32 + uint32 -> int64
With the addition that `uint24 + int32 -> int48` is defined, the first could be expected to return `int48`, but actually getting there is tricky (and my current code will not).
If the promotion of a user DType with a builtin one can itself result in a builtin one, then "amending" the promotion with things like `uint24 + int32 -> int48` can lead to slightly surprising promotion results. This happens if the result of a promotion with another "category" (builtin) can be either a larger category or a lower one.
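To make the example concrete, here is a toy sketch in plain Python. It is not NumPy's promotion machinery: the `PAIRWISE` table and the `promote`/`promote_many` helpers are hypothetical, the dtypes are just strings, and the `uint24 + int64 -> int64` entry is an assumption added so that every ordering can be reduced. It only shows that a purely pairwise reduction lands on `int64` in every order, even though the `uint24 + int32 -> int48` rule is registered.

```python
from functools import reduce
from itertools import permutations

# Hypothetical pairwise promotion table, keyed by unordered pairs of names.
# uint24 and int48 stand in for user DTypes; the rest mimic builtin ones.
PAIRWISE = {
    frozenset({"uint24", "int16"}): "int32",
    frozenset({"uint24", "uint32"}): "uint32",
    frozenset({"uint24", "int32"}): "int48",  # the "amended" rule
    frozenset({"uint24", "int64"}): "int64",  # assumed, not from the thread
    frozenset({"int16", "uint32"}): "int64",
    frozenset({"int32", "uint32"}): "int64",
}


def promote(a, b):
    """Promote two dtype names using only the pairwise table."""
    if a == b:
        return a
    return PAIRWISE[frozenset({a, b})]


def promote_many(dtypes):
    """Reduce a sequence of dtype names pairwise, left to right."""
    return reduce(promote, dtypes)


# Try every ordering of the three inputs and collect the distinct results.
results = {
    promote_many(order)
    for order in permutations(["uint24", "int16", "uint32"])
}
print(results)
```

In this toy model no ordering ever produces the pair `uint24 + int32`, so the `int48` rule is never consulted.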
I'm not sure I follow this. If uint24 and int48 both come from the same third-party package, is there still a problem here?
Yes, at least unless you ask `uint24` to take over all of the work (i.e. pass in all DTypes at once). So with a binary operator design it is "problematic" (in the sense that you have to live with the above result). Of course, building on a binary operator probably does not preclude a more complex design. I like a binary operator (it seems much easier to reason about and is a common design pattern). But it would be plausible to have an n-ary design where you pass all dtypes to each one and ask it to handle them (similar to `__array_ufunc__`). We could even have both (the binary version for most things, but the ability to hook into the n-ary "reduction").
I'd say just document it and recommend that if more than one custom dtype is used, then the user should (if they really care about the issue you bring up) determine the output dtype they want via some use of `result_type` and then explicitly cast.

Cheers,
Ralf