[Numpy-discussion] Value based promotion and user DTypes

Sebastian Berg sebastian at sipsolutions.net
Wed Jan 27 13:18:09 EST 2021


On Wed, 2021-01-27 at 18:16 +0100, Ralf Gommers wrote:
> On Wed, Jan 27, 2021 at 5:44 PM Sebastian Berg <
> sebastian at sipsolutions.net>
> wrote:
> 
> > On Wed, 2021-01-27 at 10:33 +0100, Ralf Gommers wrote:
> > > On Tue, Jan 26, 2021 at 10:21 PM Sebastian Berg <
> > > sebastian at sipsolutions.net>
> > > wrote:
> > > 
> > <snip>
> > 
> > Thanks for all the other comments, they are helpful. I am considering
> > writing a (hopefully short) NEP, to define the direction of thinking
> > here (and clarify what user DTypes can expect).  I don't like doing
> > that, but the issue turns out to have a lot of traps and confusing
> > points. (Our current logic alone is confusing enough...)
> > 
> 
> Sounds good, thanks.
> 
> 
> > > 
> > > > The other tricky example I have was:
> > > > 
> > > >   The following becomes problematic (order does not matter):
> > > >           uint24 +      int16  +           uint32  -> int64
> > > >      <==      (uint24 + int16) + (uint24 + uint32) -> int64
> > > >      <==                int32  +           uint32  -> int64
> > > > 
> > > > > With the addition that `uint24 + int32 -> int48` is defined, the
> > > > > first could be expected to return `int48`, but actually getting
> > > > > there is tricky (and my current code will not).
> > > > > 
> > > > > If the promotion result of a user DType with a builtin one can
> > > > > be a builtin one, then "amending" the promotion with things like
> > > > > `uint24 + int32 -> int48` can lead to slightly surprising
> > > > > promotion results.
> > > > > This happens if the result of a promotion with another
> > > > > "category" (builtin) can be either a larger category or a lower
> > > > > one.
> > > > 
> > > 
> > > > I'm not sure I follow this. If uint24 and int48 both come from the
> > > > same third-party package, is there still a problem here?
> > > 
> > 
> > Yes, at least unless you ask `uint24` to take over all of the work
> > (i.e. pass in all DTypes at once).
> > So with a binary operator design it is "problematic" (in the sense
> > that you have to live with the above result). Of course, basing things
> > on a binary operator probably does not preclude a more complex design.
> > I like a binary operator (it seems much easier to reason about and is
> > a common design pattern).  But it would be plausible to have an n-ary
> > design where you pass all dtypes to each and ask them to handle it
> > (similar to `__array_ufunc__`).
> > We could even have both (the binary version for most things, but the
> > ability to hook into the n-ary "reduction").
> > 
> 
> I'd say just document it and recommend that if >1 custom dtypes are
> used, then the user should (if they really care about the issue you
> bring up) determine the output dtype they want via some use of
> result_type and then explicitly cast.
> 

Right, this is a problem that keeps giving...  Maybe it mostly shows how
tricky Units are, but similar things will also apply to other
"families" of dtypes.
If you have Units (that can be based on any other NumPy numerical
type), you can break my scheme for working around the associativity
issue in the same way:

    Unit[int16] + uint16 + float16

has no clear hierarchy between them (Unit is the highest, but `float16`
dictates the precision).

So, probably we just shouldn't care too much about this (for now), but
if we want the above to return `Unit[float16]`, we must have additional
logic beyond a binary operation to do so reasonably...
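
To make the order dependence concrete, here is a rough sketch (not how
NumPy implements anything). It assumes a hypothetical binary rule
`Unit[x] + y -> Unit[promote(x, y)]`, models a dtype as an
`(is_unit, base)` tuple, and uses a small lookup table holding the
pairwise results NumPy's `promote_types` gives for the builtin types
involved. The names `BUILTIN`, `promote` and `fold` are made up for the
illustration:

```python
from itertools import permutations

# Pairwise results for the builtin types involved, matching what
# np.promote_types reports for these pairs.
BUILTIN = {
    frozenset(["int16", "uint16"]): "int32",
    frozenset(["int16", "float16"]): "float32",
    frozenset(["uint16", "float16"]): "float32",
    frozenset(["int32", "float16"]): "float64",
    frozenset(["int16", "float32"]): "float32",
    frozenset(["uint16", "float32"]): "float32",
}


def promote_builtin(a, b):
    """Symmetric pairwise promotion of two builtin type names."""
    return a if a == b else BUILTIN[frozenset([a, b])]


def promote(a, b):
    """Assumed binary rule: the result is a Unit if either side is one,
    and its base type is the pairwise promotion of the two bases."""
    (a_unit, a_base), (b_unit, b_base) = a, b
    return (a_unit or b_unit, promote_builtin(a_base, b_base))


def fold(dtypes):
    """Left-to-right reduction using only the binary rule."""
    result = dtypes[0]
    for dt in dtypes[1:]:
        result = promote(result, dt)
    return result


# Unit[int16] + uint16 + float16, tried in every operand order.
operands = [(True, "int16"), (False, "uint16"), (False, "float16")]
results = {fold(order) for order in permutations(operands)}

for is_unit, base in sorted(results):
    print(f"Unit[{base}]" if is_unit else base)
```

Under these assumptions the result depends on the order (both
`Unit[float32]` and `Unit[float64]` show up), and no order yields
`Unit[float16]`.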

I agree that these are all "insignificant" issues in many ways, since
most users will never even notice the subtleties. So in some ways my
leaning towards binary-op only comes from the feeling that it is at
least small enough in complexity that it hopefully doesn't make
solutions for the above much more complicated.
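
For the uint24 example quoted further up, a toy replay of a purely
binary reduction might look like the following. The table simply
encodes the pairwise rules from that example; the
`uint24 + int64 -> int64` entry and the helper names are my
assumptions for the sketch, and none of this is actual NumPy code:

```python
from itertools import permutations

# Pairwise promotion rules, looked up symmetrically.
RULES = {
    frozenset(["uint24", "int16"]): "int32",
    frozenset(["uint24", "uint32"]): "uint32",
    frozenset(["int32", "uint32"]): "int64",   # builtin rule
    frozenset(["int16", "uint32"]): "int64",   # builtin rule
    frozenset(["uint24", "int32"]): "int48",   # the "amended" rule
    frozenset(["uint24", "int64"]): "int64",   # assumption for this sketch
}

used = set()  # records which rules a reduction actually consulted


def promote(a, b):
    """Binary promotion via table lookup, remembering the rule used."""
    if a == b:
        return a
    key = frozenset([a, b])
    used.add(key)
    return RULES[key]


def fold(dtypes):
    """Left-to-right reduction using only the binary rule."""
    result = dtypes[0]
    for dt in dtypes[1:]:
        result = promote(result, dt)
    return result


results = {fold(order) for order in permutations(["uint24", "int16", "uint32"])}
print(results)
print(frozenset(["uint24", "int32"]) in used)
```

In this toy version every operand order ends at `int64`, and the
`uint24 + int32 -> int48` rule is never consulted, which is why
reaching `int48` would need something beyond the pairwise reduction.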

Cheers,

Sebastian



> Cheers,
> Ralf
