<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Jan 26, 2021 at 10:21 PM Sebastian Berg <<a href="mailto:sebastian@sipsolutions.net">sebastian@sipsolutions.net</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Tue, 2021-01-26 at 06:11 +0100, Ralf Gommers wrote:<br>
> On Tue, Jan 26, 2021 at 2:01 AM Sebastian Berg < <br>
> <a href="mailto:sebastian@sipsolutions.net" target="_blank">sebastian@sipsolutions.net</a>><br>
> wrote:<br>
> <br>
> > Hi all,<br>
> > <br>
> > does anyone have a thought about how user DTypes (i.e. DTypes not<br>
> > currently part of NumPy) should interact with the "value based<br>
> > promotion" logic we currently have?<br>
> > For now I could just do anything and we would find out later. But I<br>
> > will have to do something for now, basically with the hope that it<br>
> > all turns out all right.<br>
> > <br>
> > But there are multiple options for both what to offer to user<br>
> > DTypes and where we want to move (I am using `bfloat16` as a<br>
> > potential DType here).<br>
> > <br>
> > 1. The "weak" dtype option (this is what JAX does), where:<br>
> > <br>
> > np.array([1], dtype=bfloat16) + 4.<br>
> > <br>
> > returns a bfloat16, because 4. is "lower" than all floating<br>
> > point types.<br>
> > In this scheme the user-defined `bfloat16` knows that the input<br>
> > is a Python float, but it does not know its value (if an<br>
> > overflow occurs during conversion, it could warn or error but<br>
> > not upcast). For example `np.array([1], dtype=uint4) + 2**5`<br>
> > will try `uint4(2**5)`, assuming it works.<br>
> > NumPy is currently different: `2.**300` would ensure the result<br>
> > is a `float64`.<br>
> > <br>
> > If a DType does not make use of this, it would get the behaviour<br>
> > of option 2.<br>
> > <br>
> > 2. The "default" DType option: `np.array([1], dtype=bfloat16) + 4.`<br>
> > is always the same as `bfloat16 + float64 -> float64` (see the<br>
> > sketch after this list for the difference from option 1).<br>
> > <br>
> > 3. Use whatever NumPy considers the "smallest appropriate dtype".<br>
> > This will not always work correctly for unsigned integers, and<br>
> > for floats this would be float16, which doesn't help with<br>
> > bfloat16.<br>
> > <br>
> > 4. Try to expose the actual value. (I do not want to do this, but<br>
> > it is probably a plausible extension of most other options, since<br>
> > the other options can be the "default".)<br>
> > <br>
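> > For concreteness, a rough sketch of the "weak" rule with the<br>
> > option-2 fallback (plain Python; `float16` stands in for a user<br>
> > `bfloat16`, and the function is an illustration of the rule, not<br>
> > real resolution code):<br>
> > <br>
> >     import numpy as np<br>
> > <br>
> >     def promote_weak(arr_dtype, scalar):<br>
> >         # Option 1: a Python scalar only contributes its abstract<br>
> >         # kind ("Floating"/"Integer"), never its value.<br>
> >         if isinstance(scalar, float) and arr_dtype.kind == "f":<br>
> >             return arr_dtype<br>
> >         if isinstance(scalar, int) and arr_dtype.kind in "iu":<br>
> >             return arr_dtype<br>
> >         # Option 2 (fallback): treat the scalar as the default dtype.<br>
> >         return np.result_type(arr_dtype, np.array(scalar).dtype)<br>
> > <br>
> >     promote_weak(np.dtype(np.float16), 4.)   # float16 (option 1)<br>
> >     promote_weak(np.dtype(np.int8), 2**10)   # int8; int8(2**10) would<br>
> >                                              # overflow (warn/error, not upcast)<br>
> >     promote_weak(np.dtype(np.int8), 4.)      # float64 (option 2 fallback)<br>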
> > <br>
> > Within these options, there is one more difficulty. NumPy currently<br>
> > applies the same logic for:<br>
> > <br>
> > np.array([1], dtype=bfloat16) + np.array(4., dtype=np.float64)<br>
> > <br>
> > which in my opinion is wrong (the second array is typed). We do<br>
> > have the same issue with deciding what to do in the future for<br>
> > NumPy itself.<br>
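> > <br>
> > For concreteness, the current value-based logic (NumPy around<br>
> > 1.19/1.20, with `int8`/`float64` standing in for user DTypes) gives:<br>
> > <br>
> >     import numpy as np<br>
> > <br>
> >     a = np.array([1, 2], dtype=np.int8)<br>
> >     (a + 1).dtype                               # int8: 1 fits<br>
> >     (a + 300).dtype                             # int16: 300 does not fit<br>
> >     (a + np.array(4., dtype=np.float64)).dtype  # float16: even the typed<br>
> >                                                 # 0-D array is value-inspected<br>
> > <br>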
> > Right now I feel that new (user) DTypes should live in the future<br>
> > (whatever that future is).<br>
> > <br>
> <br>
> I agree. And I have a preference for option 1. Option 2 is too greedy<br>
> in upcasting, the value-based casting is problematic in multiple ways<br>
> (e.g., hard for Numba because the output dtype cannot be predicted<br>
> from the input dtypes), and it is hard to see a rationale for option 4<br>
> (maybe so the user dtype itself can implement option 3?).<br>
<br>
Yes, well, the "rationale" for option 4 is that you expose everything<br>
that NumPy currently needs (assuming we make no changes). That would be<br>
the only way that allows a `bfloat16` to work exactly like a<br>
`float16` as currently defined in NumPy.<br>
<br>
To be clear: it horrifies me, but defining a "better" way is much<br>
easier than trying to keep everything as it is (at least for now) while<br>
also thinking about what it should look like in the future (and making<br>
sure that user DTypes are ready for that future).<br>
<br>
My guess is that we can agree on aiming for option 1 and trying to limit it<br>
to Python operators. Unfortunately, only time will tell how feasible<br>
that will actually be.<br></blockquote><div><br></div><div>That sounds good.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
> > I have said previously that we could distinguish this for universal<br>
> > functions. But calls like `np.asarray(4.)` are common, and they<br>
> > would lose the information that `4.` was originally a Python float.<br>
> > <br>
> <br>
> Hopefully the future will have way fewer asarray calls in it.<br>
> Rejecting scalar input to functions would be nice. This is what<br>
> most other array/tensor libraries do.<br>
> <br>
<br>
Well, right now NumPy has scalars (both ours and Python's), and I would<br>
expect that changing that may well be more disruptive than changing the<br>
value-based promotion (assuming we can add good FutureWarnings).<br>
<br>
I would probably need a bit of convincing that forbidding `np.add(array,<br>
2)` is worth the trouble, but luckily that is probably an orthogonal<br>
question. (The fact that we even accept 0-D arrays as "value based" is<br>
probably the biggest difficulty.)</blockquote><div><br></div><div>Indeed, it probably isn't worth the trouble. And yes, the "0-D arrays are special" issue is the more important one.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
> <br>
> > <br>
> > So, recently, I was considering that a better option may be to<br>
> > limit this to the Python math operators: +, -, /, **, ...<br>
> > <br>
> <br>
> +1<br>
> <br>
> This discussion may be relevant:<br>
> <a href="https://github.com/data-apis/array-api/issues/14" rel="noreferrer" target="_blank">https://github.com/data-apis/array-api/issues/14</a>.<br>
> <br>
<br>
I have browsed through it; I guess you were also thinking of limiting<br>
scalars to operators (although possibly even more broadly, rather than<br>
just for promotion purposes).</blockquote><div><br></div><div>Indeed. `x + 1` must work, that's extremely common. `np.somefunc(x, 1)` is not common, and there's little downside (and lots of upside) in not supporting it if you were designing a new numpy-like library.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> I am not sure I understand this:<br>
<br>
Non-array ("scalar") operands are not permitted to participate in<br>
type promotion.<br>
<br>
But they do participate, both in JAX and in what I wrote here; they<br>
just participate in an abstract way, i.e. as `Floating` or `Integer`,<br>
but not as a specific float or integer.<br></blockquote><div><br></div><div>You're right, that sentence could use a tweak. I think the intent was to say that this should not be done in a multi-step way like:</div><div>- cast the scalar to an array with some dtype (e.g. a Python float becomes a numpy float64)</div><div>- then apply the `array <op> array` casting rules to that resulting dtype.<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
> > Those are the places where it may make a difference to write:<br>
> > <br>
> > arr + 4. vs. arr + bfloat16(4.)<br>
> > int8_arr + 1 vs. int8_arr + np.int8(1)<br>
> > arr += 4. (in-place may be the most significant use-case)<br>
> > <br>
> > while:<br>
> > <br>
> > np.add(int8_arr, 1) vs. np.add(int8_arr, np.int8(1))<br>
> > <br>
> > is maybe less significant. On the other hand, it would add a subtle<br>
> > difference between operators vs. direct ufunc calls...<br>
> > <br>
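> > To make that subtle difference concrete (hypothetical future<br>
> > behaviour under option 1 limited to operators, not current NumPy):<br>
> > <br>
> >     int8_arr + 1         # int8: operator, so the 1 stays "weak"<br>
> >     np.add(int8_arr, 1)  # int64: ufunc, so the 1 becomes the default int<br>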
> > <br>
> > In general, it may not matter: we can choose option 1 (which the<br>
> > bfloat16 does not have to use), and modify it if we ever change the<br>
> > logic in NumPy itself. Basically, I will probably pick option 1 for<br>
> > now and press on, and we can reconsider later. And hope that it does<br>
> > not make things even more complicated than they already are.<br>
> > <br>
> > Or maybe it is better to just always use the default for user<br>
> > DTypes?<br>
> > <br>
> <br>
> I'm not sure I understand why you like option 1 but want to give<br>
> user-defined dtypes the choice of opting out of it. Upcasting will<br>
> rarely make sense for user-defined dtypes anyway.<br>
> <br>
<br>
I never meant this as an opt-out; the question is what to do if the<br>
user DType does not opt in/define the operation.<br>
<br>
Basically, we would promote with `Floating` here (or `PyFloating`,<br>
but there should be no difference; for now I will do PyFloating, but it<br>
should probably be changed later). I was hinting at providing a default<br>
fallback, so that if:<br>
<br>
    UserDType + Floating -> Undefined/Error<br>
<br>
we automatically try the "default", e.g.:<br>
<br>
UserDType + Float64 -> Something<br>
<br>
That would mean users don't have to worry about `Floating` itself.<br>
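<br>
To sketch the fallback (all names hypothetical, not the actual new<br>
DType API; promotion is modelled here as a plain lookup table):<br>
<br>
    def promote_with_python_float(user_dtype, rules):<br>
        res = rules.get((user_dtype, "Floating"))     # abstract "weak" rule<br>
        if res is None:<br>
            res = rules.get((user_dtype, "float64"))  # default fallback<br>
        if res is None:<br>
            raise TypeError(<br>
                f"no common DType for {user_dtype} and Python float")<br>
        return res<br>
<br>
    # A bfloat16 author can opt in to the "weak" behaviour...<br>
    weak = {("bfloat16", "Floating"): "bfloat16"}<br>
    # ...or define nothing abstract and get the default (option 2):<br>
    default = {("bfloat16", "float64"): "float64"}<br>
<br>
    promote_with_python_float("bfloat16", weak)     # -> 'bfloat16'<br>
    promote_with_python_float("bfloat16", default)  # -> 'float64'<br>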
<br>
But I am not opinionated here; a user DType author should be able to<br>
quickly deal with either issue (that Float64 is undesired, or that the<br>
error is undesired if no "default" exists). Maybe the error is the more<br>
conservative/constructive choice though.<br></blockquote><div><br></div><div>I'd start with the error, and reconsider only if there's a practical problem with it. Going from error to fallback later is much easier than the other way around.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <br>
> > But I would be interested in whether the "limit to Python operators"<br>
> > approach is something we should aim for here. This does make a small<br>
> > difference, because user DTypes could "live" in the future if we<br>
> > have an idea of what that future may look like.<br>
> > <br>
> <br>
> A future with:<br>
> - no array scalars<br>
> - 0-D arrays have the same casting rules as >=1-D arrays<br>
> - no value-based casting<br>
> would be quite nice. For "same kind" casting like<br>
> <br>
<br>
I don't think array scalars really matter here, since they are typed<br>
and behave identically to 0-D arrays anyway. We can have long opinion<br>
pieces on whether they should exist :).</blockquote><div><br></div><div>Let's not do that :) My summary would be: Travis regrets adding them, all other numpy-like libraries I know of decided not to have them, and that all worked out fine. I don't want to think about touching them in NumPy now.<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
> <a href="https://data-apis.github.io/array-api/latest/API_specification/type_promotion.html" rel="noreferrer" target="_blank">https://data-apis.github.io/array-api/latest/API_specification/type_promotion.html</a><br>
> .<br>
> Mixed-kind casting isn't specified there, because it's too different<br>
> between libraries. The JAX design (<br>
> <a href="https://jax.readthedocs.io/en/latest/type_promotion.html" rel="noreferrer" target="_blank">https://jax.readthedocs.io/en/latest/type_promotion.html</a>) seems<br>
> sensible<br>
> there.<br>
<br>
The JAX design is the "weak DType" design (when it comes to Python<br>
numbers). Although the fact that a "weak" `complex` is sorted above<br>
all floats means that `bfloat16_arr + 1j` will go to the default<br>
complex dtype as well.<br>
But yes, I like the "weak" approach; I just think JAX also has some<br>
wrinkles to smooth out.<br>
<br>
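For example (current JAX in its default 32-bit mode; this is my reading<br>
of its promotion table, worth double-checking):<br>
<br>
    import jax.numpy as jnp<br>
<br>
    x = jnp.ones(3, dtype=jnp.bfloat16)<br>
    (x + 1j).dtype   # complex64: the "weak" complex pulls in the<br>
                     # default complex dtype, not a bfloat16-width one<br>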
<br>
There is a good deal more to this once you have user DTypes, if I add<br>
one more important constraint, namely that:<br>
<br>
from my_extension_module import uint24<br>
<br>
must not change any existing code that does not explicitly use<br>
`uint24`.<br>
<br>
Then my current approach guarantees:<br>
<br>
np.result_type(uint24, int48, int64) -> Error<br>
<br>
If `uint24` and `int48` do not know about each other (`int64` is<br>
obviously right here, but it is tricky to be quite certain of that).</blockquote><div><br></div><div>That makes sense. I'd expect that to be extremely rare anyway. User-defined dtypes need to interact with Python types and NumPy dtypes; anything unknown should indeed just error.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
The other tricky example I had was:<br>
<br>
The following becomes problematic (order does not matter):<br>
uint24 + int16 + uint32 -> int64<br>
<== (uint24 + int16) + (uint24 + uint32) -> int64<br>
<== int32 + uint32 -> int64<br>
<br>
With the addition that `uint24 + int32 -> int48` is defined, the first<br>
could be expected to return `int48`, but actually getting there is<br>
tricky (and my current code will not get there).<br>
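<br>
A self-contained way to see the problem (hypothetical dtypes, with<br>
promotion written out as an explicit pair table):<br>
<br>
    from functools import reduce<br>
    from itertools import permutations<br>
<br>
    PAIRS = {<br>
        frozenset({"uint24", "int16"}): "int32",<br>
        frozenset({"uint24", "int32"}): "int48",  # the amended rule<br>
        frozenset({"uint24", "uint32"}): "uint32",<br>
        frozenset({"uint24", "int64"}): "int64",<br>
        frozenset({"int16", "uint32"}): "int64",<br>
        frozenset({"int16", "int64"}): "int64",<br>
        frozenset({"int32", "uint32"}): "int64",<br>
        frozenset({"uint32", "int64"}): "int64",<br>
    }<br>
<br>
    def promote(a, b):<br>
        return a if a == b else PAIRS[frozenset({a, b})]<br>
<br>
    # Every pairwise reduction order ends at int64, even though int48<br>
    # could hold all three inputs:<br>
    {reduce(promote, p)<br>
     for p in permutations(["uint24", "int16", "uint32"])}  # {'int64'}<br>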
<br>
If the promotion result of a user DType with a builtin one can itself<br>
be a builtin one, then "amending" the promotion with rules like `uint24<br>
+ int32 -> int48` can lead to slightly surprising promotion results.<br>
This happens if the result of a promotion with another "category"<br>
(builtin) can be either a larger category or a lower one.</blockquote><div><br></div><div>I'm not sure I follow this. If uint24 and int48 both come from the same third-party package, is there still a problem here?</div><div><br></div><div>Cheers,</div><div>Ralf</div><div><br></div></div></div>