<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Jan 26, 2021 at 10:21 PM Sebastian Berg <<a href="mailto:sebastian@sipsolutions.net">sebastian@sipsolutions.net</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Tue, 2021-01-26 at 06:11 +0100, Ralf Gommers wrote:<br>
> On Tue, Jan 26, 2021 at 2:01 AM Sebastian Berg < <br>
> <a href="mailto:sebastian@sipsolutions.net" target="_blank">sebastian@sipsolutions.net</a>><br>
> wrote:<br>
> <br>
> > Hi all,<br>
> > <br>
> > does anyone have a thought about how user DTypes (i.e. DTypes not<br>
> > currently part of NumPy) should interact with the "value based<br>
> > promotion" logic we currently have?<br>
> > For now I could just do anything and we would find out later. But I<br>
> > will have to do something for now, basically with the hope that it<br>
> > all turns out all right.<br>
> > <br>
> > But there are multiple options for both what to offer to user<br>
> > DTypes and where we want to move (I am using `bfloat16` as a<br>
> > potential DType here).<br>
> > <br>
> > 1. The "weak" dtype option (this is what JAX does), where:<br>
> > <br>
> > np.array([1], dtype=bfloat16) + 4.<br>
> > <br>
> > returns a bfloat16, because 4. is "lower" than all floating<br>
> > point types.<br>
> > In this scheme the user-defined `bfloat16` knows that the input<br>
> > is a Python float, but it does not know its value (if an<br>
> > overflow occurs during conversion, it could warn or error but<br>
> > not upcast). For example `np.array([1], dtype=uint4) + 2**5`<br>
> > will try `uint4(2**5)`, assuming it works.<br>
> > NumPy is currently different: `2.**300` would ensure the result<br>
> > is a `float64`.<br>
> > <br>
> > If a DType does not make use of this, it would get the behaviour<br>
> > of option 2.<br>
> > <br>
> > 2. The "default" DType option: `np.array([1], dtype=bfloat16) + 4.`<br>
> > is always the same as `bfloat16 + float64 -> float64` (see the<br>
> > sketch after this list for the difference from option 1).<br>
> > <br>
> > 3. Use whatever NumPy considers the "smallest appropriate dtype".<br>
> > This will not always work correctly for unsigned integers, and<br>
> > for floats this would be float16, which doesn't help with<br>
> > bfloat16.<br>
> > <br>
> > 4. Try to expose the actual value. (I do not want to do this, but<br>
> > it is probably a plausible extension of most other options, since<br>
> > the other options can be the "default".)<br>
> > <br>
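> > For concreteness, a rough sketch of the "weak" rule with the<br>
> > option-2 fallback (plain Python; `float16` stands in for a user<br>
> > `bfloat16`, and the function is an illustration of the rule, not<br>
> > real resolution code):<br>
> > <br>
> >     import numpy as np<br>
> > <br>
> >     def promote_weak(arr_dtype, scalar):<br>
> >         # Option 1: a Python scalar only contributes its abstract<br>
> >         # kind ("Floating"/"Integer"), never its value.<br>
> >         if isinstance(scalar, float) and arr_dtype.kind == "f":<br>
> >             return arr_dtype<br>
> >         if isinstance(scalar, int) and arr_dtype.kind in "iu":<br>
> >             return arr_dtype<br>
> >         # Option 2 (fallback): treat the scalar as the default dtype.<br>
> >         return np.result_type(arr_dtype, np.array(scalar).dtype)<br>
> > <br>
> >     promote_weak(np.dtype(np.float16), 4.)   # float16 (option 1)<br>
> >     promote_weak(np.dtype(np.int8), 2**10)   # int8; int8(2**10) would<br>
> >                                              # overflow (warn/error, not upcast)<br>
> >     promote_weak(np.dtype(np.int8), 4.)      # float64 (option 2 fallback)<br>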
> > <br>
> > Within these options, there is one more difficulty. NumPy currently<br>
> > applies the same logic for:<br>
> > <br>
> > np.array([1], dtype=bfloat16) + np.array(4., dtype=np.float64)<br>
> > <br>
> > which in my opinion is wrong (the second array is typed). We do<br>
> > have the same issue with deciding what to do in the future for<br>
> > NumPy itself.<br>
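> > <br>
> > For concreteness, the current value-based logic (NumPy around<br>
> > 1.19/1.20, with `int8`/`float64` standing in for user DTypes) gives:<br>
> > <br>
> >     import numpy as np<br>
> > <br>
> >     a = np.array([1, 2], dtype=np.int8)<br>
> >     (a + 1).dtype                               # int8: 1 fits<br>
> >     (a + 300).dtype                             # int16: 300 does not fit<br>
> >     (a + np.array(4., dtype=np.float64)).dtype  # float16: even the typed<br>
> >                                                 # 0-D array is value-inspected<br>
> > <br>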
> > Right now I feel that new (user) DTypes should live in the future<br>
> > (whatever that future is).<br>
> > <br>
> <br>
> I agree. And I have a preference for option 1. Option 2 is too greedy<br>
> in upcasting, the value-based casting is problematic in multiple ways<br>
> (e.g., hard for Numba because the output dtype cannot be predicted<br>
> from the input dtypes), and it is hard to see a rationale for option 4<br>
> (maybe so the user dtype itself can implement option 3?).<br>
<br>
Yes, well, the "rationale" for option 4 is that you expose everything<br>
that NumPy currently needs (assuming we make no changes). That would be<br>
the only way that allows a `bfloat16` to work exactly like a<br>
`float16` as currently defined in NumPy.<br>
<br>
To be clear: it horrifies me, but defining a "better" way is much<br>
easier than trying to keep everything as it is (at least for now) while<br>
also thinking about what it should look like in the future (and making<br>
sure that user DTypes are ready for that future).<br>
<br>
My guess is that we can agree on aiming for option 1 and trying to limit it<br>
to Python operators. Unfortunately, only time will tell how feasible<br>
that will actually be.<br></blockquote><div><br></div><div>That sounds good.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
> > I have said previously that we could distinguish this for universal<br>
> > functions. But calls like `np.asarray(4.)` are common, and they<br>
> > would lose the information that `4.` was originally a Python float.<br>
> > <br>
> <br>
> Hopefully the future will have way fewer asarray calls in it.<br>
> Rejecting scalar input to functions would be nice. This is what<br>
> most other array/tensor libraries do.<br>
> <br>
<br>
Well, right now NumPy has scalars (both ours and Python's), and I would<br>
expect that changing that may well be more disruptive than changing the<br>
value-based promotion (assuming we can add good FutureWarnings).<br>
<br>
I would probably need a bit of convincing that forbidding `np.add(array,<br>
2)` is worth the trouble, but luckily that is probably an orthogonal<br>
question. (The fact that we even accept 0-D arrays as "value based" is<br>
probably the biggest difficulty.)</blockquote><div><br></div><div>Indeed, it probably isn't worth the trouble. And yes, the "0-D arrays are special" issue is the more important one.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
> <br>
> > <br>
> > So, recently, I was considering that a better option may be to<br>
> > limit this to the Python math operators: +, -, /, **, ...<br>
> > <br>
> <br>
> +1<br>
> <br>
> This discussion may be relevant:<br>
> <a href="https://github.com/data-apis/array-api/issues/14" rel="noreferrer" target="_blank">https://github.com/data-apis/array-api/issues/14</a>.<br>
> <br>
<br>
I have browsed through it; I guess you were also thinking of limiting<br>
scalars to operators (although possibly even more broadly, rather than<br>
just for promotion purposes).</blockquote><div><br></div><div>Indeed. `x + 1` must work, that's extremely common. `np.somefunc(x, 1)` is not common, and there's little downside (and lots of upside) in not supporting it if you were designing a new numpy-like library.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> I am not sure I understand this:<br>
<br>
Non-array ("scalar") operands are not permitted to participate in<br>
type promotion.<br>
<br>
But they do participate, both in JAX and in what I wrote here; they<br>
just participate in an abstract way, i.e. as `Floating` or `Integer`,<br>
but not as a specific float or integer.<br></blockquote><div><br></div><div>You're right, that sentence could use a tweak. I think the intent was to say that this should not be done in a multi-step way like:</div><div>- cast the scalar to an array with some dtype (e.g. a Python float becomes a numpy float64)</div><div>- then apply the `array <op> array` casting rules to that resulting dtype.<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
> > Those are the places where it may make a difference to write:<br>
> > <br>
> > arr + 4. vs. arr + bfloat16(4.)<br>
> > int8_arr + 1 vs. int8_arr + np.int8(1)<br>
> > arr += 4. (in-place may be the most significant use-case)<br>
> > <br>
> > while:<br>
> > <br>
> > np.add(int8_arr, 1) vs. np.add(int8_arr, np.int8(1))<br>
> > <br>
> > is maybe less significant. On the other hand, it would add a subtle<br>
> > difference between operators vs. direct ufunc calls...<br>
> > <br>
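> > To make that subtle difference concrete (hypothetical future<br>
> > behaviour under option 1 limited to operators, not current NumPy):<br>
> > <br>
> >     int8_arr + 1         # int8: operator, so the 1 stays "weak"<br>
> >     np.add(int8_arr, 1)  # int64: ufunc, so the 1 becomes the default int<br>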
> > <br>
> > In general, it may not matter: we can choose option 1 (which the<br>
> > bfloat16 does not have to use), and modify it if we ever change the<br>
> > logic in NumPy itself. Basically, I will probably pick option 1 for<br>
> > now and press on, and we can reconsider later. And hope that it does<br>
> > not make things even more complicated than they already are.<br>
> > <br>
> > Or maybe it is better to just always use the default for user<br>
> > DTypes?<br>
> > <br>
> <br>
> I'm not sure I understand why you like option 1 but want to give<br>
> user-defined dtypes the choice of opting out of it. Upcasting will<br>
> rarely make sense for user-defined dtypes anyway.<br>
> <br>
<br>
I never meant this as an opt-out; the question is what to do if the<br>
user DType does not opt in/define the operation.<br>
<br>
Basically, we would promote with `Floating` here (or `PyFloating`,<br>
but there should be no difference; for now I will do PyFloating, but it<br>
should probably be changed later). I was hinting at providing a default<br>
fallback, so that if:<br>
<br>
    UserDType + Floating -> Undefined/Error<br>
<br>
we automatically try the "default", e.g.:<br>
<br>
UserDType + Float64 -> Something<br>
<br>
That would mean users don't have to worry about `Floating` itself.<br>
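<br>
To sketch the fallback (all names hypothetical, not the actual new<br>
DType API; promotion is modelled here as a plain lookup table):<br>
<br>
    def promote_with_python_float(user_dtype, rules):<br>
        res = rules.get((user_dtype, "Floating"))     # abstract "weak" rule<br>
        if res is None:<br>
            res = rules.get((user_dtype, "float64"))  # default fallback<br>
        if res is None:<br>
            raise TypeError(<br>
                f"no common DType for {user_dtype} and Python float")<br>
        return res<br>
<br>
    # A bfloat16 author can opt in to the "weak" behaviour...<br>
    weak = {("bfloat16", "Floating"): "bfloat16"}<br>
    # ...or define nothing abstract and get the default (option 2):<br>
    default = {("bfloat16", "float64"): "float64"}<br>
<br>
    promote_with_python_float("bfloat16", weak)     # -> 'bfloat16'<br>
    promote_with_python_float("bfloat16", default)  # -> 'float64'<br>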
<br>
But I am not opinionated here; a user DType author should be able to<br>
quickly deal with either issue (that Float64 is undesired, or that the<br>
error is undesired if no "default" exists). Maybe the error is the more<br>
conservative/constructive choice though.<br></blockquote><div><br></div><div>I'd start with the error, and reconsider only if there's a practical problem with it. Going from error to fallback later is much easier than the other way around.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <br>
> > But I would be interested in whether the "limit to Python operators"<br>
> > approach is something we should aim for here. This does make a small<br>
> > difference, because user DTypes could "live" in the future if we<br>
> > have an idea of what that future may look like.<br>
> > <br>
> <br>
> A future with:<br>
> - no array scalars<br>
> - 0-D arrays have the same casting rules as >=1-D arrays<br>
> - no value-based casting<br>
> would be quite nice. For "same kind" casting like<br>
> <br>
<br>
I don't think array scalars really matter here, since they are typed<br>
and behave identically to 0-D arrays anyway. We can have long opinion<br>
pieces on whether they should exist :).</blockquote><div><br></div><div>Let's not do that :) My summary would be: Travis regrets adding them, all other numpy-like libraries I know of decided not to have them, and that all worked out fine. I don't want to think about touching them in NumPy now.<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
> <a href="https://data-apis.github.io/array-api/latest/API_specification/type_promotion.html" rel="noreferrer" target="_blank">https://data-apis.github.io/array-api/latest/API_specification/type_promotion.html</a><br>
> .<br>
> Mixed-kind casting isn't specified there, because it's too different<br>
> between libraries. The JAX design (<br>
> <a href="https://jax.readthedocs.io/en/latest/type_promotion.html" rel="noreferrer" target="_blank">https://jax.readthedocs.io/en/latest/type_promotion.html</a>) seems<br>
> sensible<br>
> there.<br>
<br>
The JAX design is the "weak DType" design (when it comes to Python<br>
numbers). Although the fact that a "weak" `complex` is sorted above<br>
all floats means that `bfloat16_arr + 1j` will go to the default<br>
complex dtype as well.<br>
But yes, I like the "weak" approach; I just think JAX also has some<br>
wrinkles to smooth out.<br>
<br>
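For example (current JAX in its default 32-bit mode; this is my reading<br>
of its promotion table, worth double-checking):<br>
<br>
    import jax.numpy as jnp<br>
<br>
    x = jnp.ones(3, dtype=jnp.bfloat16)<br>
    (x + 1j).dtype   # complex64: the "weak" complex pulls in the<br>
                     # default complex dtype, not a bfloat16-width one<br>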
<br>
There is a good deal more to this once you have user DTypes, if I add<br>
one more important constraint, namely that:<br>
<br>
from my_extension_module import uint24<br>
<br>
must not change any existing code that does not explicitly use<br>
`uint24`.<br>
<br>
Then my current approach guarantees:<br>
<br>
np.result_type(uint24, int48, int64) -> Error<br>
<br>
If `uint24` and `int48` do not know about each other (`int64` is<br>
obviously right here, but it is tricky to be quite certain of that).</blockquote><div><br></div><div>That makes sense. I'd expect that to be extremely rare anyway. User-defined dtypes need to interact with Python types and NumPy dtypes; anything unknown should indeed just error.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
The other tricky example I had was:<br>
<br>
The following becomes problematic (order does not matter):<br>
uint24 + int16 + uint32 -> int64<br>
<== (uint24 + int16) + (uint24 + uint32) -> int64<br>
<== int32 + uint32 -> int64<br>
<br>
With the addition that `uint24 + int32 -> int48` is defined, the first<br>
could be expected to return `int48`, but actually getting there is<br>
tricky (and my current code will not get there).<br>
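<br>
A self-contained way to see the problem (hypothetical dtypes, with<br>
promotion written out as an explicit pair table):<br>
<br>
    from functools import reduce<br>
    from itertools import permutations<br>
<br>
    PAIRS = {<br>
        frozenset({"uint24", "int16"}): "int32",<br>
        frozenset({"uint24", "int32"}): "int48",  # the amended rule<br>
        frozenset({"uint24", "uint32"}): "uint32",<br>
        frozenset({"uint24", "int64"}): "int64",<br>
        frozenset({"int16", "uint32"}): "int64",<br>
        frozenset({"int16", "int64"}): "int64",<br>
        frozenset({"int32", "uint32"}): "int64",<br>
        frozenset({"uint32", "int64"}): "int64",<br>
    }<br>
<br>
    def promote(a, b):<br>
        return a if a == b else PAIRS[frozenset({a, b})]<br>
<br>
    # Every pairwise reduction order ends at int64, even though int48<br>
    # could hold all three inputs:<br>
    {reduce(promote, p)<br>
     for p in permutations(["uint24", "int16", "uint32"])}  # {'int64'}<br>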
<br>
If the promotion result of a user DType with a builtin one can itself<br>
be a builtin one, then "amending" the promotion with rules like `uint24<br>
+ int32 -> int48` can lead to slightly surprising promotion results.<br>
This happens if the result of a promotion with another "category"<br>
(builtin) can be either a larger category or a lower one.</blockquote><div><br></div><div>I'm not sure I follow this. If uint24 and int48 both come from the same third-party package, is there still a problem here?</div><div><br></div><div>Cheers,</div><div>Ralf</div><div><br></div></div></div>