<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Dec 9, 2020 at 5:22 PM Sebastian Berg <<a href="mailto:sebastian@sipsolutions.net">sebastian@sipsolutions.net</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi all,<br>

<br>

Sorry that this will be a bit complicated again :(. In brief:<br>

<br>

* I would like to pass around scalars in some (partially new) C-API<br>

  to implement value-based promotion.<br>

* There are some subtle commutativity issues with promotion.<br>

  Commutativity may change in that case (with respect to value-based<br>

  promotion, probably for the better normally). [0]<br>

<br>

<br>

In the past days, I have been looking into implementing value-based<br>

promotion in the way I had done it for the prototype before.<br>

The idea was that NEP 42 allows for the creation of DTypes dynamically,<br>

which does allow very powerful value based promotion/casting.<br>

<br>

But I decided there are too many quirks with creating type instances<br>

dynamically (potentially very often) just to pass around one additional<br>

piece of information.<br>

That approach was far more powerful, but it is power and complexity<br>

that we do not require, given that:<br>

<br>

* Value based promotion is only used for a mix of scalars and arrays<br>

  (where "scalar" is annoyingly defined as 0-D at the moment)<br>

* I assume it is only relevant for `np.result_type` and promotion<br>

  in ufuncs (which often uses `np.result_type`).<br>

  `np.can_cast` has such behaviour, but I think it is easier [1].<br>

  We could implement more powerful "value based" logic, but I doubt<br>

  it is worthwhile.<br>

* This is already stretching the Python C-API beyond its limits.<br>

<br>

<br>

So I will suggest this instead which *must* modify some (poorly<br>

defined) current behaviour:<br>

<br>

1. We always evaluate concrete DTypes first in promotion, this means<br>

   that in rare cases the non-commutativity of promotion may change<br>

   the result dtype:<br>

<br>

       np.result_type(-1, 2**16, np.float32)<br>

<br>

   The same can also happen when you reorder the normal dtypes:<br>

<br>

       np.result_type(np.int8, np.uint16, np.float32)<br>

       np.result_type(np.float32, np.int8, np.uint16)<br>

<br>

   in both cases the `np.float32` is moved to the front<br>
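<br>
   As a rough illustration of why the order matters, here is a sketch that<br>
   promotes strictly pairwise, left to right, using `np.promote_types`.<br>
   The helper `fold_promote` is a name made up for this example, and the<br>
   left fold is an assumption for illustration; it is not meant to describe<br>
   what `np.result_type` does internally.<br>

```python
from functools import reduce

import numpy as np


def fold_promote(*dtypes):
    """Promote strictly left to right, one pair at a time (illustration only)."""
    return reduce(np.promote_types, dtypes)


# int8 and uint16 need int32 first; int32 with float32 then needs float64.
integers_first = fold_promote(np.int8, np.uint16, np.float32)

# With float32 in front, each integer type is absorbed by float32 directly.
float_first = fold_promote(np.float32, np.int8, np.uint16)

print(integers_first, float_first)
```

   Moving `np.float32` to the front corresponds to the second ordering.<br>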

<br>

2. If we reorder the above operation, we can define that we never<br>

   promote two "scalar values". Instead we convert both to a<br>

   concrete one first.  This makes it effectively like:<br>

<br>

       np.result_type(np.array(-1).dtype, np.array(2**16).dtype)<br>

<br>

   This means that we never have to deal with promoting two values.<br>
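<br>
   A small sketch of that rule in Python (the real logic would live in C,<br>
   and the variable names here are only for illustration): each value is<br>
   converted to the concrete dtype that `np.array` picks for it, and only<br>
   those dtypes are promoted.<br>

```python
import numpy as np

# Convert each Python value to a concrete dtype first ...
concrete = [np.array(value).dtype for value in (-1, 2**16)]

# ... and only then promote; no "value with value" promotion is needed.
result = np.result_type(*concrete)

# Both values map to NumPy's default integer, so that is also the result.
print(result == np.array(0).dtype)
```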

<br>

3. We need additional private API (we were always going to need some<br>

   additional API); That API could become public:<br>

<br>

   * Convert a single value into a concrete dtype, you could say<br>

     the same as `self.common_dtype(None)`, but a dedicated function<br>

     seems simpler. A dtype like this will never use `common_dtype()`.<br>

   * `common_dtype_with_scalar(self, other, scalar)` (note that<br>

     only one of the DTypes can have a scalar).<br>

     As a fallback, this function can be implemented by converting<br>

     to the concrete DType and retrying with the normal `common_dtype`.<br>

<br>

   (At least the second slot must be made public if we are to allow value<br>

   based promotion for user DTypes. I expect we will, but it is not<br>

   particularly important to me right now.)<br>
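<br>
   To make the two slots a bit more concrete, here is a toy model in pure<br>
   Python. Everything in it is hypothetical: strings stand in for DTypes,<br>
   the promotion table is invented, and none of the function names except<br>
   `common_dtype_with_scalar` come from the proposal. It only shows the<br>
   fallback idea (convert to the concrete DType, then retry with the<br>
   normal `common_dtype`) next to one possible value-aware version.<br>

```python
# Toy model only: strings stand in for DTypes, none of this is NumPy API.
ORDER = ["int8", "int16", "int32", "int64", "float64"]
INT_BITS = {"int8": 8, "int16": 16, "int32": 32, "int64": 64}


def common_dtype(a, b):
    """Stand-in for the normal slot: pick the 'larger' toy dtype."""
    return max(a, b, key=ORDER.index)


def concrete_dtype_for(scalar):
    """First slot: convert a single Python value to a concrete dtype."""
    return "float64" if isinstance(scalar, float) else "int64"


def common_dtype_with_scalar_fallback(self, other, scalar):
    """Fallback: ignore the value, convert to concrete, then retry."""
    return common_dtype(self, concrete_dtype_for(scalar))


def common_dtype_with_scalar(self, other, scalar):
    """Value-aware version: keep `self` if the Python int fits into it."""
    bits = INT_BITS.get(self)
    if other is int and bits is not None:
        if scalar in range(-2 ** (bits - 1), 2 ** (bits - 1)):
            return self
    return common_dtype_with_scalar_fallback(self, other, scalar)


print(common_dtype_with_scalar_fallback("int8", int, 1))  # int64
print(common_dtype_with_scalar("int8", int, 1))           # int8
print(common_dtype_with_scalar("int8", int, 2**16))       # int64
```

   Here `other` is the Python type of the scalar, standing in for the<br>
   DType that carries the value.<br>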

<br>

4. Our public API (including new C-API) has to expose and take the<br>

   scalar values. That means promotion in ufuncs will get DTypes and<br>

   `scalar_values`, although those should normally be `NULL` (or None).<br>

<br>

   In future python API, this is probably acceptable:<br>

<br>

        np.result_type(*[t if v is None else v for t, v in zip(dtypes, scalar_values)])<br>

<br>

   In C, we need to expose a function below `result_type` which<br>

   accepts both the scalar values and DTypes explicitly.<br>
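<br>
   On the Python side, a minimal sketch of that merge could look as<br>
   follows. The helper name `result_type_with_scalars` is made up for this<br>
   example and is not an existing or proposed NumPy function; it only<br>
   wraps the public `np.result_type`.<br>

```python
import numpy as np


def result_type_with_scalars(dtypes, scalar_values):
    """Hypothetical helper: use the scalar value wherever one is given."""
    args = [t if v is None else v for t, v in zip(dtypes, scalar_values)]
    return np.result_type(*args)


# One operand is a float32 array (no scalar value), the other is the
# Python integer 3, whose discovered dtype would be the default integer.
dtypes = [np.dtype(np.float32), np.array(3).dtype]
scalar_values = [None, 3]

res = result_type_with_scalars(dtypes, scalar_values)
print(res)
```

   The small Python integer does not force an upcast of the `float32`<br>
   operand, which is the case value-based promotion is meant to cover.<br>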

<br>

5. For the future: As said many times, I would like to deprecate<br>

   using value based promotion for anything except Python core types.<br>

   That just seems wrong and confusing.<br></blockquote><div><br></div><div>I agree with this. Value-based promotion was never a great idea, so let's try to keep it as minimal as possible. I'm not even sure what kind of value-based promotion for non Python builtin types is happening now (?).<br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

   My only problem is that, while I can warn (possibly sometimes too<br>

   often) when behaviour will change, I do not have a good idea about<br>

   silencing that warning.<br></blockquote><div><br></div><div>Do you see a real issue with this somewhere, or is it all just corner cases? In that case no warning seems okay.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

<br>

Note that this affects NEP 42 (a little bit). NEP 42 currently makes a<br>

nod towards the dynamic type creation, but falls short of actually<br>

defining it. <br></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

So these rules have to be incorporated, but IMO they do not affect the<br>

general design choices in the NEP.<br>

<br>

<br>

There is probably even more complexity to be found here, but for now<br>

the above seems to be at least good enough to make headway...<br>

<br>

<br>

Any thoughts or clarity remaining that I can try to confuse? :)<br></blockquote><div><br></div><div>My main question is why you're considering both deprecating and expanding public API (in points 3 and 4). If you have a choice, keep everything private I'd say.</div><div><br></div><div>My other question is: this is a complex story, it all sounds reasonable but do you need more feedback than "sounds reasonable"? <br></div><div><br></div><div>Cheers,<br></div><div>Ralf</div><div><br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

Cheers,<br>

<br>

Sebastian<br>

<br>

<br>

<br>

[0] We could use the reordering trick also for concrete DTypes,<br>

although, that would require introducing some kind of priority... I do<br>

not like that much as public API, but it might be something to look at<br>

internally or for types deriving from the builtin abstract DTypes:<br>

    * inexact<br>

    * other<br>

<br>

Just evaluating all `inexact` first would probably solve our<br>

commutativity issues.<br>

<br>

[1] NumPy uses `np.can_cast(value, dtype)` also. For example:<br>

<br>

    np.can_cast(np.array(1., dtype=np.float64), np.float32, casting="safe")<br>

<br>

returns True. My working hypothesis is that `np.can_cast` as above is<br>

just a side battle.  I.e. we can either:<br>

<br>

* Flip the switch on it (can-cast does no value based logic, even<br>

though we use it internally, we do not need it).<br>

* Or, we can implement those cases of `np.can_cast` by using promotion.<br>

<br>

The first one is tempting, but I assume we should go with the second<br>

since it preserves behaviour and is slightly more powerful.<br>

_______________________________________________<br>

NumPy-Discussion mailing list<br>

<a href="mailto:NumPy-Discussion@python.org" target="_blank">NumPy-Discussion@python.org</a><br>

<a href="https://mail.python.org/mailman/listinfo/numpy-discussion" rel="noreferrer" target="_blank">https://mail.python.org/mailman/listinfo/numpy-discussion</a><br>

</blockquote></div></div>