> https://data-apis.org/array-api/latest/API_specification/type_promotion.html

Oh, super nice - I hadn't seen these!

On Thu, 18 Nov 2021 at 16:58, Stephan Hoyer <shoyer@gmail.com> wrote:
There has been a recent community effort to standardize different NumPy-like array APIs in Python. The edge cases of dtype promotion are still pretty heterogeneous, but a subset of type promotion cases (mostly dtypes within the same family, like promotion between different float types) have shared semantics across all libraries:
https://data-apis.org/array-api/latest/API_specification/type_promotion.html

These are the cases that I think would make sense for Python type checkers.
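The shared-semantics cases can be checked against NumPy directly. A small sketch (assuming NumPy is installed) of same-family promotion via `np.result_type`:

```python
import numpy as np

# Promotion within the same dtype family is consistent across libraries
# that follow the array API standard: the wider type wins.
assert np.result_type(np.float32, np.float64) == np.float64
assert np.result_type(np.int8, np.int16) == np.int16
assert np.result_type(np.uint8, np.uint16) == np.uint16
```

These are exactly the pairs where a type checker could hard-code the answer without worrying about per-library divergence.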

On Thu, Nov 18, 2021 at 7:54 AM Guido van Rossum <guido@python.org> wrote:
Calling out to Python during checking is not an option. But maybe we could extract a table with all the possibilities from the runtime at build / install time?

It’s disturbing that the rules are different for different libraries though. That will make this much harder.
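Extracting such a table from the runtime could be a short script run at build / install time. A rough sketch (assuming NumPy; `np.promote_types` is value-independent, so it suits a static table):

```python
import numpy as np

# Build a promotion lookup table from the runtime, one entry per dtype
# pair.  A stub generator could then emit one overload per entry.
DTYPES = [np.int8, np.int16, np.int32, np.int64, np.float32, np.float64]
TABLE = {
    (a.__name__, b.__name__): np.promote_types(a, b).name
    for a in DTYPES
    for b in DTYPES
}

print(TABLE[("int8", "int16")])     # int16
print(TABLE[("int32", "float32")])  # float64: int32 does not fit in float32
```

Each library would ship its own generated table, which sidesteps the cross-library differences at the cost of a build step.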

On Thu, Nov 18, 2021 at 02:36 Matthew Rahtz via Typing-sig <typing-sig@python.org> wrote:
Thanks for reaching out, Sebastian!

> My question is:  Is there any chance you could call back into existing
> Python exposed functionality to implement these operations?  Yes, that
> could even have harmless "side effects" very occasionally.  And it
> would mean using the library to type the library...
>
> But to me it seems daunting to duplicate potentially very complex logic
> that, at least for NumPy, is often available at runtime.

Hmm, that's a good point. I agree having to duplicate the logic is not ideal.

Enabling the type checker to call out to existing Python code is definitely an option we should consider, but I suspect there be dragons. Pradeep - as someone who actually knows how type checkers work, how viable do you think it would be?

> NumPy promotion is potentially more complex than the above common-DType operation.

Oh, gosh, right, I'd forgotten about all the other types NumPy can interact with. I guess this is mainly an argument against the 'brute force' solutions 1 and 2? (And I guess extra weight towards the point that duplicating this logic would not be great?)

On Wed, 17 Nov 2021 at 22:24, Sebastian Berg <seberg@berkeley.edu> wrote:
Hey all,

> A quick summary of the talk is below.
>
> - How should we handle data type promotion in stubs?
>    - Option 1: One overload for each possible combination of dtypes
>    - Option 2: One overload for each result dtype
>    - Option 3: Don't handle type promotion
>    - Option 4: Use a dtype class hierarchy and exploit Union type
>      operator behaviour
>    - Option 5: Propose a type promotion operator
>    - Option 6: Propose a 'nearest common parent' operator
>    - Option 7: Propose a type lookup table operator
>    - Consensus during discussion was that since it looks like a new
>      type operator *would* be required, we should probably hold off on
>      dealing with this until the community shows a strong desire for
>      this feature, and in the meantime just not handle data type
>      promotion in stubs.
>
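For concreteness, the overload-based options (1 and 2) amount to enumerating dtype combinations by hand. A toy sketch of Option 1 (all names here are hypothetical, not NumPy's actual API; the runtime implementation is backed by a lookup table just to make the example executable):

```python
from typing import Generic, TypeVar, overload

class Int8: ...
class Int16: ...

DT = TypeVar("DT")

class Array(Generic[DT]):
    def __init__(self, dtype: type) -> None:
        self.dtype = dtype

# Option 1: one overload per input combination; this table backs the
# runtime implementation.
_PROMOTE = {
    (Int8, Int8): Int8,
    (Int8, Int16): Int16,
    (Int16, Int8): Int16,
    (Int16, Int16): Int16,
}

@overload
def add(x: Array[Int8], y: Array[Int8]) -> Array[Int8]: ...
@overload
def add(x: Array[Int8], y: Array[Int16]) -> Array[Int16]: ...
@overload
def add(x: Array[Int16], y: Array[Int8]) -> Array[Int16]: ...
@overload
def add(x: Array[Int16], y: Array[Int16]) -> Array[Int16]: ...
def add(x, y):
    return Array(_PROMOTE[(x.dtype, y.dtype)])

print(add(Array(Int8), Array(Int16)).dtype.__name__)  # Int16
```

With N dtypes this is O(N^2) overloads per function, which is why the quadratic blow-up came up as the main objection to Options 1 and 2.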

Sorry for missing the discussion, coming over from NumPy to here.
Hopefully, my terminology matches yours pretty well and the below is
helpful or at least interesting :).


One thing I would like to note is that, unlike Python types, DTypes are
already typed, in the sense that a library doing "promotion" should
have the logic for it explicitly spelled out somewhere.
The library needs to find the right C loop and result dtype before it
can do the actual operation, so this is a fairly distinct step [1].
That is very unlike typical Python, where you do not have a distinct
"promotion" implementation.

My question is:  Is there any chance you could call back into existing
Python exposed functionality to implement these operations?  Yes, that
could even have harmless "side effects" very occasionally.  And it
would mean using the library to type the library...

But to me it seems daunting to duplicate potentially very complex logic
that, at least for NumPy, is often available at runtime.


About "nearest common parent"
-----------------------------

What I/we now use in NumPy is a `__common_dtype__` binary operator
(only internal/available in C right now).  There are a few things to
note about it though:

* It is not always commutative/associative in NumPy (I do some extra
  stuff to hide that sometimes)
* "Common dtype" is not always equivalent to "promotion" in math
  functions.  (See below.)

So, effectively NumPy already has this, based on a binary classmethod.
This would probably solve the majority of annotations, but it would be
nice not to have to implement the logic twice.
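A Python-level version of such a binary classmethod might look like the following (hypothetical names and a toy priority lattice; NumPy's `__common_dtype__` is internal and C-only, as noted above):

```python
# Hypothetical sketch: a priority-based binary "common dtype" classmethod.
class DType:
    _priority = 0  # higher priority wins within this toy lattice

    @classmethod
    def __common_dtype__(cls, other):
        return cls if cls._priority >= other._priority else other

class Int8(DType):
    _priority = 1

class Int16(DType):
    _priority = 2

class Float32(DType):
    _priority = 3

print(Int8.__common_dtype__(Int16).__name__)    # Int16
print(Float32.__common_dtype__(Int8).__name__)  # Float32
```

Note a pure priority scheme like this is automatically commutative and associative, which (as said above) the real NumPy operator is not; the real thing is a genuine pairwise table, not a total order.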


About "promotion" in functions
------------------------------

NumPy promotion is potentially more complex than the above common-DType
operation.  I could provide an API like:

    np.add.resolve_dtypes(DType1, DType2) -> DTypeX

but the general rules are tricky, and there are complicated corners:

    np.divide.resolve_dtypes(Timedelta, Number) -> Timedelta

    np.divide.resolve_dtypes(Timedelta, Timedelta) -> Number

Maybe "Timedelta" is a terrible outlier...  But the point is that there
is a lot of potential complexity.  (E.g. also math functions that have
mixed float and integer inputs and/or outputs.)
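These timedelta corners are observable in NumPy today (assuming NumPy is installed):

```python
import numpy as np

td = np.timedelta64(10, "s")

# Timedelta / Number -> Timedelta
print(td / 2)                               # 5 seconds
# Timedelta / Timedelta -> Number (a plain float)
print(td / np.timedelta64(2, "s"))          # 5.0
print((td / np.timedelta64(2, "s")).dtype)  # float64
```

So the result dtype of `np.divide` depends on the *combination* of input dtypes, not just a common dtype of the two.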

Right now, I can't guarantee that the above would not have mild
side-effects (for NumPy).


About "value based" logic
-------------------------

In the discussion, NumPy's "value based" logic was mentioned:
`1000` might be considered an `int16` or `uint16`, but not an `int8`.

The one good part about this:  We would _really_ like to get rid of
that in NumPy.  Although that will still leave you with special
promotion/common-dtype rules when a Python integer/float/complex is
involved.  (Just the value should not matter, except that a bad value
could lead to an error, e.g. if the Python integer is too large.)
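The value-based behaviour can be seen via `np.min_scalar_type` (assuming NumPy is installed):

```python
import numpy as np

# min_scalar_type reports the smallest dtype able to hold the value,
# which is the "value based" logic described above.
print(np.min_scalar_type(100))    # uint8: fits in 8 bits
print(np.min_scalar_type(1000))   # uint16: too large for int8/uint8
print(np.min_scalar_type(-1000))  # int16: needs a signed type
```

This is exactly the kind of value-dependence a static type checker cannot model, since it only sees `int`, not `1000`.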

Cheers,

Sebastian


[1] OK, I may be extrapolating from NumPy.  I suppose some libraries
may not promote explicitly, but have `ndarray[Int8]` be an actual type.
But probably such a library could auto-generate a table of overloads?

[2] I am intentionally not using `np.result_type`, because NumPy dtypes
can be parametric, and `np.result_type("S3", "S4")` is "S4", but you
are primarily interested in `String, String -> String`.
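The parametric-dtype point in [2] is easy to see in a running NumPy (assuming it is installed):

```python
import numpy as np

# Promotion between parametric string dtypes keeps the larger itemsize;
# a type checker would only care about String x String -> String.
result = np.result_type(np.dtype("S3"), np.dtype("S4"))
print(result)  # the 4-byte string dtype, i.e. dtype('S4')
```

For stubs, collapsing all `S<n>` instances to a single `String` DType class is what keeps the problem finite.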
_______________________________________________
Typing-sig mailing list -- typing-sig@python.org
To unsubscribe send an email to typing-sig-leave@python.org
https://mail.python.org/mailman3/lists/typing-sig.python.org/
Member address: mrahtz@google.com

--
--Guido (mobile)