How to annotate modifications for function arguments with `ParamSpec` and `TypeVar`
Not sure this is the right place to ask, but here we go... 😬

I want to be able to annotate something that takes a function and creates a new one that alters the argument types it accepts. A quick example, using Ray (https://github.com/ray-project/ray):

    import ray

    ray.init()

    @ray.remote
    def do_things(x: int, y: float):
        return x * y

    @ray.remote
    def do_more_things(name: str, value: float):
        return f"{name}: {value}"

    pure_value = do_things(x=3, y=2.2)  # pure_value is type: float
    ref_value = do_things.remote(x=3, y=2.2)  # ref_value is type: ObjectRef[float]

    pure_second_value = do_more_things(name="Foo", value=pure_value)  # this is fine
    ref_second_value = do_more_things.remote(name="Foo", value=ref_value)  # this should be fine

So, do_things() and do_more_things() should keep their signatures. But their remote counterparts should have a signature that allows the original types or a wrapper ObjectRef[OriginalType] for each argument:

    do_things.remote(x: int | ObjectRef[int], y: float | ObjectRef[float]) -> ObjectRef[float]: ...
    do_more_things.remote(name: str | ObjectRef[str], value: float | ObjectRef[float]) -> ObjectRef[str]: ...

Now, I know type-annotating this particular API design is probably difficult, and possibly not easily supported yet. I was thinking of an alternative API that could take advantage of ParamSpec, TypeVars with overloads, and/or TypeVarTuple. But I still can't find a way to get all the desired features together. Longer discussion and ideas I tried: https://github.com/python/typing/discussions/1163

---

Assuming this is currently not possible, what would be needed to make it possible? Would some way of achieving this be acceptable? And if so, what would be the best approach to make it possible?
I'm not sure what the process is, but maybe there could be a way to sponsor someone with the right expertise here to tackle it. I imagine it would require a PEP, work on mypy, and I'm not sure what else, but maybe I'm being naive in some way and it would require a different approach. Thanks for your help and ideas/feedback! Sebastián
The current Python type system isn't powerful enough to handle something as complicated as `ray.remote`. My team's code base makes extensive use of the `ray` library, so I'm very familiar with this limitation, and I've spent quite a bit of time thinking about (and even prototyping) some potential solutions.

It's a really hard problem given that `ray.remote` can be applied to an individual function or an entire class. If applied to a class, it modifies most of the methods in that class, including inherited methods, but avoids modifying dunder methods. And as you point out, it modifies not only the return types of these modified functions/methods but also the parameter types.

I've played around with the idea of a "class transform", a "function transform", and a "parameter transform" that specify how a type checker should modify the types of a class, function, or individual parameters when the transform is applied to them. In some ways, this is similar to the `dataclass_transform` described in PEP 681, but it's more general.

I think it's theoretically possible to extend the type system to support this, but the difficult part is working out all of the (complex) rules and requirements. And then convincing ourselves that all of this complexity is justified, given that we don't have many compelling use cases for it.

-Eric

--
Eric Traut
Contributor to Pyright & Pylance
Microsoft
Get it. Thanks a lot for the feedback Eric! It's good to know at least I'm
not crazy and it's indeed a complex problem.
Now, alternatively, is there any type of API design that could be
changed/implemented in Ray that you feel would be easier to support and
justify?
I was playing around with the idea of having a function that returned
another one, that when called would be the remote version:
remotify(normal_func)(name="Morty")
With this design it would be possible to support either keyword arguments
(with ParamSpec) OR the modification of the argument types (with
positional-only arguments, overloads, and a TypeVar("_T") changing the
type of each arg to RemoteRef[_T]).
Thinking of just that simple use case, only functions (not classes), we are
so close to being able to support that! But currently it's only possible to
support one or the other.
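To make the trade-off concrete, here is a minimal sketch of the two flavors that each work on their own today. All the names here (RemoteRef, RemoteKW, RemotePos1) are hypothetical stand-ins, not Ray's real API, and the runtime bodies are only there so the sketch executes:

```python
from typing import Callable, Generic, TypeVar, Union

try:
    from typing import ParamSpec  # Python 3.10+
except ImportError:
    from typing_extensions import ParamSpec

_T = TypeVar("_T")
_R = TypeVar("_R")
_P = ParamSpec("_P")


class RemoteRef(Generic[_T]):
    """Stand-in for Ray's ObjectRef."""
    def __init__(self, value: _T) -> None:
        self.value = value


class RemoteKW(Generic[_P, _R]):
    """Flavor 1: ParamSpec keeps the keyword names, but each parameter
    keeps its ORIGINAL type -- RemoteRef[_T] arguments can't be expressed."""
    def __init__(self, fn: Callable[_P, _R]) -> None:
        self._fn = fn

    def __call__(self, *args: _P.args, **kwargs: _P.kwargs) -> "RemoteRef[_R]":
        return RemoteRef(self._fn(*args, **kwargs))


class RemotePos1(Generic[_T, _R]):
    """Flavor 2: positional-only, single-argument version. The argument
    can be widened to Union[_T, RemoteRef[_T]], but keyword names are lost."""
    def __init__(self, fn: Callable[[_T], _R]) -> None:
        self._fn = fn

    def __call__(self, __arg0: "Union[_T, RemoteRef[_T]]") -> "RemoteRef[_R]":
        arg = __arg0.value if isinstance(__arg0, RemoteRef) else __arg0
        return RemoteRef(self._fn(arg))


def greet(name: str) -> str:
    return f"Hello {name}"


kw = RemoteKW(greet)(name="Morty")            # keyword OK; RemoteRef arg not typed
pos = RemotePos1(greet)(RemoteRef("Morty"))   # RemoteRef arg OK; no keyword names
```

Each class captures exactly one of the two desired features; the rest of this thread is about whether they can be combined.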
I was thinking: if there was a way to declare not the Union[] of some type
declarations but their "intersection", declaring that something must support
both things (so, an AND instead of an OR for the types), then the
idea/technique above could potentially work, for keyword arguments and for
modifying the input types.
The funny thing is that currently there's some kind of a way to declare
that "intersection" of types with Protocols and multiple inheritance: a
subclass must implement all the inherited Protocols.
But I didn't find a way to make it work without conflicts using multiple
definitions for __call__().
I just wanted to share the extra ideas in case you (or anyone here) sees a
way to make it work, or an easier path to move forward and support
something in this direction.
Thanks!
Sebastián
On Tue, Jun 7, 2022, 21:10 Eric Traut wrote:
> The current Python type system isn't powerful enough to handle something as complicated as `ray.remote`. [...]
I kept thinking about this, in particular how classes inheriting from
multiple Protocols are kind of a type that declares that something must
match two things (like the equivalent of an Intersection instead of a Union).
I was playing with type definitions and some iterations of those ideas. I'm
copying the example code I was playing with below.
    from typing import Awaitable, Callable, Generic, TypeVar, Union
    from typing_extensions import ParamSpec, Protocol

    T0 = TypeVar("T0")
    R = TypeVar("R")
    _Params = ParamSpec("_Params")


    class ObjectRef(Awaitable[R]):
        pass


    class CallableArgs0(Protocol[T0, R]):
        def __call__(
            self,
            __arg0: T0,
        ) -> R:
            ...


    class CallableKArgs(Protocol[_Params, R]):
        def __call__(
            self,
            *args: _Params.args,
            **kwargs: _Params.kwargs,
        ) -> R:
            ...


    class Callable0Mix(
        Protocol,
        CallableArgs0[T0, R],
        CallableKArgs[_Params, R],
    ):
        pass


    class RemoteFunctionKW(Generic[_Params, R]):
        def __init__(self, function: Callable[_Params, R]) -> None:
            pass

        def remote(
            self,
            *args: _Params.args,
            **kwargs: _Params.kwargs,
        ) -> "ObjectRef[R]":
            ...


    class RemoteFunction0(Generic[R, T0]):
        def __init__(self, function: Callable[[T0], R]) -> None:
            pass

        def remote(
            self,
            __arg0: "Union[T0, ObjectRef[T0]]",
        ) -> "ObjectRef[R]":
            ...


    class RemoteFunction0Mix(RemoteFunction0[R, T0], RemoteFunctionKW[_Params, R]):
        pass


    def remote(
        __function: Callable0Mix[T0, R, _Params]
    ) -> RemoteFunction0Mix[T0, R, _Params]:
        ...


    @remote
    def do_things(x: int):
        return x


    ref_value = do_things.remote(3)
    do_things.remote(ref_value)
I was wondering if there's something inherently flawed with my approach, or
if it could make sense to imagine something like Pyright and mypy analyzing
and "understanding" that both protocols refer to the same thing.
And understanding that the types of arguments should be read from the
protocol with specific arguments, and the possible keywords should be read
from the protocol with the ParamSpec.
Currently, Pyright (VS Code) seems to take into account only the first
protocol: if it's the one with specific args, it checks the types; if it's
the one with ParamSpec, it checks the keyword argument names.
This is a simplification to reduce the problem to the minimum, using only a
single argument. Multiple arguments would be supported with overloads for
remote() and multiple flavors of those same generic RemoteFunction classes.
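That overload idea could look roughly like the sketch below. The names (ObjectRef, RemoteFunction1, RemoteFunction2) are hypothetical analogues of the classes above, not Ray's API, and the runtime dispatch by signature inspection is just so the sketch executes:

```python
import inspect
from typing import Callable, Generic, TypeVar, Union, overload

_R = TypeVar("_R")
_T0 = TypeVar("_T0")
_T1 = TypeVar("_T1")


class ObjectRef(Generic[_R]):
    """Stand-in for Ray's ObjectRef."""


class RemoteFunction1(Generic[_T0, _R]):
    """One-argument flavor: the arg is widened to Union[_T0, ObjectRef[_T0]]."""
    def __init__(self, fn: Callable[[_T0], _R]) -> None:
        self._fn = fn

    def remote(self, __arg0: Union[_T0, "ObjectRef[_T0]"]) -> "ObjectRef[_R]":
        ...


class RemoteFunction2(Generic[_T0, _T1, _R]):
    """Two-argument flavor, and so on for each arity."""
    def __init__(self, fn: Callable[[_T0, _T1], _R]) -> None:
        self._fn = fn

    def remote(
        self,
        __arg0: Union[_T0, "ObjectRef[_T0]"],
        __arg1: Union[_T1, "ObjectRef[_T1]"],
    ) -> "ObjectRef[_R]":
        ...


# One overload of the decorator per arity; the implementation picks the
# matching wrapper at runtime by counting the function's parameters.
@overload
def remote(fn: Callable[[_T0], _R]) -> "RemoteFunction1[_T0, _R]": ...
@overload
def remote(fn: Callable[[_T0, _T1], _R]) -> "RemoteFunction2[_T0, _T1, _R]": ...
def remote(fn):
    n = len(inspect.signature(fn).parameters)
    return RemoteFunction1(fn) if n == 1 else RemoteFunction2(fn)


@remote
def double(x: int) -> int:
    return x * 2


@remote
def combine(name: str, value: float) -> str:
    return f"{name}: {value}"
```

The obvious cost is that the stubs need one overload and one RemoteFunction class per supported arity, and the keyword-name information is still lost.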
Is there something obviously wrong with this idea? Or could it make sense
in some way (maybe with modifications)?
On Wed, Jun 15, 2022 at 8:56 AM Sebastián Ramírez wrote:
> Get it. Thanks a lot for the feedback Eric! It's good to know at least I'm not crazy and it's indeed a complex problem. [...]
> Currently, Pyright (VS Code) seems to take into account only the first protocol

It takes into account the method resolution order (MRO). That typically means the first protocol, although it might be different if the method comes from a base class of that protocol.

At runtime, you cannot inherit from multiple protocol classes like this. If you try to run your code, you'll see that you receive a "TypeError: Cannot create a consistent method resolution" from the runtime.

> Is there something obviously wrong with this idea?

Yeah, I think it breaks both runtime assumptions and core assumptions about the way Protocols are intended to work.

-Eric
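The runtime failure Eric describes is easy to reproduce with a minimal version of the pattern (the protocol names here are made up for illustration). Listing Protocol before its own subclasses makes a consistent C3 linearization impossible, so class creation fails:

```python
from typing import Protocol


class CallableA(Protocol):
    def __call__(self, __arg0: int) -> int: ...


class CallableB(Protocol):
    def __call__(self, *args: object, **kwargs: object) -> int: ...


# Protocol appears before CallableA/CallableB in the bases, but it must
# come after them in the MRO because they are its subclasses, so the
# interpreter raises TypeError while creating the class.
try:
    class Mix(Protocol, CallableA, CallableB):
        pass
    mro_error = None
except TypeError as exc:
    mro_error = str(exc)

print(mro_error)
```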
Get it, thanks for the feedback!
On Tue, Jun 28, 2022, 20:48 Eric Traut wrote:
> It takes into account the method resolution order (MRO). [...]
participants (2):
- Eric Traut
- Sebastián Ramírez