Type hinting **kwargs

I originally posted this on python-ideas; Guido suggested moving it here.

I have several functions that take the same set of parameters, and I would much rather use **kwargs than repeat the full parameter list in every function definition. Therefore, I'd like to have a meaningful type hint for **kwargs. At first I thought TypedDict would be a great fit, until I found this passage in PEP 589:

In the mypy issue discussion, there seem to be two issues:

1. Ambiguity of whether the type hint applies to the **kwargs dict vs. each value in the dict. I think this could be a red herring; kwargs presents as a dict, and I think TypedDict can unambiguously represent the structure of that dict. I don't see how it should be different from any other dict parameter being defined as a function parameter.

2. A question of whether the type validator should assume total=False for a TypedDict. I think which keys are required should be explicitly determined by the TypedDict (required keys inferred from total). I suspect the issue could be that mypy may not be able to determine this during static type analysis?

In summary, **kwargs is a dict; TypedDict can specify its structure; static type checker limitations should not necessarily restrict its use in this way. Therefore, I propose that a short PEP be drafted to codify the validity of using TypedDict to specify the type of a **kwargs argument.

Guido noted in a response:

Is there anything in PEP 646 that would prevent **kwargs being annotated as I'm suggesting? I'm not seeing anything obvious.

Paul

PEP 484 says that a type argument on a **kwargs parameter annotates the _value_ type of the dictionary. In other words, `**kwargs: X` indicates that kwargs is of type `Dict[str, X]`. If X is a TypedDict, then this would indicate that kwargs is a dictionary whose keys are strings and whose values are all typed dictionaries. I don't see any ambiguity here, but it's clearly not the behavior that you want.

I think you're proposing to make an exception to the PEP 484 rule specifically in the case that X evaluates to a TypedDict. I think that's an ugly inconsistency, and it sets a dangerous precedent. One could make the argument that `Dict` should be exempt from the normal PEP 484 rules here as well. It would mean that there's no way to specify the case where you intend for kwargs to be a dictionary whose values are all typed dictionaries. It's also a change that would break backward compatibility, since the rules for PEP 484 are well established. It also opens up questions like: what if X is a union that includes a TypedDict or multiple TypedDicts? For all of these reasons, I think this proposal is a no-go.

If I understand your motivation correctly, you are designing an interface where you have (presumably a large number of) keyword parameters that are shared across many methods. Have you considered changing your interface such that you don't expose individual keyword parameters and instead expose a single parameter that accepts a TypedDict? I realize this would change the way callers invoke these methods (e.g. `foo(a=3, b=5)` would need to be changed to `foo({"a": 3, "b": 5})`), but it would be type safe and would work with existing Python type checkers.

--
Eric Traut
Contributor to Pyright and Pylance
Microsoft Corp.
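To make the two points above concrete, here is a minimal sketch of the PEP 484 reading of `**kwargs: X` next to the single-TypedDict-parameter alternative. The names `count_totals`, `Options`, and `scale` are hypothetical and exist only for illustration; they do not come from the thread.

```python
from typing import Dict, TypedDict


def count_totals(**kwargs: int) -> Dict[str, int]:
    # Per PEP 484, `**kwargs: int` annotates each *value*,
    # so a type checker sees `kwargs` as Dict[str, int] here.
    return dict(kwargs)


class Options(TypedDict):
    # Hypothetical option set that several functions could share.
    a: int
    b: int


def scale(options: Options) -> int:
    # One TypedDict parameter replaces the individual keyword parameters.
    return options["a"] * options["b"]


# Callers pass keywords to the first function and a dict literal to the second.
totals = count_totals(a=3, b=5)
product = scale({"a": 3, "b": 5})
```

The second form is checkable by existing type checkers today, at the cost of the changed call syntax described above.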

I agree with Eric that `**kwargs: Foo` means that `kwargs` is a dictionary of Foo values, as per PEP 484. Changing that would be backward-incompatible, and special-casing it for TypedDict alone would be clumsy.

However, it *is* useful to be able to specify the individual types of the keyword-only parameters we want to accept.

*Proposal*: We allow typing `**kwargs: **KwargsTypedDict`. Required keyword-only parameters will be required fields of the TypedDict. Keyword-only parameters with default values will be non-required fields of the TypedDict.

I wanted a real-world example to motivate this and thought of `json.loads` and co:

```python
# Simplified and modified slightly from `json.loads` and `json.load`, which
# share the same keyword-only parameters.

# Before.

def loads(
    s: Union[str, bytes],
    *,
    # No default value for this one.
    json_decoder: Type[JSONDecoder],
    # Has a default value.
    parse_int: Optional[Callable[[str], Any]] = ...,
    # And a bunch of other keyword-only parameters.
) -> JSON: ...

def load(
    fp: SupportsRead[Union[str, bytes]],
    *,
    json_decoder: Type[JSONDecoder],
    parse_int: Optional[Callable[[str], Any]] = ...,
) -> JSON: ...

class JSONDecoder:
    def __init__(
        self,
        *,
        parse_int: Optional[Callable[[str], Any]] = ...,
    ) -> None: ...
```

Note that `loads` and `load` internally just construct a `JSONDecoder` by passing on their kwargs to the given `json_decoder`. Clearly, these functions share the same keyword-only parameters.

```python
# These keyword-only parameters had default values. So, they are non-required
# fields in the TypedDict.
class JSONDecoderKwargs(TypedDict, total=False):
    parse_int: Callable[[str], Any]
    # And a bunch of others.

# `json_decoder` was a required keyword-only parameter (in my example above).
# So, it goes in a total TypedDict.
# We also inherit the other fields (preserving their non-requiredness).
class LoadKwargs(JSONDecoderKwargs, total=True):
    json_decoder: Type[JSONDecoder]

# After.

def loads(
    s: Union[str, bytes],
    **kwargs: **LoadKwargs,
) -> JSON: ...

def load(
    fp: SupportsRead[Union[str, bytes]],
    **kwargs: **LoadKwargs,
) -> JSON: ...

class JSONDecoder:
    def __init__(
        self,
        **kwargs: **JSONDecoderKwargs,
    ) -> None: ...

# valid
loads(s, json_decoder=MyJsonDecoder)
loads(s, json_decoder=MyJsonDecoder, parse_int=my_parse_int)

# invalid: missing keyword-only parameter `json_decoder`.
loads(s)

# invalid: type mismatch
loads(s, json_decoder=1)

# invalid: unexpected argument `foo`.
loads(s, json_decoder=MyJsonDecoder, foo=2)
```

Other such cases off the top of my head include

+ `subprocess.run`, `Popen.__init__`, and friends.
+ `list.sort` and `sorted`, which accept the same keyword-only parameters

Things to consider:

1. This is backward-compatible because it preserves the `**kwargs: int` behavior of PEP 484.

2. `**kwargs: **KwargsTypedDict` requires changes to the parser. If that's not worth it, we could settle for something like `**kwargs: UnpackTypedDict[KwargsTypedDict]`. This is analogous to PEP 646 making `Unpack[Ts]` a synonym for `*Ts` until syntax support lands.

3. We may also want to allow arbitrary keyword parameters beyond the ones specifically named. That is, how would we type the following using the above TypedDict proposal?

```
def foo(*, required: int, non_required: str = ..., **kwargs: int) -> None: ...

# valid.
foo(required=1)

# valid.
foo(required=1, extra=7)

# invalid: expected int, got str.
foo(required=1, extra="wrong type")
```

One option is to simply require users to type out the named keyword-only parameters by hand (as done above for `foo`). This wouldn't allow multiple such functions to share the same keyword-only parameters, but would not require any other changes.

A more long-term option is to allow open-ended TypedDicts - ones that allow arbitrary fields other than the named fields. I believe there was some discussion about this earlier, but there was no resulting PEP there: https://mail.python.org/archives/list/typing-sig@python.org/thread/66RITIHDQ.... This might be impractical to wait for.

Yet another option is to always allow arbitrary kwargs beyond the fields in the TypedDict. I'm against this because it won't let us specify that we want a finite set of keyword-only parameters, like in the `json.loads` example above.

In any case, this is a backward-compatible feature that we can defer for this discussion.

On Sun, Feb 7, 2021 at 3:08 PM Eric Traut <eric@traut.com> wrote:
-- S Pradeep Kumar
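
The mapping in the proposal (required fields become required keyword arguments, non-required fields become optional ones, anything else is rejected) can be approximated at runtime with the `__required_keys__` and `__optional_keys__` attributes that TypedDict classes expose in Python 3.9 and later. The sketch below is only an illustration of those semantics, not how a static type checker would implement them: `check_call` is a hypothetical helper, `json_decoder: type` stands in for `Type[JSONDecoder]`, and only keyword names are checked, not value types.

```python
from typing import Any, Callable, Dict, List, TypedDict


class JSONDecoderKwargs(TypedDict, total=False):
    # Non-required field: corresponds to a keyword parameter with a default.
    parse_int: Callable[[str], Any]


class LoadKwargs(JSONDecoderKwargs):
    # Required field: corresponds to a keyword parameter without a default.
    # `type` is a stand-in for Type[JSONDecoder] (assumption for brevity).
    json_decoder: type


def check_call(schema: Any, kwargs: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: report missing and unexpected keyword names."""
    required = schema.__required_keys__
    allowed = required | schema.__optional_keys__
    supplied = set(kwargs)
    errors = [f"missing keyword argument: {k}" for k in sorted(required - supplied)]
    errors += [f"unexpected keyword argument: {k}" for k in sorted(supplied - allowed)]
    return errors


# Mirrors the valid and invalid `loads(...)` calls from the message above.
ok = check_call(LoadKwargs, {"json_decoder": object})
missing = check_call(LoadKwargs, {})
unexpected = check_call(LoadKwargs, {"json_decoder": object, "foo": 2})
```

Note that `LoadKwargs` inherits `parse_int` as non-required even though `LoadKwargs` itself is total, which matches the "preserving their non-requiredness" comment in the example.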

Some other considerations:

+ Using `kwargs` within the function body: Within the function body, `kwargs` is treated as having the TypedDict type that was unpacked (`MovieKwargs` in the example below).

```
class MovieKwargs(TypedDict):
    name: str
    year: int

def foo(**kwargs: **MovieKwargs) -> None:
    name: str = kwargs["name"]
    year: int = kwargs["year"]

    # => MovieKwargs
    reveal_type(kwargs)

    # invalid because `name` is `str`.
    kwargs["name"] = None

    # invalid because `year` is required.
    del kwargs["year"]

    # invalid.
    kwargs["extra"] = 1
```

+ What about compatibility checks for functions?

*Rule*: When checking compatibility for functions that use `**Kwargs`, we will have to unpack the TypedDict fields and treat them as we treat explicit keyword-only parameters.

```
class MovieBase(TypedDict):
    name: str

class Movie(MovieBase, total=False):
    year: int

def foo(**kwargs: **Movie) -> None: ...

# foo must be treated as:
def foo(*, name: str, year: int = ...) -> None: ...
```

If we didn't do the above, we would have unintuitive behavior because of the compatibility rules for TypedDict. (PEP 589 treats TypedDict value types invariantly, since TypedDict objects are mutable.) This is not a consideration when checking compatibility of functions with `**kwargs`, as far as I can see.

```
class ExpectsOptionalInt(Protocol):
    def foo(self, *, x: Optional[int]) -> None: ...

class ExpectsInt(Protocol):
    def foo(self, *, x: int) -> None: ...

def foo() -> None:
    x: ExpectsOptionalInt
    # valid
    y: ExpectsInt = x
```

However, if we naively checked whether the TypedDict `Kwarg1` is compatible with `Kwarg2`, then we would get the following:

```
class OptionalIntKwargs(TypedDict):
    x: Optional[int]

class IntKwargs(TypedDict):
    x: int

class ExpectsOptionalInt(Protocol):
    def foo(self, **kwargs: **OptionalIntKwargs) -> None: ...

class ExpectsInt(Protocol):
    def foo(self, **kwargs: **IntKwargs) -> None: ...

def foo() -> None:
    x: ExpectsOptionalInt
    # We would consider this invalid!
    y: ExpectsInt = x

# That is because we would check if `IntKwargs` was compatible with `OptionalIntKwargs`.
# It is not compatible because of the TypedDict compatibility rules.
# i.e.,
x: IntKwargs
y: OptionalIntKwargs = x  # invalid
```

Another example: a function with a non-required keyword-only parameter is compatible with a function that has a required keyword-only parameter.

```
class NonRequired(Protocol):
    def foo(self, *, x: int = ...) -> None: ...

class Required(Protocol):
    def foo(self, *, x: int) -> None: ...

def foo() -> None:
    x: NonRequired
    # valid
    y: Required = x
```

That wouldn't be the case if we tried to check whether a TypedDict with a required field was compatible with a TypedDict having a non-required field.

The proposed rule above also answers the other issues raised in the TypedDict PEP.

+ What if the type supplied to `**Kwargs` is a Union?

Eric had raised this question. For the time being, we can consider only concrete TypedDicts. (I imagine unpacking a Union of TypedDicts would give us multiple overloads, but I haven't thought about complications like the ordering.)

On Sun, Feb 7, 2021 at 5:43 PM S Pradeep Kumar <gohanpra@gmail.com> wrote:
-- S Pradeep Kumar
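
The compatibility rule above ("unpack the TypedDict fields and treat them as explicit keyword-only parameters") can be sketched at runtime with `inspect.Signature`. This is only an approximation of what a type checker would do internally; `expand_to_signature` is a hypothetical helper, and it assumes Python 3.9 or later for `__required_keys__`.

```python
import inspect
from typing import Any, TypedDict, get_type_hints


class MovieBase(TypedDict):
    name: str


class Movie(MovieBase, total=False):
    year: int


def expand_to_signature(schema: Any) -> inspect.Signature:
    """Hypothetical helper: view a TypedDict as keyword-only parameters."""
    params = [
        inspect.Parameter(
            name,
            inspect.Parameter.KEYWORD_ONLY,
            # Required fields get no default; non-required fields get `...`
            # as a placeholder default, as in the stub-style examples above.
            default=inspect.Parameter.empty if name in schema.__required_keys__ else ...,
            annotation=annotation,
        )
        for name, annotation in get_type_hints(schema).items()
    ]
    return inspect.Signature(params)


signature = expand_to_signature(Movie)

# Binding succeeds when the required `name` keyword is present.
bound = signature.bind(name="Alien")
```

Once both functions are expressed this way, the usual parameter-by-parameter rules apply, so a parameter typed `Optional[int]` can stand in for one typed `int`, which the TypedDict-to-TypedDict comparison would reject.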

participants (3): Eric Traut, Paul Bryan, S Pradeep Kumar