Any way to express that a dict is expected to have a key(s) but don't care about other keys?
On Wed, Mar 4, 2020 at 16:12 Brett Cannon <brett@python.org> wrote:

Looking at TypedDict, it seems that it does not like the idea of superfluous keys existing in a dict. In my case I have a REST call returning JSON for an error, and all I'm guaranteed is to get an "error" key. But there may be extra keys that provide more details about the specific error. Unfortunately I don't know the exhaustive list of keys, so I can't use a TypedDict with `total=False` to try to be complete.

I can't think of any way to say "require/expect this one key, but other keys are OK". Am I missing anything that would let me do this?
On Wed, Mar 4, 2020 at 4:14 PM Guido van Rossum <guido@python.org> wrote:

When you try the TypedDict, what error do you get? (I think totality should be true BTW.)

--Guido (mobile)

_______________________________________________
Typing-sig mailing list -- typing-sig@python.org
To unsubscribe send an email to typing-sig-leave@python.org
https://mail.python.org/mailman3/lists/typing-sig.python.org/
On Wed, Mar 4, 2020 at 5:04 PM Brett Cannon <brett@python.org> wrote:

Quick proof-of-concept code:

from typing import TypedDict

class ErrorOnly(TypedDict):
    error: str

def get_error() -> ErrorOnly:
    return {"error": "something bad happened", "details": "some stuff"}

def process_error(err: ErrorOnly) -> str:
    return err["error"]

process_error(get_error())

Result was an error on get_error():

A.py:7: error: Extra key 'details' for TypedDict "ErrorOnly"
Found 1 error in 1 file (checked 1 source file)
On Wed, 4 Mar 2020 at 17:45, Guido van Rossum <guido@python.org> wrote:

OK, that's unfortunate. There's a good reason for this very strict behavior of TypedDict, although I always forget what it is (maybe careful reading of PEP 589 would reveal it -- perhaps it's just due to mutability / variance, i.e. the same reason you can't pass a subclass of List to a function taking a List).

--Guido van Rossum (python.org/~guido)
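The mutability / variance concern can be illustrated with ordinary lists. This is only a sketch, and it assumes the remark refers to the invariance of `List` (a `List[Dog]` is not accepted where a `List[Animal]` is expected); the `Animal`, `Dog`, and `add_animal` names are made up for the illustration.

```python
from typing import List


class Animal:
    pass


class Dog(Animal):
    def bark(self) -> str:
        return "woof"


def add_animal(animals: List[Animal]) -> None:
    # Valid for a real List[Animal], harmful if the caller's list is a List[Dog].
    animals.append(Animal())


dogs: List[Dog] = [Dog()]
add_animal(dogs)  # a type checker rejects this call; at runtime it goes through

# The "list of dogs" now holds a plain Animal, so code relying on .bark() would break.
only_dogs = all(isinstance(item, Dog) for item in dogs)
```

The same kind of hole would open for a mutable dict if a value with extra, differently typed keys could be passed where the narrower TypedDict is expected.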
On Thu, 5 Mar 2020 at 5:40, Shantanu Jain <hauntsaninja@gmail.com> wrote:

One workaround is to nest the unknowable part of the dictionary:

```
class ErrorOnly(TypedDict):
    error: str
    data: Dict[str, Any]
```

PEP 589 says:

> The use of a key that is not known to exist should be reported as an error, even if this wouldn't necessarily generate a runtime type error. These are often mistakes, and these may insert values with an invalid type if structural subtyping hides the types of certain items. For example, d['x'] = 1 should generate a type check error if 'x' is not a valid key for d (which is assumed to be a TypedDict type).

Disallowing typos / dumb mistakes makes sense. I guess the point about structural subtyping is that if you had:

```
class SecretError(ErrorOnly):
    details: int
```

You wouldn't be able to prevent:

```
def f(e: ErrorOnly) -> None:
    e['details'] = 'something bad happened here'

f(secret_error)
```

I don't think either of those reasons would apply if we had an immutable TypedMapping, though.
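A minimal sketch of the nesting workaround, assuming the raw JSON arrives as a flat mapping and that it is acceptable to reshape it before typing it. The helper name `nest_extras` is hypothetical and not part of any library.

```python
from typing import Any, Dict, Mapping, TypedDict


class ErrorOnly(TypedDict):
    error: str
    data: Dict[str, Any]


def nest_extras(raw: Mapping[str, Any]) -> ErrorOnly:
    # Hypothetical helper: keep the guaranteed key, move every other key under "data".
    extras = {key: value for key, value in raw.items() if key != "error"}
    return {"error": raw["error"], "data": extras}


payload = {"error": "something bad happened", "details": "some stuff"}
nested = nest_extras(payload)
```

The cost of this approach is that the typed value no longer has the same shape as the JSON the server sent.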
On Thu, Mar 5, 2020 at 2:13 AM Elazar <elazarg@gmail.com> wrote:

That can be solved by making it explicit somehow, an "open dict".

class Error(TypedDict):
    error: str
    ...  # ellipsis

Or

class Error(TypedDict, open=True):
    error: str

Or some other syntax. Then forbid inheritance from such open dicts (or allow inheriting types to only have `Any` values, and forbid removal of values).

Elazar
On Thu, Mar 5, 2020 at 1:22 PM Brett Cannon <brett@python.org> wrote:

Yeah, I was looking for something like this "open dict" idea. Basically I want the type system to express that I guarantee some key(s), but others may exist, and you can use them as long as you do an appropriate check before trying to access them.

-Brett
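Without an "open dict" type, the "check before you access" behaviour can be approximated today by typing the payload loosely. This is a sketch under that assumption; `describe_error` and the `"details"` key are illustrative names, and the guarantee that `"error"` exists is only a convention here, not something the type checker enforces.

```python
from typing import Any, Mapping


def describe_error(payload: Mapping[str, Any]) -> str:
    # "error" is the one key the API is assumed to guarantee.
    message = payload["error"]
    # Any other key may or may not be present, so check before using it.
    details = payload.get("details")
    if isinstance(details, str):
        return f"{message}: {details}"
    return message
```

For example, `describe_error({"error": "boom"})` returns just the message, while a payload that also carries a string `"details"` value gets it appended.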
Similarly, but perhaps not _exactly_ the same issue, something like this could also open up the ability to assign a variable with a "smaller" `TypedDict` type to one that's "larger". Not being able to do this has caused me to bang my head against the wall a lot in the past, and I even implemented a special method + mypy plugin to allow me to do so, which feels very clunky.

e.g.

```
class APIData(TypedDict, open=True):
    id: int
    name: str

class ServiceData(TypedDict, total=False):
    id: int
    name: str
    title: str

...

api_data: APIData = {'id': 5, 'name': 'Jenn'}
service_data: ServiceData = api_data  # This errors with the existing API, but with `open=True` or a similar option, maybe not.
```
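One way to get from the "smaller" type to the "larger" one with the existing API is to build a new dict key by key. This is a sketch and not the plugin mentioned above; the helper name `to_service_data` is hypothetical, and `open=True` is left out because it is only proposed syntax.

```python
from typing import TypedDict


class APIData(TypedDict):
    id: int
    name: str


class ServiceData(TypedDict, total=False):
    id: int
    name: str
    title: str


def to_service_data(api_data: APIData) -> ServiceData:
    # Copy each known key explicitly so the result is checked against ServiceData.
    return {"id": api_data["id"], "name": api_data["name"]}


api_data: APIData = {"id": 5, "name": "Jenn"}
service_data = to_service_data(api_data)
service_data["title"] = "Engineer"  # allowed: "title" is a declared, non-required key
```

The copy is verbose and has to be updated whenever a key is added, which is presumably part of why a direct assignment would be preferable.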
Elazar wrote:

> That can be solved by making it explicit somehow, an "open dict".
>
> class Error(TypedDict):
>     error: str
>     ...  # ellipsis
>
> Or
>
> class Error(TypedDict, open=True):
>     error: str

Or

class Error(TypedDict, subset=True):
    error: str

Either way a PEP is probably necessary as it will require type checkers to be updated.

-Brett
It looks to me as if this was the end of the discussion about a way to allow extra fields in a `TypedDict`. Is this correct? (I am new to the list and just checked the history, but I might have missed something.) If not, there is no need to read further; please just point me to the current state of the discussion.

I would like to make a case for introducing the option to have extra fields in a `TypedDict` definition. I would also like to make a proposal of how to do it. I think this feature totally makes sense and it would help me in my Python programming. I would even (try to) write a PEP for this. I am new to this kind of discussion, and I have read a hint somewhere that before writing a PEP, one should discuss the idea here on the list.

I would be happy to get feedback and direction on writing a PEP: Does it make sense to write a PEP for this? Or are there some good reasons this would probably be a waste of time?

Thanks!

Jonathan

# Arguments in favor of optionally allowing extra fields

## "We are all consenting adults here".

When dealing with external data sources (JSON APIs and MongoDB have already been mentioned in the thread), superfluous keys might come, and the programmer might just want to ignore them. The argument for keeping the strict keys was to prevent the hiding of typos. This is a good argument for having key-strictness as the standard behavior of TypedDict. But it is not a sufficient argument against having a different optional behavior. IMHO, the programmer should have the choice to declare whether they want to accept that risk in exchange for some benefit they might prefer.

## Duck typing

Giving `TypedDict` the power to optionally allow extra keys is also in the spirit of duck typing. Static duck typing has been introduced with `typing.Protocol` (PEP 544). Allowing static duck typing for dictionaries is in the same spirit.

## Typing **kwargs

There is a discussion to introduce typing for `**kwargs` (https://github.com/python/mypy/issues/4441). Most naturally, `**kwargs` would be specified using a `TypedDict` definition. As `**kwargs` are by design unspecified, this would need a way to define a `TypedDict` which allows for the existence of unspecified keys.

# Proposal

I would suggest introducing one additional optional boolean parameter `extra_fields`, defaulting to `False`, to the constructor of `TypedDict`. The value should be stored in the `__extra_fields__` attribute of the TypedDict.

There should be _no_ specification for the types of the extra keys and values. First, if this gets pressing, it can be left for a later iteration without harm. Second, from my point of view, it looks like specifying the types of keys and values is somewhat against the idea of duck typing -- if you don't even want to specify the extra keys, that means your logic should not rely on them. This second argument is not an absolute argument against specifying, but more a point of mitigating the harm of not having these specifications, which strengthens the first point.

The name should be `extra_fields` or `additional_fields`. `extra_keys` or `additional_keys` is a worse name, because this would rather indicate a type definition on the extra keys. If later there should be type specifications for keys and values of the extra fields (because the arguments from the paragraph above are deemed invalid or insufficient), `extra_keys` and `extra_values` could be used for defining these.
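The intended semantics of the proposed flag can be sketched as a runtime check. This is an illustration only: `extra_fields` is the proposal's parameter and is not accepted by `TypedDict` today, so it appears here as an argument of a hypothetical `conforms` helper, and every declared key is assumed to be required.

```python
from typing import Any, Mapping, TypedDict, get_type_hints


class ErrorOnly(TypedDict):
    error: str


def conforms(data: Mapping[str, Any], td: type, extra_fields: bool = False) -> bool:
    # Hypothetical helper: key-level check only, value types are not inspected.
    declared = set(get_type_hints(td))
    if not declared <= set(data):
        return False  # a declared key is missing
    # With extra_fields=False (today's strict behaviour) undeclared keys are rejected.
    return extra_fields or set(data) <= declared


payload = {"error": "something bad happened", "details": "some stuff"}
strict_ok = conforms(payload, ErrorOnly)
open_ok = conforms(payload, ErrorOnly, extra_fields=True)
```

Under these assumptions the payload from the start of the thread fails the strict check and passes once extra fields are allowed.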
El jue, 10 mar 2022 a las 7:34, <j.scholbach@posteo.de> escribió:
It looks to me, as this was the end of the discussion about a way to allow extra fields to a `TypedDict`. Is this correct? (I am new to the list and just checked the history, but I might have missed something.) If no, there is no need to read further, please just point me to the current state of the discussion.
I would like to make a point for introducing the option to have extra fields in a `TypedDict` definition. I would also like to make a proposal of how to do it. I think this feature totally makes sense and it would help me in my Python programming. I would even (try to) write a PEP for this. I am new to this kind of discussion, and I have read a hint somewhere, that before writing a PEP, one should better discuss this here on the list.
Welcome to this list! Introducing extra_fields would be an interesting idea, but it should be kept separate from the idea of supporting TypedDict for kwargs, so it would be better to discuss it in a separate thread. If you're interested in pursuing the idea, I think you're on the right track: first get initial feedback on this list, then write a PEP and see it through the process.
I would be happy to get feedback and direction on writing a PEP: Does it make sense to write a PEP for this? Or are there some goods reasons this would probably be a waste of time?
Thanks!
Jonathan
# Arguments in favor of optionally allowing extra fields
## "We are all consenting adults here".
When dealing with external data sources (JSON APIs and mongo db have already been mentioned in the thread), superfluous keys might come, and the programmer might just want to ignore them. The argument for keeping the strict keys was to prevent the hiding of typos. This is a good argument for having key-strictness as the standard behavior of TypedDict. But it is not a sufficient argument for not having a different optional behavior. Imho, the programmer should have the choice to declare whether they want to face that risk in order to trade it in for some benefit they might prefer.
## Duck typing
Giving `TypedDict` the power to optionally allow extra keys is also in the spirit of duck typing. Static duck typing has been introduced with `typing.Protocol` (PEP-544) Allowing static duck typing for dictionaries is in the same spirit.
## Typing **kwargs
There is a discussion to introduce typing for `**kwargs` (https://github.com/python/mypy/issues/4441). Most naturally, `**kwargs` would be specified using a `TypedDict` definition. As `**kwargs` are by design unspecified, this would need a way to define a `TypedDict` which allows for the existence of unspecified keys.
# Proposal
I would suggest to introduce one additional optional boolean parameter `extra_fields`, defaulting to `False` to the constructor of `TypedDict`. The value should be stored in the `__extra_fields__` attribute of the TypedDict.
There should be _no_ specification for the types of the extra keys and values. First, if this gets pressing, it can be left for a later iteration without harm. Second, from my point of view, it looks like specifying the types of keys and values is somewhat against the idea of duck typing -- if you don't even want to specify the extra keys, that means your logic should not rely on them. This second argument is not an absolute argument against specifying, but more a point of mitigating the harm of not having these specifications, which strengthens the first point.
The name should be `extra_fields`, or `additional_fields`. `extra_keys` or `additional_keys` is a worse name, because this would rather indicate a type definition on the extra keys. If later there should be type specifications for keys and values of the extra fields (because the arguments from the paragraph above are deemed invalid or insufficient), `extra_keys` and `extra_values` could be used for defining these.
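Until something like `extra_fields` exists, the "require this key, tolerate the rest" behaviour can at least be approximated at runtime. The following is only a minimal sketch: `has_required_keys` is a hypothetical helper (not part of `typing`), it checks key presence only, not value types, and it does nothing for static type checkers.

```python
from typing import Any, Mapping, TypedDict


class ErrorOnly(TypedDict):
    error: str


def has_required_keys(value: Mapping[str, Any], schema: type) -> bool:
    """Return True if every required key of ``schema`` is present in ``value``.

    Hypothetical helper: extra keys are tolerated, value types are not checked.
    """
    # __required_keys__ is set on every TypedDict class (Python 3.9+).
    return schema.__required_keys__.issubset(value.keys())


payload = {"error": "something bad happened", "details": "some stuff"}
print(has_required_keys(payload, ErrorOnly))           # True
print(has_required_keys({"details": "x"}, ErrorOnly))  # False
```

Because the check relies only on `__required_keys__`, it works unchanged for `total=False` definitions, where fewer keys are required.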
I think this is a reasonable idea. Some things to consider...

You proposed that there should be _no_ specification of the types of the extra fields, which means they would implicitly have the type `Any`. It's worth considering whether it would be useful in a subset of cases to specify (and enforce) the type of the extra fields. And if so, would it be useful for that type to be generic?

A rule would need to be established for a TypedDict that derives from another TypedDict. If the base class specifies `extra_fields=True` but the derived class does not, is that an error? Does the derived class inherit the `extra_fields` property? What if both specify `extra_fields` but they contradict each other?

Here's an alternative formulation that addresses the above issues. We could support a special dunder field that type checkers recognize and treat specially. Let's say we call it `__extra__`. Then the TypedDict definition would look like the following:

```python
class A(TypedDict):
    x: int
    __extra__: str | None
```

The normal rules for variable inheritance would apply in a natural way.

```python
class B(A):
    pass  # B supports __extra__ because it inherits it from A

class C(A):
    __extra__: str  # This is OK because "str" is a subtype of "str | None"

class D(A):
    __extra__: int  # This is an error because "int" isn't a subtype of "str | None"
```

There has also been serious discussion of adding generic support for `TypedDict`, and this alternate formulation would work well for generics.

```python
class A(TypedDict, Generic[T]):
    foo: str
    __extra__: T

class B(A[int]):
    bar: str
```

-Eric

--
Eric Traut
Contributor to Pyright & Pylance
Microsoft
I have started a draft for a PEP for this. It is my first draft of a PEP. Any feedback and input on it is very much appreciated: https://github.com/jonathan-scholbach/peps/blob/main/pep-9999.rst

I wrote the draft before seeing your message, Eric Traut. I answer here and point to the draft where it makes sense.

A) Typing of extra fields: I have taken this into consideration, with some detailed reasoning in the draft. The bottom line is that I find it hard to come up with a use case for it. The problems you outline for the inheritance of the `__extra__` attribute (if it holds the value type constraint) could also be seen as an argument for keeping it simple (just a boolean flag). But if you could name a good use case for value type constraints on the extra fields, that would improve my understanding of the implications a lot.

B) Inheritance behavior of `extra` (I think that name is better than `extra_fields`; the argument for this is in the draft, too): The idea is to conceptualize `extra` in close analogy to `total` (see the reasoning in the draft). `total` has already shown that inheritance can lead to slight inconsistency (https://bugs.python.org/issue38834). I agree there should be an `__extra__` dunder attribute on the TypedDict, which behaves like a normal attribute under inheritance. But I still think it would be nicer syntax to have `extra` as a parameter of the constructor instead of writing the dunder attribute in the dictionary definition. In particular, it is relevant that

```python
class A(TypedDict):
    foo: str
    __extra__: str
```

is ambiguous: it could mean that a key `"__extra__"` with value type `str` would be enforced on the dictionary. But I agree with you that it is important to handle the fact that

```python
class A(TypedDict, extra=True):
    x: int

class B(A, extra=False):
    pass
```

leads to an inconsistency. The problem is that the class hierarchy would no longer be aligned with the type hierarchy. That is unexpected and a smell. For `total` this problem does not occur, because in something like

```python
class A(TypedDict, total=True):
    a: int

class B(A, total=False):
    pass

B.__required_keys__  # frozenset({'a'})
```

the child class's `total` specification affects only the keys that are added on the child class. The solution for `extra` would be to allow an inheriting class to change the value of `extra` only from `False` to `True`. I think that makes sense. I actually think analogous behavior for `total` would also make sense, because I consider

```python
class A(TypedDict, total=True):
    a: int
    b: str

class B(A, total=False):
    a: int

B.__required_keys__  # frozenset({'a', 'b'})
```

a gotcha. But this is probably out of scope here.

C) Generics: This is a good point. I think it makes sense to discuss this once A) is settled. I just don't know when a point can be considered "settled" :) I am sure this particular question is far from settled, but any feedback on how these discussions are led here is very much appreciated.
On Sun, 13 Mar 2022 at 12:51, <j.scholbach@posteo.de> wrote:

A) Typing of extra fields: I have taken this into consideration with some detailed reasoning on the draft. The baseline is that I find it hard to come up with a use case for that. The problems you line out for the inheritance of the __extra__ attribute (if it holds the value type constraint) could also be seen as a strengthener for the tendency to keep it simple (just a boolean flag). But if you could name a good use case for value type constraints on the extra fields, that would increase my understanding of the implications of this a lot.

I don't find this feature very compelling if there is no way to specify the type of the extra fields. TypedDicts in general already allow extra keys to exist (because they support structural subtyping), so if we can't say what the type of the extra keys is, we really don't gain much from this new feature. Your proposed PEP doesn't say much about what specific operations are allowed on an extra=True TypedDict but not a regular TypedDict.
I agree with Jelle. If there's no way to specify the type of the extra fields, then I don't think we would gain value from this PEP. You mentioned that the use of `__extra__` would lead to an ambiguity. The purpose of a specification is to remove such ambiguities, so I don't think it would be ambiguous. It would preclude the use of a TypedDict with an actual key named `__extra__`. Dundered names are supposed to be reserved for stdlib functionality, so using such a name as a key in your TypedDict would be ill advised. I'm open to other suggestions for how to specify the type of the extra entries, but I can't think of any better suggestions. Specifying the type as a class declaration argument isn't a good choice. -Eric
On Sun, 13 Mar 2022 at 13:35, Eric Traut (<eric@traut.com>) wrote:
I'm open to other suggestions for how to specify the type of the extra entries, but I can't think of any better suggestions. Specifying the type as a class declaration argument isn't a good choice.
Why not? I don't see a problem with something like this:

    class TD(TypedDict, extra=str):
        foo: int
        bar: float

Is your concern about TypeVar scope if we also added support for Generic TypedDicts?
Any time we allow types to be specified outside of type annotations, it tends to create problems. The interpreter must evaluate them immediately even if `from __future__ import annotations` is used. TypeVar scoping is probably the biggest problem in this particular case. Depending on the architecture of the type checker, there can also be issues with circular type references for types passed as arguments to the class declaration. -Eric
You mentioned that the use of `__extra__` would lead to an ambiguity. The purpose of a specification is to remove such ambiguities, so I don't think it would be ambiguous. It would preclude the use of a TypedDict with an actual key named `__extra__`. Dundered names are supposed to be reserved for stdlib functionality, so using such a name as a key in your TypedDict would be ill advised.
Of course the specification would need to find a way to resolve the ambiguity. Your idea was also my first idea: exclude the possibility of an "__extra__" key. I think that would be an unexpected and strange restriction from the user's perspective. "__extra__" is a valid dictionary key. While it is common to reserve dunder names for object attributes, this is not the case for dictionary keys. Not being able to use every possible dictionary key in a TypedDict specification, for reasons of internal implementation, looks like bad design to me.
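The ambiguity can be observed with the `typing` module as it stands. The sketch below only shows current runtime behaviour, in which a class-level `__extra__` annotation is recorded as an ordinary required key; the special meaning discussed above would be new behaviour layered on top.

```python
from typing import TypedDict


class A(TypedDict):
    foo: str
    __extra__: str  # today: an ordinary key literally named "__extra__"


# The dunder name is recorded like any other field.
print(sorted(A.__required_keys__))

# A dict with a literal "__extra__" key is an ordinary value of type A.
value: A = {"foo": "x", "__extra__": "y"}
print(value["__extra__"])
```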
Any time we allow types to be specified outside of type annotations, it tends to create problems. The interpreter must evaluate them immediately even if `from __future__ import annotations` is used.
How is this solved for `total`? Why would a similar solution as for `total` not be acceptable for `extra`?
You can use Sourcegraph to look for usages of `__extra__`: https://sourcegraph.com/search?q=context:global+__extra__:+lang:Python+fork:no+archived:no&patternType=literal. It only includes public code, but that still covers a lot of code. Almost all usages of `__extra__` come from a few in typing_extensions and are not in a TypedDict context, so the current number of conflicts is near 0. If it's not being used as a key today across many millions of lines of code, then it's pretty safe. So I lean towards the `__extra__` solution, as it is simple to describe and handles annotations as expected.

In theory, one way to handle conflicts would be something like

    class A(TypedDict, extra_key="__custom_extra__"):
        __custom_extra__: type_annotation

but I think the need to override `extra_key` will be rare enough that if it becomes an issue one day we can revisit it then. I recently had a similar discussion come up on a serialization library about what field name to use for serializing the class type. After discussion I ended up not allowing the field name to be customized, as the chance of conflict seemed too low and, in the worst case, it's backwards compatible to introduce `extra_key` in the future.
Just to close off a couple of open questions:
TypedDicts in general already allow extra keys to exist (because they support structural subtyping)
The point here was that structural subtypes (that is, "child" TypedDicts) can specify extra keys, and this has implications for type checking. Structural subtypes are subtypes, so when we type check code with a variable `x` of type `A` for some TypedDict `A`, it is already not correct to assume there are no extra keys. The value in `x` might actually have the schema specified by some sub-TypedDict `B` that adds additional fields, and therefore could include fields that weren't part of the schema of `A`.
How is this solved for `total`? Why would a similar solution as for `total` not be acceptable for `extra`?
The point about types outside of annotations was specific to the case where we pass a type in as `extra`, rather than a boolean. In that case, the evaluation model for the type could be complicated for a variety of reasons. It boils down to the fact that types are normal Python expressions, but when they are used as annotations they get special handling (for example, they are turned into strings when `from __future__ import annotations` is used). This special handling doesn't apply to types used in other syntactic places, such as in the arguments of a class declaration. This concern is specific to using a type; boolean flags like `total` (or `extra` as a boolean, as in the draft PEP) are just fine.
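The difference in evaluation can be seen with plain classes. This is a sketch, not TypedDict itself: `Base` is a hypothetical stand-in that accepts an `extra` class keyword, and a string annotation is used to mimic what `from __future__ import annotations` does to every annotation.

```python
class Base:
    # Hypothetical stand-in for a class that takes a type as a class keyword,
    # as in `class TD(TypedDict, extra=str)`.
    def __init_subclass__(cls, extra=None, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.extra = extra


class ViaAnnotation(Base):
    # A string annotation is stored as-is; `Later` need not exist yet.
    field: "Later"


try:
    # A class keyword is an ordinary expression, evaluated right here.
    class ViaKeyword(Base, extra=Later):
        pass
except NameError as exc:
    keyword_error = exc
else:
    keyword_error = None


class Later:
    pass


print(ViaAnnotation.__annotations__)  # {'field': 'Later'}
print(repr(keyword_error))
```

The forward reference is harmless in the annotation but fails as a class keyword, which is the kind of ordering problem a type-valued `extra` argument would inherit.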
I would like to continue the discussion we had about this in yesterday's meetup. (Thanks again to Pradeep for offering the opportunity to present the case in the meetup, and big thanks to Steven for clarifying the arguments and presenting the case. As a newcomer, I feel I am facing a very open, kind and supportive community.)

0. The problem diagnosis presented as a premise for the suggested changes was that TypedDicts, the way they are currently defined, do not treat values in a transitive way:

```python
class Parent(TypedDict):
    x: int

class Child(Parent):
    y: int

a: Parent = {"x": 0, "y": 0}  # forbidden, due to PEP-589
b: Child = {"x": 0, "y": 0}  # OK, because there is
a: Parent = b  # upcasting is allowed
```

Each of the last three lines makes sense in isolation. Combined, they reveal an oddity: even though `a` has the identical value in line 7 and line 9, it is allowed to assign the type `Parent` to it in line 9, but not in line 7. One could call this "intransitive", "incoherent" or even "inconsistent" behavior of the type-checking system. The name does not matter too much. The point is that there is a tension between allowing subtyping (effectively allowing extra fields on a type-compliant value) and forbidding extra fields on literals.

1. There was unresolved disagreement about whether the lack of transitivity should be considered problematic. In particular, the idea was expressed that consistency is not a value per se. It was also expressed that the confusion caused by this oddity is subjective and part of a learning curve. If we don't believe this is a problem, any further discussion of taking action would be moot. I would like to try to establish consensus that this should be considered a problem. This is the argument:

It is common to think of types as descriptions of sets of values. There are other definitions, but even PEP 483 ("The Theory of Type Hints") conceptualizes types as sets of values in order to explain the typing system. It is also common language in PEPs to speak of "the type of a certain value". When we have a non-transitive type, this concept is broken: a type then does not correspond to a set of values in an unambiguous way. Whether a literal value complies with the type is then not a question of the value and the type alone.

The notion of "type" can still be defined in a consistent and well-defined way; for instance, the type could be the definition of a set of statements. It is understandable that writers of type checkers might conceptualize it this way. Even though this is a possible and well-defined notion of types, it is far less intuitive for users of the type checkers. It is true that what one finds confusing depends on perspective and experience, and that every confusion can be overcome by individual learning. But requiring users to invest more effort into understanding the typing system has a price.

Even more fundamentally, the benefit of the typing system for the user is that they can use the type as a guarantee that it is safe to perform certain operations on a certain variable (and that doing so leads to predictable results). For that use case it is OK if the typing system is unable to tell whether a value complies with the type (and resorts to `Any`). But what we see here is different: the typing system gives contradictory information about whether the value complies with the type, depending on the context. Note that it does not say: "This value might comply with this TypedDict or not, I can't tell." That would be OK. Instead, on a sunny Monday before lunch the system claims: "Yes, this value complies with the type", and on a rainy Tuesday after teatime it states: "No, it doesn't". If types do not correspond to sets of values unambiguously, the benefit of the type checker is broken for the user.

2. Another main question was about actual use cases. Often, the code does not care about extra fields. Developers will often want to silently ignore extra fields. The use case is writing a test that this ignorance is a safe strategy and that the code is robust against extra keys. This use case is not limited to the boundaries of the code (validating incoming data). As values with extra fields could still comply through a subtype, this could happen anywhere in the code:

```python
from inspect import get_annotations
from typing import TypedDict

class Data(TypedDict):
    x: int
    y: int

def sum_data(data: Data):
    # Keep only the keys declared on Data before summing.
    trimmed = {key: data[key] for key in get_annotations(Data)}
    return sum(trimmed.values())

def test_sum_data():
    test_data: Data = {"x": 0, "y": 0, "z": 1}  # this throws a typing error
    assert sum_data(test_data) == 0
```

`data` being typed `Data` does not effectively prevent extra keys from being present on `data`. Writing the test that `sum_data` is robust against additional keys should be straightforward, but it is not.

3. There were detailed concerns about certain ways to change the behavior of TypedDicts. I think it makes sense to postpone these questions until we reach consensus, first, on whether this is a real problem and, second, on whether there are valid use cases where this problem shows up.
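Under the current rules, the robustness test described above can still be written by giving the extra key a home in a sub-TypedDict. This is a sketch of that workaround, not a proposal from the thread: `DataWithZ` is a hypothetical name, and `sum_data` reads the declared keys from `Data.__annotations__`.

```python
from typing import TypedDict


class Data(TypedDict):
    x: int
    y: int


class DataWithZ(Data):
    # Hypothetical subtype used only to carry the extra key in the test.
    z: int


def sum_data(data: Data) -> int:
    # Keep only the keys declared on Data; anything else is ignored.
    # (Static checkers may prefer literal keys in the subscript.)
    trimmed = {key: data[key] for key in Data.__annotations__}
    return sum(trimmed.values())


def test_sum_data() -> None:
    test_data: DataWithZ = {"x": 0, "y": 0, "z": 1}
    # Accepted by type checkers: DataWithZ is a structural subtype of Data.
    assert sum_data(test_data) == 0


test_sum_data()
```

The cost is one throwaway class per test shape, which is the boilerplate the proposal is trying to avoid.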
It is common to think of types as descriptions of sets of values
This is definitely a common philosophy, and probably the most helpful when looking at the runtime use of types.

There's another common view of types as representing "evidence" that all values have some property, and static type checking works more like this: we walk the code using typing rules and build little proofs that the types are valid.

But we'd still expect transitivity in the evidence-oriented philosophy. The current behavior is a bit like having a logic where implications aren't transitive. For example, if:

A := the value d is the evaluation of `{"x": 0, "y": 0}`
B := the value d is compatible with type `Child`
C := the value d is compatible with type `Parent`

Then the current typing rules specify that having evidence of `(A => B)` and `(B => C)` does not constitute evidence of `(A => C)`.

There's another school of thought that says typing rules are just arbitrary functions telling us whether a specific expression is compatible with a specific type. In this view, soundness and completeness are the only lenses we're allowed to use to evaluate typing rules, and we're always willing to give up completeness. Since transitivity is related to completeness (I think it's essentially the simplest possible completeness property), we are free to break it if we think there's some usefulness in doing so, and therefore the best approach comes down to use cases.

My view is that breaking transitivity is not good for the external data use case because it can confuse even expert users (as in this thread, where several users incorrectly concluded that TypedDict cannot be used for data where extra fields might be present).

But there's also the TypedDict-as-cheap-classes use case, where more rigid validation of literals could be pretty helpful. I'm not sure how to weigh this use case, especially since newer code probably leans toward dataclasses instead.

One thing we discussed was changing the error message to make the literal behavior clearer. One possible phrasing of this: "Extra fields are not permitted in literal assignments to TypedDicts, although at runtime they may be present in TypedDict values."
Speaking from the heart: I worry that this argument will never end, and spur ever longer, more esoteric posts.

I haven't seen a good use case for allowing dict literals with extra fields. The use case brought up in the meeting was something like

    from typing import TypedDict

    class Point(TypedDict):
        x: int
        y: int

    def validator(p: Point):
        if isinstance(p.get("x"), int) and isinstance(p.get("y"), int):
            return
        raise TypeError("not a point")

    def test():
        validator({"x": 0, "y": 42, "z": -1})  # Rejected by type checkers

but IMO this is a poorly designed API; it should be more like

    from typing import cast

    def validator(p: dict) -> Point:
        if isinstance(p.get("x"), int) and isinstance(p.get("y"), int):
            return cast(Point, p)
        raise TypeError("not a point")

--Guido
I hope it's not considered rude to "necro" a mailing list thread like this. But I can think of two big use cases that are very common in legacy code, including parts of the Python standard library.

(Example 1) Supporting the full schema of logging.config.dictConfig. See: https://github.com/python/typeshed/blob/0259068/stdlib/logging/config.pyi#L2...

(Example 2) Passing around kwargs. Adapted from https://github.com/python/typeshed/blob/0259068/stdlib/logging/__init__.pyi#...

```
class LoggerAdapter(Generic[_L]):
    logger: _L
    def process(self, msg: Any, kwargs: MutableMapping[str, Any]) -> tuple[Any, MutableMapping[str, Any]]: ...
```

Imagine if instead we could have this:

```
class _LoggerKwargs(TypedDict, open=True):
    exc_info: bool
    stack_info: bool
    extra: dict[str, Any]

class LoggerAdapter(Generic[_A, _B, _L]):
    logger: _L
    def process(self, msg: _A, kwargs: _LoggerKwargs) -> tuple[_B, _LoggerKwargs]: ...
```

If "open" TypedDicts are considered immutable, then this has the added benefit of discouraging mutation of the kwargs dict, which has always been a weird thing to do in my opinion, and is generally unnecessary now that we have `|` for dicts.
One possible phrasing of this: "Extra fields are not permitted in literal assignments to TypedDicts, although at runtime they may be present in TypedDict values."
I would certainly support this clarification in documentation related to TypedDict.

As a completely separate idea, it may be useful to make it easy to extract the extra field values from a runtime value of a TypedDict, and to explicitly copy those field values to a new TypedDict instance. Consider the following:

    class Point2D(TypedDict):
        x: int
        y: int

    class NamedPoint2D(Point2D):
        name: str

    n_point = NamedPoint2D(x=1, y=2, name='Center')

    P = TypeVar('P', bound=Point2D)

    # Swaps the "x" and "y" components of a Point2D,
    # while preserving its __extra__ fields.
    def transpose(p: P) -> P:
        return Point2D(  # maybe write P( here instead?
            x=p['y'],
            y=p['x'],
            **Point2D.__extra__(p))

    transposed_n_point = transpose(n_point)
    print(transposed_n_point)  # prints: {"x": 2, "y": 1, "name": "Center"}
    reveal_type(transposed_n_point)  # NamedPoint2D

The above code (1) uses a new @classmethod called `__extra__()` on the Point2D TypedDict that extracts any fields from the provided value that are not in Point2D's definition, and (2) allows a Point2D TypedDict to be constructed with a ** splat with statically unknown extra fields.

--
David Foster | Seattle, WA, USA
Contributor to TypedDict, mypy, and Python's typing system
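The runtime half of this idea can be sketched with what `typing` already exposes. This is only an illustration under stated assumptions: `extra_fields` is a hypothetical standalone function (not the proposed `__extra__()` classmethod), and the static typing question, namely getting `NamedPoint2D` back out of `transpose`, is left untouched, so the return type is a plain `dict`.

```python
from typing import Any, Mapping, TypedDict


class Point2D(TypedDict):
    x: int
    y: int


class NamedPoint2D(Point2D):
    name: str


def extra_fields(schema: type, value: Mapping[str, Any]) -> dict[str, Any]:
    """Return the items of ``value`` whose keys ``schema`` does not declare."""
    declared = schema.__annotations__  # includes inherited TypedDict fields
    return {k: v for k, v in value.items() if k not in declared}


def transpose(p: Point2D) -> dict[str, Any]:
    # Swap x and y, carrying over whatever else the runtime value holds.
    return {"x": p["y"], "y": p["x"], **extra_fields(Point2D, p)}


n_point = NamedPoint2D(x=1, y=2, name="Center")
print(transpose(n_point))  # {'x': 2, 'y': 1, 'name': 'Center'}
```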
On Sun, Apr 17, 2022 at 03:22 David Foster <davidfstr@gmail.com> wrote:
>> One possible phrasing of this: "Extra fields are not permitted in literal assignments to TypedDicts, although at runtime they may be present in TypedDict values."
>
> I would certainly support this clarification in documentation related to TypedDict.

In docs, sure.

> As a completely separate idea, it may be useful to make it easy to extract the extra field values from a runtime value of a TypedDict, and to explicitly copy those field values to a new TypedDict instance. Consider the following:
>
>     class Point2D(TypedDict):
>         x: int
>         y: int
>
>     class NamedPoint2D(Point2D):
>         name: str
>
>     n_point = NamedPoint2D(x=1, y=2, name='Center')
>
>     P = TypeVar('P', bound=Point2D)
>
>     # Swaps the "x" and "y" components of a Point2D,
>     # while preserving its __extra__ fields.
>     def transpose(p: P) -> P:
>         return Point2D(  # maybe write P( here instead?
>             x=p['y'],
>             y=p['x'],
>             **Point2D.__extra__(p))
>
>     transposed_n_point = transpose(n_point)
>     print(transposed_n_point)  # prints: {"x": 2, "y": 1, "name": "Center"}
>     reveal_type(transposed_n_point)  # NamedPoint2D
>
> The above code (1) uses a new @classmethod called __extra__() on the Point2D TypedDict that extracts any fields from the provided value that are not in Point2D's definition, and (2) allows a Point2D TypedDict to be constructed with a ** splat with statically-unknown extra fields.
I’d like to see some use cases for such syntax before we go there. And why would __extra__ need to be a Point2D method?
-- --Guido (mobile)
> I’d like to see some use cases for such syntax before we go there.

The "transpose" function above is a scenario. The related general use case is "I want to generate a derived version of a TypedDict instance, completely redefining all known field values, but also preserve any extra fields." This "copy-on-write" pattern seems to be relatively common in coding styles that avoid mutating data structures directly, such as in Clojure.

To be fair, the TypedDicts I use in my own code typically only exist for a short time before they are JSON-dumped or after they are JSON-parsed and then discarded (in a request-response cycle of a web app), so my own code doesn't typically have TypedDict instances that live long enough to make it useful to support fancy mutations on them.

> And why would __extra__ need to be a Point2D method?

Well, the implementation needs to be some kind of function that takes (1) the TypedDict instance and (2) the TypedDict type (like Point2D), because the type is erased from the instance at runtime.

In particular you could *not* write:

    p.__extra__()  # an instance method call

because "p" is a dict at runtime rather than a Point2D.

You *could* write a freestanding function like:

    typing.get_typeddict_extras(p, Point2D)

but it seems more succinct to just make it a class method on Point2D:

    Point2D.__extras__(p)  # use dunder to avoid clash with field names

or equivalently:

    Point2D._get_extras(p)  # use underscore to avoid clash with field names

--
David Foster | Seattle, WA, USA
Contributor to TypedDict, mypy, and Python's typing system
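For illustration, the freestanding variant mentioned in this message can be approximated with today's runtime introspection. This is only a sketch under stated assumptions: `get_typeddict_extras` is the hypothetical name from the message, not an existing `typing` function, and it relies on the `__required_keys__` and `__optional_keys__` attributes that TypedDict classes expose on Python 3.9+. A static checker does not know the types of the returned extras and may reject the `**` splat in the constructor call.

```python
from typing import Any, Dict, Mapping, TypedDict


class Point2D(TypedDict):
    x: int
    y: int


class NamedPoint2D(Point2D):
    name: str


def get_typeddict_extras(value: Mapping[str, Any], td_type: Any) -> Dict[str, Any]:
    """Hypothetical helper: return the items of value not declared on td_type."""
    # TypedDict classes record their declared keys in these two frozensets.
    known = td_type.__required_keys__ | td_type.__optional_keys__
    return {key: item for key, item in value.items() if key not in known}


def transpose(p: Point2D) -> Point2D:
    # Calling a TypedDict class at runtime just builds a plain dict,
    # so the undeclared keys pass through the ** splat unchanged.
    return Point2D(x=p["y"], y=p["x"], **get_typeddict_extras(p, Point2D))


n_point = NamedPoint2D(x=1, y=2, name="Center")
extras = get_typeddict_extras(n_point, Point2D)
transposed = transpose(n_point)
```

Passing the type explicitly is what makes this work: `n_point` is an ordinary `dict` at runtime and carries no record of which TypedDict it was built from.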
On Tue, Apr 19, 2022 at 5:47 AM David Foster <davidfstr@gmail.com> wrote:
>> I’d like to see some use cases for such syntax before we go there.
>
> The "transpose" function above is a scenario. The related general use case is "I want to generate a derived version of a TypedDict instance, completely redefining all known field values, but also preserve any extra fields." This "copy-on-write" pattern seems to be relatively common in coding styles that avoid mutating data structures directly, such as in Clojure.

Hm, that sounds like a rather theoretical "use case". I don't think we should try to add features to TypedDict to encourage its use for new coding styles, even though those styles are popular in other languages. TypedDict was accepted into the type system because we observed a pattern using dicts instead of classes in legacy code and felt it was important to be able to type-check such code.

> To be fair, the TypedDicts I use in my own code typically only exist for a short time before they are JSON-dumped or after they are JSON-parsed and then discarded (in a request-response cycle of a web app), so my own code doesn't typically have TypedDict instances that live long enough to make it useful to support fancy mutations on them.

Okay, so the use case isn't real in your code.

>> And why would __extra__ need to be a Point2D method?
>
> Well, the implementation needs to be some kind of function that takes (1) the TypedDict instance and (2) the TypedDict type (like Point2D), because the type is erased from the instance at runtime.
>
> In particular you could *not* write:
>
>     p.__extra__()  # an instance method call
>
> because "p" is a dict at runtime rather than a Point2D.
>
> You *could* write a freestanding function like:
>
>     typing.get_typeddict_extras(p, Point2D)
>
> but it seems more succinct to just make it a class method on Point2D:
>
>     Point2D.__extras__(p)  # use dunder to avoid clash with field names
>
> or equivalently:
>
>     Point2D._get_extras(p)  # use underscore to avoid clash with field names

But then every TypedDict-derived type would have that class method. A generic function makes more sense to me for this API design. But I am far from convinced that we need it.

--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
For what it's worth, at runtime it will work fine to do this with:

```
def transpose(p: P) -> P:
    return {
        **p,
        "x": p['y'],
        "y": p['x'],
    }
```

MyPy complains about this but Pyre and Pyright both accept it.
On 4/19/22 11:24 AM, Steven Troxler wrote:
> For what it's worth, at runtime it will work fine to do this with:
> ```
> def transpose(p: P) -> P:
>     return {
>         **p,
>         "x": p['y'],
>         "y": p['x'],
>     }
> ```
Thanks Steven! This is a much better solution to the "copy on write" use case. Succinct and works today. I wasn't aware you could put individual keys after a ** splat in a dictionary literal.
> MyPy complains about this but Pyre and Pyright both accept it.
Hmm, true (for the MyPy case). I don't think MyPy understands splats (**) inside dict literals that are intended to be used as a TypedDict. I expect it would take only a moderate amount of effort to add support if this notation became popular to use.

Cheers,
--
David Foster | Seattle, WA, USA
Contributor to TypedDict, mypy, and Python's typing system
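A self-contained version of the splat approach, as a sketch. In a dict display, later entries override earlier ones, so keys listed after `**p` replace the copied values while every other key is carried over. The `cast` call is an addition made here, not part of the original suggestion; it only tells a checker that does not follow the splat to trust the result type and has no effect at runtime.

```python
from typing import TypedDict, TypeVar, cast


class Point2D(TypedDict):
    x: int
    y: int


class NamedPoint2D(Point2D):
    name: str


P = TypeVar("P", bound=Point2D)


def transpose(p: P) -> P:
    # **p copies every key, declared or not; the explicit "x" and "y"
    # entries come later in the display and therefore win.
    return cast(P, {**p, "x": p["y"], "y": p["x"]})


n_point = NamedPoint2D(x=1, y=2, name="Center")
transposed_n_point = transpose(n_point)
```

The input dict is not modified; `transposed_n_point` is a new dict that still carries the `name` key.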
I really want this feature for the scenario where you are getting data from some API/library that doesn’t fully describe the payload but you know one or two keys that you use. For example, boto3 returns a bunch of untyped dicts but I know that I want payload["Key"] and that it is a string.

I see that one of the major issues that were brought up was describing the type of the “extra” keys (even if that is Any). Could we do this by allowing classes to subclass both TypedDict and Dict?

```
from typing import Dict, TypedDict

class Coefficients(Dict[str, float], TypedDict, allow_extra=True):
    x1: int
    x2: int

f1: Coefficients = {"x1": 1, "x2": 2, "x3": 3.3}  # ok
f2: Coefficients = {"x1": "bad", "x2": 2, "x3": 3.3}  # bad, x1 has wrong type
f3: Coefficients = {"x1": 1, "x2": 2, "x3": "bad"}  # bad, x3 has wrong type
```

I’m no type theorist but I think this transmits the idea that this is a dict with str keys and float values where some specific keys are narrowed to int (or it could be Dict[str, Any] with some keys of known types).

I’m not sure what the right thing to do is w.r.t. mutability, but I suppose this could be restricted to be Mapping instead of Dict if we wanted to only allow this pattern for immutable objects?
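The intended semantics of the three assignments can be spelled out as a runtime check. This is a hand-written sketch, not something `allow_extra` or any existing TypedDict feature provides; `check_coefficients` and `KNOWN_INT_KEYS` are hypothetical names. It assumes that ints are acceptable where floats are expected (as type checkers do) and that `bool` values are rejected.

```python
from typing import Any, List, Mapping

# Keys with a declared, narrower type; every other key is an "extra".
KNOWN_INT_KEYS = ("x1", "x2")


def check_coefficients(data: Mapping[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the dict conforms."""
    problems = []
    for key in KNOWN_INT_KEYS:
        if key not in data:
            problems.append(f"{key}: missing")
        elif isinstance(data[key], bool) or not isinstance(data[key], int):
            problems.append(f"{key}: expected int")
    for key, value in data.items():
        if key in KNOWN_INT_KEYS:
            continue
        # Extra keys are allowed, but their values must be numeric.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{key}: expected float")
    return problems


f1 = check_coefficients({"x1": 1, "x2": 2, "x3": 3.3})
f2 = check_coefficients({"x1": "bad", "x2": 2, "x3": 3.3})
f3 = check_coefficients({"x1": 1, "x2": 2, "x3": "bad"})
```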
I don't like this solution for this reason: in the example, Coefficients is not a subtype of Dict[str, float] even though it is subclassing it. I don't have strong feelings against solving this problem, but I think this particular solution would be problematic, and I do agree with Guido's comment above:
> I don't think we should try to add features to TypedDict to encourage its use for new coding styles, even though those styles are popular in other languages.
And in extension of that, I personally think boto3 should be encouraged to transition away from using dicts.
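One way to read the subtyping objection: `Dict` is mutable, so code that only knows about `Dict[str, float]` is free to store a float under any key, including `x1`. A minimal sketch, where `halve_all` is a hypothetical helper:

```python
from typing import Dict


def halve_all(values: Dict[str, float]) -> None:
    # Legal for any Dict[str, float]: each value may be replaced by a float.
    for key in list(values):
        values[key] = values[key] * 0.5


coefficients: Dict[str, float] = {"x1": 1, "x2": 2, "x3": 3.3}
halve_all(coefficients)
```

After the call, `coefficients["x1"]` is `0.5`, which no longer satisfies `x1: int`. A read-only `Mapping[str, float]` parameter would not permit this write, which is why the mutability question raised in the proposal matters.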
Would anyone like to write a PEP adding this functionality? For context, TypeScript has the ability to constrain the types of additional properties in this way. I would be happy to collaborate with someone.
participants (17)
- Adrian Garcia Badaracco
- Anton Agestam
- Brett Cannon
- Cohen Karnell
- David Foster
- Elazar
- Eric Traut
- Greg Werbin
- Guido van Rossum
- j.scholbach@posteo.de
- Jelle Zijlstra
- Justin Black
- Mehdi2277
- Shantanu Jain
- Steven Troxler
- Tuomas Suutari
- Tuomas Suutari