Per-field syntax for defining potentially missing fields in TypedDict
I was advised to post this here (instead of at https://github.com/python/mypy/issues/9867).

TL;DR: Allow an `Omittable[...]` annotation to mark fields that may not exist. The current way of doing this, with `total=False`, is verbose and forces you to break the natural grouping of data.

As I understand it, `total=False` was discussed and added in https://github.com/python/mypy/issues/2632. There's not much debate about the syntax there. The examples/suggestions are `MayExist[...]`, `NotRequired[...]`, `Checked[...]`, `OptionalItem[...]` or (the other way around) `Required[...]`. However, none of these were considered good enough, so `total=False` got chosen instead.

How about `Omittable[...]`? It's only one character longer than `Optional`, which would be the perfect name but of course cannot be used here.

A longer explanation of why we need this follows below; the same text, with Markdown formatting, is at https://github.com/python/mypy/issues/9867. Though, Guido already said that "There is not much disagreement on the problem".

---

We can define optional fields for `TypedDict` as in:

    class MyDictBase(TypedDict):
        required: str

    class MyDict(MyDictBase, total=False):
        optional: str

Compare this to some other languages (like TypeScript):

    interface Data {
        required: string;
        optional?: string;
    }

The problem here is the syntax: the Python version is less intuitive, more verbose (for short declarations), and it breaks the "flow" of fields.

**1. Less intuitive**

When you think about:

    {
        "required1": "...",
        "optional1": "...",
        "required2": "..."
    }

Why do you need to split the required and optional fields of one single structure into two classes? It's just... not nice.

**2. More verbose**

I understand that the current Python syntax saves characters when there are lots of optional fields. With a per-field marker, they would be written like this:

    class MyDict(TypedDict):
        optional1: Checked[str]
        optional2: Checked[str]
        optional3: Checked[str]
        ...
Omitting `Checked[]` saves 9 characters per field, and the result looks nicer:

    class MyDict(TypedDict, total=False):
        optional1: str
        optional2: str
        optional3: str
        ...

However, this only starts to make sense beyond a certain number of fields. If you have just a few fields, the drawback of having to break the data into two classes and separate fields from their context (see point 3 below) outweighs the benefit of saving a few characters.

Furthermore, other languages often solve the verbosity problem with `?` or similar:

    interface Data {
        required: string;
        optional1?: string;
        optional2?: string;
        optional3?: string;
        ...
    }

This adds only a single character per field, which is completely acceptable. Just as we are now getting `|` for `Union`, maybe we'll get `?` some day, and it will solve the problem of verbosity. Even without it, I would be glad to accept some visual clutter to avoid having to split my structure into two different sections.

**3. Breaks the "flow" of fields**

While the order of items in a dictionary-like structure is insignificant to a machine (though dicts have preserved insertion order by default since 3.6), it often has meaning to a human. For example, a structure in our application looks something like this, where "(O)" marks the optional fields:

    {
        "schema": "some-schema",
        "v": 1,

        "msg_id": "...",
        "flow_id": "(O) ...",

        "category": "...",
        "type": "(O) ...",

        "created_on": "...",
        "created_by": "...",
        "updated_on": "(O) ...",
        "updated_by": "(O) ..."
    }

As you can see, there are field "groups" of a sort there. This grouping/ordering must be broken when defining a TypedDict for the structure:

    class StructBase(TypedDict):
        schema: str
        v: int
        msg_id: str
        category: str
        created_on: str
        created_by: str

    class Struct(StructBase, total=False):
        flow_id: str
        type: str
        updated_on: str
        updated_by: str

The result is much more verbose and more difficult to follow.
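To make the two-class workaround concrete, here is a minimal sketch of how it behaves at runtime. It assumes Python 3.9 or newer, where `TypedDict` classes expose `__required_keys__` and `__optional_keys__`; the names `minimal`, `required_keys` and `optional_keys` are only illustrative and not part of any proposal.

```python
from typing import TypedDict


class StructBase(TypedDict):
    # Required keys: a type checker reports an error if any of them is missing.
    schema: str
    v: int
    msg_id: str
    category: str
    created_on: str
    created_by: str


class Struct(StructBase, total=False):
    # Optional keys: these may be absent from the dict altogether.
    flow_id: str
    type: str
    updated_on: str
    updated_by: str


# Only the required keys are supplied; the optional ones are simply left out.
minimal: Struct = {
    "schema": "some-schema",
    "v": 1,
    "msg_id": "...",
    "category": "...",
    "created_on": "...",
    "created_by": "...",
}

# The required/optional split is recorded on the class, but the original
# grouping of the fields (e.g. created_* next to updated_*) is lost.
required_keys = sorted(Struct.__required_keys__)
optional_keys = sorted(Struct.__optional_keys__)
print("required:", required_keys)
print("optional:", optional_keys)
```

The typing information ends up correct, but the reader has to mentally merge the two classes to see which fields belong together, which is exactly the complaint in point 3.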
Participants (10): Daniel Moisset, David Foster, Dominik Gabi, Eric Traut, Guido van Rossum, Jelle Zijlstra, layday, Shantanu Jain, Tuukka Mustonen, Vito De Tullio