It’s a usability issue; mappings are used quite differently than sequences. Compare to class patterns rather than sequence patterns.
From PEP 636 (Structural Pattern Matching):
> Mapping patterns: {"bandwidth": b, "latency": l} captures the
> "bandwidth" and "latency" values from a dict. Unlike sequence patterns,
> extra keys are ignored.
It surprises me that ignoring extra keys would be the *default*
behavior. This seems unsafe. I would think extra keys are best
treated as suspicious by default.
* Ignoring extra keys loses data silently. In the current proposal:
    point = {'x': 1, 'y': 2, 'z': 3}
    match point:
        case {'x': x, 'y': y}:  # MATCHES, losing z O_O
            pass
        case {'x': x, 'y': y, 'z': z}:  # will never match O_O
            pass
* Ignoring extra keys is inconsistent with the handling of sequences: We
don't allow extra items when using a destructuring assignment to a sequence:
    p = [1, 2, 3]
    [x, y, z] = p
    [x, y] = p  # ERROR: ValueError: too many values to unpack :)
* Ignoring extra keys in mapping patterns is inconsistent with the
current proposal for how sequence patterns match data:
    point = [1, 2, 3]
    match point:
        case [x, y]:  # notices extra value and does NOT match :)
            pass
        case [x, y, z]:  # matches :)
            pass
* Ignoring extra keys is inconsistent with TypedDict's default "total"
matching behavior:
    from typing import TypedDict

    class Point2D(TypedDict):
        x: int
        y: int

    p1: Point2D = {'x': 1, 'y': 2}
    p2: Point2D = {'x': 1, 'y': 2, 'z': 3}  # ERROR: Extra key 'z' for TypedDict "Point2D" :)
* It is *possible* to force an exact key match with a pattern guard but
it's clumsy to do so.
It should not be clumsy to parse strictly.
    point = {'x': 1, 'y': 2, 'z': 3}
    match point:
        # notices extra value and does NOT match, but requires ugly guard :/
        case {'x': x, 'y': y, **rest} if rest == {}:
            pass
        case {'x': x, 'y': y, 'z': z, **rest} if rest == {}:
            pass
To avoid the above problems, **I'd advocate for disallowing extra keys
in mapping patterns by default**. For cases where extra keys should be
explicitly allowed and ignored, I propose allowing a **_ wildcard.
Some examples that illustrate behavior when *disallowing* extra keys in
mapping patterns:
1. Strict parsing
    from typing import TypedDict, Union

    Point2D = TypedDict('Point2D', {'x': int, 'y': int})
    Point3D = TypedDict('Point3D', {'x': int, 'y': int, 'z': int})

    def parse_point(point_json: dict) -> Union[Point2D, Point3D]:
        match point_json:
            case {'x': int(x), 'y': int(y)}:
                return Point2D({'x': x, 'y': y})
            case {'x': int(x), 'y': int(y), 'z': int(z)}:
                return Point3D({'x': x, 'y': y, 'z': z})
            case _:
                raise ValueError(f'not a valid point: {point_json!r}')
2. Loose parsing, discarding unknown data.
Common when reading JSON-like data when it's not necessary to output
it again later.
    from typing import Dict, TypedDict

    TodoItem_ReadOnly = TypedDict('TodoItem_ReadOnly', {'title': str, 'completed': bool})

    def parse_todo_item(todo_item_json: Dict) -> TodoItem_ReadOnly:
        match todo_item_json:
            case {'title': str(title), 'completed': bool(completed), **_}:
                return TodoItem_ReadOnly({'title': title, 'completed': completed})
            case _:
                raise ValueError()

    input = {'title': 'Buy groceries', 'completed': True, 'assigned_to': ['me']}
    print(parse_todo_item(input))  # prints: {'title': 'Buy groceries', 'completed': True}
3. Loose parsing, preserving unknown data.
Common when parsing JSON-like data when it needs to be round-tripped
and output again later.
    from typing import Any, Dict, TypedDict

    TodoItem_ReadWrite = TypedDict('TodoItem_ReadWrite', {'title': str, 'completed': bool, 'extra': Dict[str, Any]})

    def parse_todo_item(todo_item_json: Dict) -> TodoItem_ReadWrite:
        match todo_item_json:
            case {'title': str(title), 'completed': bool(completed), **extra}:
                return TodoItem_ReadWrite({'title': title, 'completed': completed, 'extra': extra})
            case _:
                raise ValueError()

    def format_todo_item(item: TodoItem_ReadWrite) -> Dict:
        return {'title': item['title'], 'completed': item['completed'], **item['extra']}

    input = {'title': 'Buy groceries', 'completed': True, 'assigned_to': ['me']}
    output = format_todo_item(parse_todo_item(input))
    print(output)  # prints: {'title': 'Buy groceries', 'completed': True, 'assigned_to': ['me']}
Comments?
--
David Foster | Seattle, WA, USA
Contributor to TypedDict support for mypy
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/ZPUVT7AF67VKNLSSGUHOBIM5F46ZEE77/
--
--Guido (mobile)