I am excited about the potential of the new PEP 634-636 "match" statement to match JSON data received by Python web applications. Often this JSON data is in the form of structured dictionaries (TypedDicts) containing Lists and other primitives (str, float, bool, None).

PEP 634-636 already contain the ability to match all of those underlying data types except for TypedDicts, so I'd like to explore what it might look like to match a TypedDict...

Consider an example web application that wants to provide a service to draw shapes, perhaps on a connected physical billboard.

The service has a '/draw_shape' endpoint which takes a JSON object (a Shape TypedDict) describing a shape to draw:

    from bottle import HTTPResponse, request, route
    from typing import Literal, TypedDict, Union

    class Point2D(TypedDict):
        x: float
        y: float

    class Circle(TypedDict):
        type: Literal['circle']
        center: Point2D  # has a nested TypedDict!
        radius: float

    class Rect(TypedDict):
        type: Literal['rect']
        x: float
        y: float
        width: float
        height: float

    Shape = Union[Circle, Rect]  # a Tagged Union / Discriminated Union

    @route('/draw_shape')
    def draw_shape() -> None:
        match request.json:  # a Shape?
            ...
            case _:
                return HTTPResponse(status=400)  # Bad Request

Now, what syntax could we have at the ... inside the "match" statement to effectively pull apart a Shape?

The current version of PEP 634-636 would require duplicating all the keys and value types that are defined in Shape's underlying Circle and Rect types:

    match request.json:  # a Shape?
        case {'type': 'circle', 'center': {'x': float(), 'y': float()},
              'radius': float()} as circle:
            draw_circle(circle)  # type is inferred as Circle
        case {'type': 'rect', 'x': float(), 'y': float(),
              'width': float(), 'height': float()} as rect:
            draw_rect(rect)  # type is inferred as Rect
        case _:
            return HTTPResponse(status=400)  # Bad Request

Wouldn't it be nicer if we could use class patterns instead?

    match request.json:  # a Shape?
        case Circle() as circle:
            draw_circle(circle)
        case Rect() as rect:
            draw_rect(rect)
        case _:
            return HTTPResponse(status=400)  # Bad Request

Now that syntax almost works except that Circle and Rect, being TypedDicts, do not support isinstance() checks. PEP 589 ("TypedDict") did not define how such isinstance() checks should work initially because it's somewhat complex to specify. From the PEP:
In particular, TypedDict type objects cannot be used in isinstance() tests such as isinstance(d, Movie). The reason is that there is no existing support for checking types of dictionary item values, since isinstance() does not work with many PEP 484 types, including common ones like List[str]. [...] This is consistent with how isinstance() is not supported for List[str].
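The restriction quoted from PEP 589 can be observed directly at runtime. The following is a minimal sketch, not part of the original message: the `Movie` class mirrors the name used in the PEP quote, and `supports_isinstance` is a hypothetical helper introduced only for illustration.

```python
from typing import TypedDict


class Movie(TypedDict):
    name: str
    year: int


def supports_isinstance(value: object, cls: type) -> bool:
    """Report whether isinstance() accepts `cls` as its second argument."""
    try:
        isinstance(value, cls)
    except TypeError:
        # TypedDict classes raise TypeError for instance and class checks.
        return False
    return True


movie = {"name": "Blade Runner", "year": 1982}
print(supports_isinstance(movie, Movie))  # the TypedDict class rejects the check
print(supports_isinstance(movie, dict))   # a plain dict check is accepted
```

Since a class pattern such as `case Circle():` relies on the same instance check, it would be expected to fail in the same way today.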
Well, what if we (or I) took the time to specify how isinstance() worked with TypedDict? Then the match syntax above with TypedDict as a class pattern would work!

Refining the example above even further, it would be nice if we didn't have to enumerate all the different types of Shapes directly in the match-statement. What if we could match on a Shape directly?

    match request.json:  # a Shape?
        case Shape() as shape:
            draw_shape(shape)
        case _:
            return HTTPResponse(status=400)  # Bad Request

Now for that syntax to work it must be possible for an isinstance() check to work on a Shape, which is defined to be a Union[Circle, Rect], and isinstance() checks also aren't currently defined for Union types. So it would be useful to define isinstance() for Union types as well.

Of course that match-statement is now simple enough to just be rewritten as an if-statement:

    if isinstance(shape := request.json, Shape):
        draw_shape(shape)
    else:
        return HTTPResponse(status=400)  # Bad Request

Now *that* is a wonderfully short bit of parsing code that results in well-typed objects as output. 🎉

So to summarize, I believe it's possible to support really powerful matching on JSON objects if we just define how isinstance() should work with a handful of new types.

In particular the above example would work if isinstance() were defined for:

* Union[T1, T2], Optional[T]
* T extends TypedDict
* Literal['foo']

For arbitrary JSON beyond the example, we'd also want to support isinstance() for:

* List[T]

We already support isinstance() for the other JSON primitive types:

* str
* float
* bool
* type(None)

So what do folks think? If I were to start writing a PEP to extend isinstance() to cover at least the above cases, would that be welcome?

--
David Foster | Seattle, WA, USA
Contributor to TypedDict support for mypy
You nerd-sniped me there. :-)

I think this is perhaps too complicated to attempt to make it all work.

- We intentionally don't support things like `isinstance(x, List[str])` because that would require checking all the items with `isinstance(item, str)`, and that seems a speed trap. Reverting this decision would be hard work.

- `isinstance(3, float)` returns False, because at runtime int and float are distinct types, even though static type checkers regard int as a subtype of float (with some caveats, at least in mypy's case, but that's basically how it works). Changing this behavior at runtime will break tons of existing code that tries to distinguish ints from floats but checks for float first (since everyone "knows" they are distinct types), so this would be even harder to get through than the previous bullet.

- Currently static type checkers don't allow defining any methods (even class methods) in TypedDict classes, so you can't manually add an `__instancecheck__` method to a TypedDict class. (But we could add it to typing.TypedDict of course.)

- There's also the issue that bool is a subtype of int. Again, very hard to change that without breaking code.

- Some static type checkers (mypy, but not pyright -- haven't tried others yet) disallow `isinstance(x, SomeTypedDict)` -- presumably because they are aware of the problems above.

Probably the best you can do is write your own recursive isinstance-lookalike that has the behavior you need for validating JSON. But then you're no better off than any other JSON validation framework (and I expect there already to be some that introspect TypedDict subclasses).

I suppose you could come up with some mechanism whereby you can create a parallel hierarchy of classes that do support isinstance(), so you could write e.g.

    ICircle = make_interface(Circle)
    IRect = make_interface(Rect)
    # etc.

    def draw_shape():
        match request.json:
            case ICircle(center, radius):
                ...
            case IRect(x, y, width, height):
                ...
            ...

but this loses much of the original attractiveness.

--Guido
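The runtime behaviors listed in the bullets above can be checked against a decoded JSON payload. This is a small illustrative sketch: the payload and the helper name `generic_check_allowed` are assumptions made up for the example, not something from the thread.

```python
import json
from typing import List

# json.loads() returns an int for a number written without a fraction part,
# so a value a static checker would accept as "float" is an int at runtime.
payload = json.loads('{"radius": 2, "visible": true}')

radius_is_float = isinstance(payload["radius"], float)
visible_is_int = isinstance(payload["visible"], int)  # bool subclasses int


def generic_check_allowed() -> bool:
    """Report whether isinstance() accepts a subscripted generic like List[str]."""
    try:
        isinstance(["a"], List[str])
    except TypeError:
        return False
    return True


print(radius_is_float)          # False: int and float are distinct at runtime
print(visible_is_int)           # True: a JSON boolean also passes an int check
print(generic_check_allowed())  # False: List[str] is rejected by isinstance()
```

A `float()` class pattern in a match statement uses the same instance check, so a payload with `"radius": 2` would presumably fall through to the `case _` branch.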
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/Y2EJEZ...
Code of Conduct: http://python.org/psf/codeofconduct/
> Probably the best you can do is write your own recursive isinstance-lookalike that has the behavior you need for validating JSON.
This is already done in the form of pydantic <https://pydantic-docs.helpmanual.io/>, which is the backbone of one of the fastest growing Python web frameworks, FastAPI <https://fastapi.tiangolo.com/>. (Admission: I'm one of the authors of pydantic.)

Pydantic doesn't yet support TypedDict, but there's an issue <https://github.com/samuelcolvin/pydantic/issues/760> to implement it.

---

More generally, I think runtime type checking was never the intention for type hints; in fact I think Guido specifically said somewhere that runtime type checking was not an intended use case (am I right?). However, pydantic and a few other projects seem to me to have been very successful in using them for that anyway. I'd love it if Python's developers could be a bit more supportive of runtime type inspection in future.

Match statements look very interesting, thanks for alerting me to the PEP.

Samuel
--
Samuel Colvin
On Sun, Nov 22, 2020 at 1:55 PM Samuel Colvin <samcolvin@gmail.com> wrote:
> More generally, I think runtime type checking was never the intention for type hints, in fact I think Guido specifically said somewhere that runtime type checking was not an intended use case (am I right?). However pydantic and a few other projects seem to me to have been very successful in using them for that anyway. I'd love it if python's developers could be a bit more supportive to runtime type inspection in future.
I've always held that it must be *possible* to introspect annotations at runtime, and I believe that we occasionally had to tweak static typing features to support this. But that's not the same as supporting isinstance() where the second argument is a generic type or some other special form like Any or Callable[...].

That said, if you have a specific wish, please start a new thread on typing-sig and we'll discuss it.

--
--Guido van Rossum (python.org/~guido)
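Runtime introspection of annotations, as distinct from isinstance() support, can be sketched with the standard typing helpers. This example reuses the Circle and Point2D definitions from the start of the thread and assumes Python 3.9 or later for `__required_keys__`.

```python
from typing import Literal, TypedDict, get_args, get_origin, get_type_hints


class Point2D(TypedDict):
    x: float
    y: float


class Circle(TypedDict):
    type: Literal['circle']
    center: Point2D
    radius: float


# get_type_hints() returns the evaluated annotations keyed by field name.
hints = get_type_hints(Circle)

# Special forms can be taken apart with get_origin() and get_args().
tag_origin = get_origin(hints['type'])
tag_values = get_args(hints['type'])

print(sorted(hints))                     # field names declared on Circle
print(tag_origin is Literal)             # the 'type' field is a Literal form
print(tag_values)                        # the allowed literal values
print(sorted(Circle.__required_keys__))  # keys that must be present
```

This is the kind of information a validation library can read at runtime without isinstance() having to understand TypedDict or Literal itself.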
On 11/22/20 10:15 AM, Guido van Rossum wrote:
> - We intentionally don't support things like `isinstance(x, List[str])` because that would require checking all the items with `isinstance(item, str)`, and that seems a speed trap. Reverting this decision would be hard work.
Aye. I imagine many folks would expect isinstance() to be "fast" and altering it to do recursive checks on its argument would lose its current O(1) time.

I imagine I *could* implement my own kind of isinstance() that would work on TypedDict values that would still be recognized by typecheckers. Say I write an implementation for the following method:

    from typing import Optional, Type, TypeVar, TypedDict

    TD = TypeVar('TD', bound=TypedDict)

    def try_cast(type: Type[TD], value: object) -> Optional[TD]:
        """Returns `value` if it can be parsed as a `type`, otherwise None."""
        raise NotImplementedError()

Then I could use that method in a similar way as my earlier example to parse a value very concisely:

    if (shape := try_cast(Shape, request.json)) is not None:
        draw_shape(shape)  # is narrowed to Shape
    else:
        return HTTPResponse(status=400)  # Bad Request

Going further, I could extend try_cast() to accept any (non-None) JSON-like value as the top-level object (not just TypedDicts):

    from typing import Dict, List, Optional, Type, TypeVar, TypedDict, Union

    TD = TypeVar('TD', bound=TypedDict)
    JsonValue = Union[
        TD,
        Dict[str, 'OptionalJV'],
        List['OptionalJV'],
        Dict,  # heterogeneous Dict
        List,  # heterogeneous List
        float,
        int,  # because json.loads may return an int when parsing a number
        str,
        bool,
    ]
    JV = TypeVar('JV', bound=JsonValue)
    OptionalJV = TypeVar('OptionalJV', bound=Union[JsonValue, None])

    def try_cast(type: Type[JV], value: object) -> Optional[JV]:
        """Returns `value` if it can be parsed as a `type`, otherwise None."""
        raise NotImplementedError()

Now, I'm not sure if mypy can handle that kind of recursive TypedDict definition :), but it *will* work at runtime.

I'll see about implementing a function like try_cast() as a separate package. This should be fun. :)

--
David Foster | Seattle, WA, USA
Contributor to TypedDict support for mypy
I recommend taking this to typing-sig...

--
--Guido (mobile)
participants (3)
- David Foster
- Guido van Rossum
- Samuel Colvin