
On Mon, Sep 21, 2020 at 6:45 AM Sebastian Rittau <srittau@rittau.biz> wrote:
This has previously been discussed here: https://github.com/python/typing/issues/566.
In typeshed we have found that there are quite a few cases where functions can return two or more different, incompatible types that can't be distinguished by the input types alone. A few examples:
* ip_address() can return either an IPv4Address or an IPv6Address, depending on the input string.
* float ** float returns either a float or a complex. Currently, typeshed marks this as returning just float.
* json.loads() et al. can return a selection of types, depending on the input.
* urlopen() can return either an HTTPResponse or an addinfourl.
There are more examples in the linked issue.
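To make the first two bullets concrete, here is a minimal sketch using only the standard library. The addresses are documentation-range placeholders chosen for illustration; the point is that the input types are identical in each pair while the runtime result types differ.

```python
from ipaddress import IPv4Address, IPv6Address, ip_address

# Same input type (str), different return type depending on the value.
v4 = ip_address("192.0.2.1")
v6 = ip_address("2001:db8::1")
print(type(v4).__name__, type(v6).__name__)

# Same operand types (float, float), different result type:
# a negative base with a non-integer exponent yields a complex in Python 3.
real_result = 2.0 ** 0.5
complex_result = (-8.0) ** 0.5
print(type(real_result).__name__, type(complex_result).__name__)
```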
It looks like there are two different categories of inputs: a serialized format (JSON, IP address, arg parsing, struct unpacking) and the float pow operator?

TBH, I don't see how a function that accepts a serialized input and decodes it to different output types could safely use this AnyOf concept. The example that springs to mind is JSON decoding, which IME frequently has unexpected values (untrusted user input; an old or inconsistent data format, e.g. "date" was encoded as an int timestamp, then later switched to an ISO timestamp; etc.), which causes problems much later on (call-stack-wise or calendar-wise) in some unrelated code. But, fundamentally, the same is true for the other examples.

In particular, the problematic case that pops into mind is how the type annotations of a helper function become untrustworthy, e.g. given this:

    def get_name(json_str) -> str:
        """Gets the username from the bla bla bla
        lots of lines"""
        return json.loads(json_str)

Then all I see -- because all I see is usages of get_name(), or I'm skimming the code because I don't have the time/desire to read the full implementation -- is "a function that'll give me the name from the json str", not "a function that *might* give me the name, but really just returns whatever object was in the json, and I need to be aware of that". I generally expect functions to be well behaved and to reject invalid input, so when one says it returns a str, I trust that it'll do that or that an exception is going to happen.

IME, manually narrowing a union is easy: assert with isinstance(). It's cheap, easy, and detects the failure at the point of origin.

FWIW, I think Pytype has (or had? pytype devs can correct me/clarify) some functionality where, in specific cases, usage of a Union value was considered valid as long as any of the contained types was satisfied. This was a behavior that I grew to dislike because it'd bite much later. I would think my code was type-safe, but then, months later, find some edge case where I forgot to do a pre-check and ended up treating an int as a str or some such. I *think* this behavior was introduced long ago for pragmatic reasons (it made it easier to introduce type checking), but I also think it's been cleaned up / restricted over time as typing has grown.

For float pow -- and I don't recall the exact API definitions -- isn't it well defined (I couldn't find the language def from a quick search)? i.e. if any input arg is complex, then the output is complex.
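On the isinstance() point above, a minimal sketch of what narrowing at the point of origin could look like for the get_name() helper. The "name" key and the exact error messages are assumptions made for illustration, and explicit raises are used in place of assert so the checks survive python -O.

```python
import json


def get_name(json_str: str) -> str:
    """Hypothetical helper: return the "name" field of a JSON object."""
    data = json.loads(json_str)  # could be dict, list, str, int, float, bool or None
    if not isinstance(data, dict):
        raise TypeError(f"expected a JSON object, got {type(data).__name__}")
    name = data.get("name")  # "name" is an assumed key, not from the original
    if not isinstance(name, str):
        raise TypeError(f"expected 'name' to be a str, got {type(name).__name__}")
    return name  # narrowed to str, so the annotation can be trusted
```

With this shape, a payload such as '{"name": 42}' fails inside get_name() itself instead of surfacing later in unrelated code.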
In these cases, returning a Union would technically be correct, but would also be inconvenient for callers that know what type to expect. We usually give up and just return Any.
My suggestion would be to add a type tentatively called "AnyOf", that acts as an unsafe union:
ip_address(address: str) -> AnyOf[IPv4Address, IPv6Address]
I believe this could be useful for type checkers. While the use of AnyOf is unsafe, it is safer than just treating these return values as Any. It's also theoretically possible to use some clever narrowing. That said, there has been some pushback from the mypy core team in the linked issue, especially since this is most likely complex to implement. An easy way for type checkers to implement this, without sacrificing any existing type safety, is to just treat AnyOf exactly like Any, at least as a first step.
Should type checkers implement full support for AnyOf later, we could already have better types in typeshed ready to use.
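As a rough runtime-only illustration of the "AnyOf = Any" first step: AnyOf does not exist in typing, so the class below is a hypothetical stand-in, and parse_address() is a made-up wrapper name. A static checker would need built-in support to understand it; this sketch only shows that the subscripted form can degrade to Any while keeping the member types visible in the source.

```python
from ipaddress import IPv4Address, IPv6Address, ip_address
from typing import Any


class AnyOf:
    """Hypothetical placeholder; not part of the typing module."""

    def __class_getitem__(cls, params):
        # First-step behaviour: ignore the listed types and act like Any.
        return Any


def parse_address(address: str) -> AnyOf[IPv4Address, IPv6Address]:
    # Hypothetical wrapper around the real ipaddress.ip_address().
    return ip_address(address)


addr = parse_address("192.0.2.1")
print(type(addr).__name__)
```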
But in recent years, typeshed has also gained users outside of type checkers. For example, PyCharm and jedi use it for autocompletion. I believe that AnyOf could be useful for those projects, as well as others.
Is there interest in a feature like this? I would be willing to write a PEP and contribute the AnyOf = Any solution to mypy. Unfortunately I don't have the bandwidth to learn how to implement this "properly" in mypy.
- Sebastian