On Aug 29, 2019, at 16:03, Dominik Vilsmeier <dominik.vilsmeier@gmx.de> wrote:
I never really understood the importance of `Optional`. Often it can be left out altogether and in other cases I find `Union[T, None]` more expressive (explicit) than `Optional[T]` (+ the latter saves only 3 chars).
Especially for people not familiar with typing, the meaning of `Optional` is not obvious at first sight. `Union[T, None]` on the other hand is pretty clear. Also in other cases, where the default (fallback) is different from `None`, you'd have to use `Union` anyway. For example a function that normally returns an object of type `T` but in some circumstances it cannot and then it returns the reason as a `str`, i.e. `-> Union[T, str]`; `Optional` won't help here.
But this should be very rare. Most functions that can return a fallback value return a fallback value of the expected return type. For example, a get(key, default) method will return the default param, and the caller should pass in a default value of the type they’re expecting to look up. So, this shouldn’t be get(key: KeyType, default: T) -> Union[ValueType, T], it should be get(key: KeyType, default: ValueType) -> ValueType. Or maybe get(key: KeyType, default: Optional[ValueType]=None) -> Optional[ValueType]. Most functions that want to explain why they failed do so by raising an exception, not by returning a string. And what other cases are there?
Of course you could be trying to add type checking to some weird legacy codebase that doesn’t do things Pythonically, so you have to use Union returns. But that’s specific to that one weird codebase. Meanwhile, Optional return values are common all over Python.
Also, Python’s typing system is a lot easier to grasp if you’re familiar with an established modern-typed language (Swift, Scala, Haskell, F#, etc.), and they also use Optional[T] (or optional<T> or Maybe t or some other spelling of the same idea) all over the place, so often that many of them have added shortcuts like T? to make it easier to write and less intrusive to read.
I think there may be a gap in the docs. They make perfect sense to someone with experience in one of those languages, but a team that has nobody with that experience might be a little lost. There’s a mile-high overview, a theory paper, and then basically just reference docs that expect you to already know the key concepts you’re trying to learn. Maybe that’s something that an outsider who’s trying to learn from the docs plus trial and error could help improve?
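To make the get(key, default) signatures above concrete, here is a minimal sketch. The Table class and its put/require methods are hypothetical names made up for illustration, not anything from the stdlib or the PEP; the point is only that the fallback has the same type as the stored values (or is None), and that failure is reported by raising rather than by returning a reason string.

```python
from typing import Dict, Generic, Hashable, Optional, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")


class Table(Generic[K, V]):
    """Hypothetical mapping wrapper, for illustration only."""

    def __init__(self) -> None:
        self._data: Dict[K, V] = {}

    def put(self, key: K, value: V) -> None:
        self._data[key] = value

    def get(self, key: K, default: Optional[V] = None) -> Optional[V]:
        # The fallback is either a value of the stored type or None,
        # so the return type never needs a wider Union.
        return self._data.get(key, default)

    def require(self, key: K) -> V:
        # Failure is signalled with an exception, not a returned string.
        try:
            return self._data[key]
        except KeyError:
            raise LookupError(f"no entry for {key!r}") from None


ages: Table[str, int] = Table()
ages.put("alice", 30)
```

With this, ages.get("bob") is None and ages.get("bob", 0) is an int, so a caller who always passes a default never sees None at runtime (a type checker would still report Optional[int] unless the method is given overloads).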
Scanning through the docs and PEP I can't find strongly motivating examples for `Optional` (over `Union[T, None]`). E.g. in the following:
    def lookup(self, name: str) -> Optional[Node]:
        nodes = self.get(name)
        if nodes:
            return nodes[-1]
        return None
I would rather write `Union[Node, None]` because that's much more explicit about what happens.
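For anyone who wants to try the two spellings side by side, here is a runnable sketch of that lookup example. The Node and Scope classes, and the way get() stores nodes, are assumptions filled in just so the method can run; only lookup() itself follows the snippet above. As far as the typing module is concerned, `Optional[Node]` and `Union[Node, None]` compare equal, so the choice between them is purely about readability.

```python
from typing import Dict, List, Optional, Union


class Node:
    """Hypothetical node type, for illustration only."""

    def __init__(self, label: str) -> None:
        self.label = label


class Scope:
    """Hypothetical container; get() returns every Node bound to a name."""

    def __init__(self) -> None:
        self._nodes: Dict[str, List[Node]] = {}

    def add(self, name: str, node: Node) -> None:
        self._nodes.setdefault(name, []).append(node)

    def get(self, name: str) -> List[Node]:
        return self._nodes.get(name, [])

    def lookup(self, name: str) -> Optional[Node]:
        # Return the most recently added node, or None if there is none.
        nodes = self.get(name)
        if nodes:
            return nodes[-1]
        return None


# Both spellings describe the same type to the typing module.
same_type = Optional[Node] == Union[Node, None]
```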
Then introducing `~T` in place of `Optional[T]` just further obfuscates the meaning of the code:
def lookup(self, name: str) -> ~Node:
The `~` is easy to miss (at least for human readers) and its meaning is not obvious.
That’s kind of funny, because I had to read your Union[Node, None] a couple times before I realized you hadn’t written Union[Node, Node]. :) I do dislike ~ for other reasons (but I already mentioned them, Guido isn’t convinced, so… fine, I don’t hate it that much). But I don’t think ~ is easy to miss. It’s not like a period or backtick that can be mistaken for grit on your screen; it’s more visible than things like - that everyone expects to be able to pick out.
For `Union` on the other hand it would be more helpful to have a shorter syntax. `int | str` seems pretty clear, but what prevents tuples `(int, str)` from being interpreted as unions by type checkers? This doesn't require any changes to the built-in types and it is aligned with the already existing syntax for checking multiple types with `isinstance` or `issubclass`: `isinstance(x, (int, str))`. Having used this a couple of times, whenever I see a tuple of types I immediately think of them as `or` options.
The biggest problem with tuple is that in every other language with a similar type system, (int, str) means Tuple[int, str]. I think {int, str}, which someone proposed in one of the earlier discussions, is nice. What else would a set of types mean (unless you’re doing mathematical type theory rather than programming language typing)? But it’s unfortunate that things like isinstance and except take a tuple of types (and it has to be a tuple, not any other kind of iterable), so a set might be just as confusing for hardcore Python types as a tuple would be for polyglots. If the compatibility issue isn’t a big deal (and I trust Guido that it isn’t), I think int | str is the best option. It’s an operator that means union, it’s used for sum/union types in other languages, it makes perfect sense if you read it as “int or str”… I can’t imagine anyone being confused or put off by it.
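A small sketch of the tuple ambiguity, using only the existing typing spellings (the names IntOrStr, Pair and is_int_or_str are made up for the example, and typing.get_args needs Python 3.8+). The tuple that isinstance accepts is exactly what you get by unpacking a Union, but the same pair of types inside Tuple[...] means something else entirely, which is why a bare (int, str) in an annotation would be hard to read unambiguously.

```python
from typing import Tuple, Union, get_args, get_origin

IntOrStr = Union[int, str]   # "an int or a str"
Pair = Tuple[int, str]       # "a 2-tuple of an int followed by a str"

# The members of the union come back as a plain tuple of classes,
# which is the form isinstance() and issubclass() already accept.
members = get_args(IntOrStr)


def is_int_or_str(x: object) -> bool:
    return isinstance(x, members)


# Same two member types, but a different meaning.
pair_is_tuple = get_origin(Pair) is tuple
different_types = IntOrStr != Pair
```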