Callable syntax: Unions in return position
In a different typing-sig thread [1] about deprecating typing.Union, a question came up about the relative precedence of `|` and `->` in callable types. Since it's really a PEP 677 question, I thought I'd start a separate thread.

Currently PEP 677 says that `|` binds tighter, so that

```
(int, str) -> bool | None
```

is a callable returning an optional `bool`, rather than an optional callable returning a `bool`.

Our reasons for having `|` bind tighter have to do with consistency:
- `|` already binds more tightly in function signatures
  - e.g. `def f() -> int | None: ...` returns an optional int, with no parentheses needed
- existing languages like TypeScript with both `|` and arrow types do this

Another consideration, besides consistency, is the most common use case. Pradeep did some analysis and it actually turns out that an optional callable is more common than returning an optional or union:
- 330 functions take an Optional[Callable] argument
  - This is super common mainly because callbacks often modify "default" behavior, and using None as the default is a standard idiom
- Only 12 return unions and 25 return optionals

As a result, we have a dilemma: consistency points one way (`|` should bind tighter) and use case metrics point the other way.

My personal views are that:
- in this case, consistency is more important even if metrics point the other way
  - in particular it seems unacceptable to have `|` bind more tightly in signatures but less tightly in callable types
- the problem is easily handled in many codebases with linters and autoformatters
- we could, and maybe should, insert language explicitly encouraging parentheses if we think this will become a major problem
- I'm also not *too* worried about user experience in any case because it will get caught statically:
  - any errors from mixing up precedence will happen in type checking rather than at runtime, and often display in-editor
  - as a result, it seems to me like most users will shrug and add parentheses without much confusion
  - and I think it would be very rare for a precedence mixup to go undetected and cause an application to fail

But we'd like to get a chance to hear your thoughts before we decide whether / how to best address this in the PEP.

------
[1] typing.Union deprecation thread https://mail.python.org/archives/list/typing-sig@python.org/thread/TW5M6XDN7...
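To make the two readings concrete, here is a minimal sketch that spells them with today's `typing.Callable` and `Optional` rather than the proposed arrow syntax, which released Python versions do not parse. The alias names and the `describe` helper are hypothetical and exist only for illustration.

```python
import collections.abc
from typing import Callable, Optional, Union, get_args, get_origin

# Reading A (`|` binds tighter, as PEP 677 currently specifies):
# a callable that returns an optional bool.
CallableReturningOptional = Callable[[int, str], Optional[bool]]

# Reading B (`->` binds tighter):
# an optional callable that returns a bool.
OptionalCallable = Optional[Callable[[int, str], bool]]


def describe(tp: object) -> str:
    """Report whether the outermost type constructor is a union or a callable."""
    origin = get_origin(tp)
    if origin is Union:
        return "union"
    if origin is collections.abc.Callable:
        return "callable"
    return "other"


print(describe(CallableReturningOptional))
print(describe(OptionalCallable))
```

The two aliases differ only in which constructor sits outermost, and that is exactly what the precedence rule decides when the parentheses are left out.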
Adding some more details to give context. Here's a gist illustrating relevant types from typeshed: https://gist.github.com/pradeep90/a37b2c9a339aebdec22ba7310a25117b And here is Pradeep's example illustrating why `|` precedence might be confusing, moved into a gist to get syntax highlighting: https://gist.github.com/stroxler/01fafd28cede73a351c51a61e76703b0
Following up on this thread: after some more thought and discussion, I'm leaning toward changing the spec so that `->` binds tighter than `|`.

There's a big downside to this: it feels bad to have `def f(x: int) -> float | None: ...` indicate a function returning an optional, whereas `(int) -> float | None` indicates an optional callable.

But there are two very good reasons why that may still be the better choice.

(1) Because arrow is asymmetric, not a true binary operator, having it bind more loosely than `|` produces a SyntaxError on types like `int | (str) -> bool`.

This seems like an undesirable consequence of our precedence choices, since SyntaxErrors are unusually disruptive to users: they typically prevent most tools from running on a module at all.

It would be *possible* to special case this so that `|` binds tighter on the right and looser on the left, but having the binding rules be order-dependent seems out of the question for practical purposes.

(2) Pradeep has stats that convincingly show that Optional[Callable] is at least 5-10x more common than returning an Optional or Union type - something like a quarter of all callback arguments are optional!

So weighted by use case, having `|` bind looser will almost certainly be more convenient day-to-day.

---

To come at this from a "user stories" point of view, I want to compare what happens if we make `|` bind tighter versus looser and a user gets it wrong:

If we make `|` bind looser and a user incorrectly expects `(int) -> str | None` to return an `Optional[str]`, they'll get type check errors that clearly say their type literal was interpreted as an optional callable, which I think will be a very friendly and self-descriptive indication that it's a binding rules issue.

On the other hand, if we make `|` bind tighter and a user tries to use the type `(int) -> bool | (float) -> bool`, they'll get a SyntaxError and no further information from any tool running on the module. With some effort we can make the SyntaxError friendly in CPython, but even if we do that, other tools relying on the same grammar (like typecheckers using their own parsers) will tend to produce minimally useful error messages.

To me, that first "confused user" story is a lot better. When I also account for the fact that Pradeep's stats suggest there will be 5-10x more confused users if we have `|` bind tighter, this makes me lean toward looser binding.

---

I'm expecting there may be disagreement here, since there are some big cons to either choice regarding precedence. I'll probably put a PR out on the PEP (and update my reference implementation) this week, but I'm happy to hear other folks' thoughts first. I had serious doubts until I thought hard about the user stories.
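The point about SyntaxErrors being all-or-nothing can be sketched with the standard `ast` module. This is only an illustration under an important assumption: released CPython rejects *every* arrow callable type, so the snippet shows how one unparseable annotation hides an entire module from AST-based tools, not how any PEP 677 implementation behaves. `SOURCE` and `try_parse` are hypothetical names.

```python
import ast

# A module with one perfectly good function and one annotation that the
# parser cannot handle.
SOURCE = '''
def ok(x: int) -> int:
    return x + 1

callback: int | (str) -> bool
'''


def try_parse(source: str):
    """Return the parsed module, or the SyntaxError if parsing fails."""
    try:
        return ast.parse(source)
    except SyntaxError as exc:
        return exc


result = try_parse(SOURCE)
if isinstance(result, SyntaxError):
    # No tree at all: the valid function `ok` is invisible to the tool too.
    print("parse failed:", result.msg)
else:
    print("parsed", len(result.body), "top-level statements")
```

A type checker that misreads precedence can still report on the rest of the file; a parser that rejects the file cannot.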
In reply to Steven Troxler's message of Mon, 2021-12-20 at 21:12 +0000:

+0.5, as I follow your reasoning and agree, but I don't yet appreciate where the dragons may lie in wait. A thought that I'm fully prepared to be shot down: what if callable syntax were (...) -> (...) with mandatory parens?
what if callable syntax were (...) -> (...) with mandatory parens?
I don't think this will work well, just because then the common, easy cases stop looking anything like function annotations. For example, (int) -> bool feels very natural whereas (int) -> (bool) somehow doesn't.

If we were going to do something that diverges more wildly from the current proposal I'd actually favor `(int, str -> bool)` instead, i.e. the rejected "parentheses-less" approach with mandatory wrapping parentheses.

To be honest, (int, str -> bool) looks weird but seems to solve most of the nasty issues, which all come from `->` being a pseudo-operator:
- unions and callables-returning-callables become no problem at all
- it also would make callables in return position more obvious, which is a big concern raised in python-dev

But my guess is that this would get a lukewarm reception with typing folks because it lacks visual similarity to existing python idioms. I'm curious what people think.
In reply to Steven Troxler's message of Mon, Dec 20, 2021 at 1:13 PM:

We have some other examples of syntax that binds tighter or less tight depending on context:

    x, y = z     # ',' binds tighter than '='
    f(x, y=z)    # '=' binds tighter than ','

This doesn't confuse people in practice, so I'm okay if we end up going this way (though right now I think I'm -0; I already worry that this PEP is weighing "what do we see in existing code" too strongly over other language design principles, which are about surprises).

Perhaps we need to disallow unparenthesized callable types in the return type for a function definition? I.e.

    def foo(x: int) -> () -> int:  # Invalid
        return lambda: x

    def foo(x: int) -> (() -> int):  # okay
        return lambda: x

(Separately, there seems to be a lot of opposition to this PEP on python-dev. If that's not dealt with to the SC's satisfaction, it may well be moot how we decide the relative priority of '|' and '->' -- the PEP may be dead in the water.)

--
--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?) <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
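The context-dependent binding of `,` and `=` mentioned in the message above can be checked with the standard `ast` module. This is a small sketch; the variable names are illustrative only.

```python
import ast

# In an assignment statement, ',' groups first: the target is the tuple (x, y).
assign = ast.parse("x, y = z").body[0]
target = assign.targets[0]
print(type(assign).__name__, type(target).__name__, len(target.elts))

# In a call, '=' groups first: `y=z` is a single keyword argument.
call = ast.parse("f(x, y=z)").body[0].value
print(type(call).__name__, len(call.args), [kw.arg for kw in call.keywords])
```

The same two tokens group differently depending on the surrounding construct, which is the precedent being cited for letting `|` and `->` behave differently in signatures and in callable types.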
On Mon, Dec 20, 2021 at 01:32:45PM -0800, Guido van Rossum wrote:
Perhaps we need to disallow unparenthesized callable types in the return type for a function definition? I.e.
    def foo(x: int) -> () -> int:  # Invalid
        return lambda: x

    def foo(x: int) -> (() -> int):  # okay
        return lambda: x
+1

I think that's a conservative, lightweight restriction that helps the reader, and that can be relaxed in a few years when people get used to the syntax.

--
Steve
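For reference, the quoted `foo` example can already be written and run using the existing `typing.Callable` spelling. This sketch is not the proposed syntax; it only shows what the arrow form in return position is meant to express.

```python
from typing import Callable, get_type_hints


# Equivalent of `def foo(x: int) -> (() -> int)` in the current spelling:
# foo returns a zero-argument callable that returns an int.
def foo(x: int) -> Callable[[], int]:
    return lambda: x


thunk = foo(3)
print(thunk())
print(get_type_hints(foo)["return"])
```

Here the brackets of `Callable[...]` delimit the returned type unambiguously, which is the job the mandatory parentheses would do in the arrow form.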
I don't really like the idea of requiring surrounding parentheses only in return position - stated as such I think it would be hard to write a grammar that does this, and I'm convinced we'd hit some even uglier edge cases.

But the more I look at it the more I favor requiring parentheses *all* the time. I wrote up my thoughts on this (there are several different concerns all pointing me in this direction) in https://gist.github.com/stroxler/321af865614aa8e04781a29c358f28ae
participants (4)
- Guido van Rossum
- Paul Bryan
- Steven D'Aprano
- Steven Troxler