(there's a small TL;DR towards the end of my reply if you want to read in reverse to follow my thought process from possible conclusions to how i got there - please don't reply without reading the whole thing first)

TL;DR of my TL;DR - Not conveying bool-ness directly in the return annotation is my only complaint.  A BoolTypeGuard spelling would alleviate that.  I'm +0.3 now.  Otherwise I elaborate on other guarding options and note a few additional Rejected/Postponed/Deferred Ideas sections that the PEP should mention as currently out of scope: Unconditional guards, multiple guarded parameters, and type mutators.  Along the way I work my way towards suggestions for those, but I think they don't belong in _this_ PEP and could serve as input for future ones if/when desired.

On Sun, Feb 14, 2021 at 8:53 AM Paul Bryan <pbryan@anode.ca> wrote:
I'm a +1 on using Annotated in this manner. Guido mentioned that it was intended for only third-parties though. I'd like to know more about why this isn't a good pattern for use by Python libraries. 

On Sun, 2021-02-14 at 16:29 +0100, Adrian Freund wrote:
Here's another suggestion:

PEP 593 introduced the `Annotated` type annotation. This could be used to annotate a TypeGuard like this:

`def is_str_list(val: List[object]) -> Annotated[bool, TypeGuard(List[str])]`

I like Annotated better than not having it for the sake of not losing the return type.  BUT I still feel like this limits things too much and disconnects the information about what parameter(s) are transformed. It also doesn't solve the problem of why the guard _must_ be tied to the return value. Clearly sometimes it is desirable to do that. But in many other scenarios the act of not raising an exception is the narrowing action: i.e. it should be declared as always happening.  Nothing in the above annotation reads explicitly to me as saying that the return value determines the type outcome.
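To make the distinction concrete, here is a minimal sketch of the two styles. The names `require_str_list` and `join_names` are made up for illustration, and the plain `bool`/`None` return annotations are only what can be written without the PEP; nothing here is proposed syntax.

```python
from typing import Any, List


def is_str_list(val: List[Any]) -> bool:
    """Conditional guard: the caller has to branch on the returned bool."""
    return all(isinstance(x, str) for x in val)


def require_str_list(val: List[Any]) -> None:
    """Unconditional guard (hypothetical name): returning at all means val is List[str]."""
    if not is_str_list(val):
        raise TypeError("expected a list of str")


def join_names(names: List[Any]) -> str:
    require_str_list(names)
    # A human reader knows names is List[str] from here on, but no
    # annotation above tells a type checker that narrowing happened.
    return ", ".join(names)
```

In the first style the narrowing depends on the return value; in the second it holds for every line after the call, which is the case a return-value-only annotation cannot describe.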
 

Note that I used ( ) instead of [ ] for the TypeGuard, as it is no longer a type.

This should fulfill all four requirements, but is a lot more verbose and therefore also longer.
It would also be extensible for other annotations.

For the most extensible approach both `-> TypeGuard(...)` and `-> Annotated[bool, TypeGuard(...)]` could be allowed, which would open the path for future non-type-annotations, which could be used regardless of whether the code is type-annotated.
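A rough sketch of how the `Annotated` spelling behaves at runtime today, assuming Python 3.9+. The `TypeGuard` class below is a hypothetical stand-in written only for this illustration (a plain metadata object called with parentheses), not anything that exists in `typing`.

```python
from typing import Annotated, Any, List, get_type_hints


class TypeGuard:
    """Hypothetical metadata marker, called with () because it is not a type."""

    def __init__(self, narrowed_type: Any) -> None:
        self.narrowed_type = narrowed_type


def is_str_list(val: List[object]) -> Annotated[bool, TypeGuard(List[str])]:
    return all(isinstance(x, str) for x in val)


# Tools that ignore the metadata still see a plain bool return type.
plain_hints = get_type_hints(is_str_list)

# Tools that care can ask for the extras and read the guard back out.
rich_hints = get_type_hints(is_str_list, include_extras=True)
guard = rich_hints["return"].__metadata__[0]
```

The bool-ness of the return value survives for every reader, and the guard information rides along as metadata for tools that opt in.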


--
Adrian

On February 14, 2021 2:20:14 PM GMT+01:00, Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Feb 13, 2021 at 07:48:10PM -0000, Eric Traut wrote:

I think it's a reasonable criticism that it's not obvious that a 
function annotated with a return type of `TypeGuard[x]` should return
a bool.

[...]
As Guido said, it's something that a developer can easily 
look up if they are confused about what it means.


Yes, developers can use Bing and Google :-)

But it's not the fact that people have to look it up. It's the fact that
they need to know that this return annotation is not what it seems, but
a special magic value that needs to be looked up.

That's my objection: we're overloading the return annotation to be
something other than the return annotation, but only for this one
special value. (So far.) If you don't already know that it is special,
you won't know that you need to look it up to learn that it's special.


 I'm open to alternative formulations that meet the following requirements:

1. It must be possible to express the type guard within the function
signature. In other words, the implementation should not need to be
present. This is important for compatibility with type stubs and to
guarantee consistent behaviors between type checkers.


When you say "implementation", do you mean the body of the function?

Why is this a hard requirement? Stub files can contain function
bodies, usually `...` by convention, but alternatives are often useful,
such as docstrings, `raise NotImplementedError()` etc.

https://mypy.readthedocs.io/en/stable/stubs.html

I don't think that the need to support stub files implies that the type
guard must be in the function signature. Have I missed something?


2. It must be possible to annotate the input parameter types _and_ the 
resulting (narrowed) type. It's not sufficient to annotate just one or
the other.


Naturally :-)

That's the whole point of a type guard, I agree that this is a truly
hard requirement.


3. It must be possible for a type checker to determine when narrowing 
can be applied and when it cannot. This implies the need for a bool
response.


Do you mean a bool return type? Sorry Eric, sometimes the terminology
you use is not familiar to me and I have to guess what you mean.


4. It should not require changes to the grammar because that would 
prevent this from being adopted in most code bases for many years.


Fair enough.


Mark, none of your suggestions meet these requirements.


Mark's suggestion to use a variable annotation in the body meets
requirements 2, 3, and 4. As I state above, I don't think that
requirement 1 needs to be a genuinely hard requirement: stub files can
include function bodies.

To be technically precise, stub functions **must** include function
bodies. It's just that by convention we typically use `...` as the
body.


Gregory, one of your suggestions meets these requirements:

```python
def is_str_list(val: Constrains[List[object]:List[str]]) -> bool:
    ...
```


That still misleadingly tells the reader (or naive code analysis
software) that parameter val is of type

Constrains[List[object]:List[str]]

whatever that object is, rather than what it *actually* is, namely
`List[object]`. I dislike code that misleads the reader.

Something like this wouldn't mislead the reader:

def is_strs(items: List[Any] narrows to List[str]) -> bool:
    ...

But that's unlikely to be desirable from an added-syntax perspective.

To use an existing token as I suggested in the text of my earlier reply:

def is_strs(items: List[Any] -> List[str]) -> bool:
    ...

Again, this requires parser updates, but we wouldn't be introducing new soft keywords.

def is_strs(items: Narrows[List[Any] -> List[str]]) -> bool:
    ...

The same idea could be written using existing syntax that is parsable today, such as
  Narrows[List[Any], List[str]]
  Constrains[List[Any]:List[str]]
  Narrows(List[Any], List[str])
  ConstrainsTypeTo(List[Any], List[str])
  NarrowsTypeTo(List[Any], List[str])

Words like Constrains and Narrows are intentionally verbs.  TypeGuard is a noun.  A verb conveys an action, so it is reasonable to read those as not being a type themselves but as indicating that something is happening to the types.  Adrian's suggestion of using () instead of [] on TypeGuard helps convey the action.  This could also be used here (as in a couple of those examples above) to make it even more obvious to the reader that Narrows, Constrains, NarrowsTo, ConstrainsTo, ConstrainsTypeTo, (whatever verb name we choose)... is not a type itself but a descriptor of the action being taken on this type.

def is_strs(items: Constrains(List[Any], List[str])) -> bool:
    ...

The return type is not consumed, hidden, or altered by these proposals.  The specific parameter(s) being acted upon are annotated with the action verb.
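As a sanity check that the bracket, slice, and call spellings really are parsable today, here is a minimal runtime sketch. The `Narrows` class is a hypothetical marker invented for this illustration; no type checker gives it any meaning.

```python
from typing import Any, List


class Narrows:
    """Hypothetical marker: records the declared type and the narrowed type."""

    def __init__(self, from_type: Any, to_type: Any) -> None:
        self.from_type = from_type
        self.to_type = to_type

    def __class_getitem__(cls, item: Any) -> "Narrows":
        # Narrows[A:B] arrives as a slice; Narrows[A, B] arrives as a tuple.
        if isinstance(item, slice):
            return cls(item.start, item.stop)
        from_type, to_type = item
        return cls(from_type, to_type)


def is_strs(items: Narrows(List[Any], List[str])) -> bool:
    return all(isinstance(x, str) for x in items)


marker = is_strs.__annotations__["items"]
return_annotation = is_strs.__annotations__["return"]

# The other two spellings build an equivalent marker.
from_brackets = Narrows[List[Any], List[str]]
from_slice = Narrows[List[Any]:List[str]]
```

The return annotation stays a plain `bool`, and the marker on `items` carries both the declared and the narrowed type for any tool that wants them.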

This allows more than just trivial single-parameter, bool-returning functions to be used as type guards.

def validate_and_count(
    identifiers: ConstrainsTypeTo(List[Any], List[str]),
    results: ConstrainsTypeTo(List[Any], List[Optional[float]])
) -> int:
    """Validates all identifiers and results. Returns the number of non-None values."""
 
Clearly a made-up example for illustration, but my point is more a question: why restrict this feature to mere `f(a)->bool` functions?

We could take this further and instead of merely offering a type guard that constrains a type to a narrower definition, offer the ability to annotate functions that mutate the type of their mutable parameters. Today that isn't supported at all, we require a return value for such an action. But with this form of annotation one can easily imagine:

def sanitize_dates(dates: ConvertsTypeTo(List[str], List[datetime])) -> Set[str]:
    """Converts the date strings to datetime objects in place, extracts and returns any unparsable ones."""

Whether you think that function is good API design or not: functions that do this sort of thing exist. Indicating things on the parameter gives them the ability to contribute to static analysis by conveying their action.
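For the sake of illustration, a runnable sketch of what such a function might look like with today's annotations. The `%Y-%m-%d` date format and the `List[Any]` parameter annotation are assumptions made for this example; the point is that nothing in the signature can currently say the list's element type changes.

```python
from datetime import datetime
from typing import Any, List, Set


def sanitize_dates(dates: List[Any]) -> Set[str]:
    """Converts the date strings to datetime objects in place; returns the unparsable ones."""
    unparsable: Set[str] = set()
    converted: List[datetime] = []
    for text in dates:
        try:
            # Assumed format for this sketch only.
            converted.append(datetime.strptime(text, "%Y-%m-%d"))
        except ValueError:
            unparsable.add(text)
    # Slice assignment mutates the caller's list rather than rebinding a local name.
    dates[:] = converted
    return unparsable


raw = ["2021-02-14", "not a date", "2021-02-13"]
bad = sanitize_dates(raw)
```

After the call, `raw` holds only datetime objects, yet a type checker still believes whatever the caller originally declared it to be.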

I'm still not sure how to convey whether a type guarding, narrowing, constraining, or converting action is tied to the return value of the function. When things are indicated solely on the arguments themselves, that seems more likely to read as a constraint that is not tied to the return value.  Perhaps that's what we want.  We'd end up with this for the always-unless-it-raises case, and an annotation on the return value for the limited type guard "def f(a)->bool" situation.

TL;DR Does that take us full circle to:

def is_str_list(value: List[Any]) -> BoolTypeGuard[List[str]]:
    ...

for the PEP 647 being proposed that only covers f(x)->bool cases?  (note my simple addition of Bool to the name - the unaware reader can see that and assume "eh, it's a boolean")

and a potential follow-on PEP for more complicated situations of constraining or mutating the types of potentially multiple parameters regardless of return value as I've described above?

I do like how conservative PEP 647 is in what it supports, despite anything I complain about above.  The existing "User-defined type guards apply narrowing only in the positive case (the if clause). The type is not narrowed in the negative case." text is +1 for a first implementation.  For me it's more a matter of spelling and readability, while opening my mind to the idea of other type transformations that we don't have a way to communicate, which I've tried to brainstorm on here.

Capturing the scope and potential directions to explore for the future in the PEP would be good.  You've got a little bit of that in the Conditionally Applying TypeGuard Type and Narrowing Arbitrary Parameters sections.  Perhaps add Rejected/Postponed ideas sections: One for "Unconditional TypeGuard" to cover functions that raise if the guarding logic fails?  And another for "Multiple Guarded Parameters" and maybe one more for "Mutating Type Conversions"?

I'm not the one driving the need for this to be implemented.  I'm just aiming to ensure we end up with a readable result rather than something that makes readers' eyes glaze over, or that feels like the language is gatekeeping by making readers feel like impostors if they don't understand type theory.

+0.3 I'm coming around on the narrow (guarded?) existing PEP, it's mostly a matter of spelling to convey the boolean return value.

-gps