[Python-Dev] Re: PEP 647 (type guards) -- final call for comments

14 Feb 2021

(There's a small TL;DR towards the end of my reply if you want to read in
reverse to follow my thought process from possible conclusions to how I got
there. Please don't reply without reading the whole thing first.)

*TL;DR of my TL;DR* - Not conveying bool-ness directly in the return
annotation is my only complaint.  A BoolTypeGuard spelling would alleviate
that.  I'm +0.3 now.  Otherwise I elaborate on other guarding options and
note a few additional Rejected/Postponed/Deferred Ideas sections that the
PEP should mention as currently out of scope: Unconditional guards,
multiple guarded parameters, and type mutators.  Along the way I work my
way towards suggestions for those, but I think they don't belong in _this_
PEP and could serve as input for future ones if/when desired.

On Sun, Feb 14, 2021 at 8:53 AM Paul Bryan wrote:
...
I'm a +1 on using Annotated in this manner. Guido mentioned that it was
intended for only third-parties though. I'd like to know more about why
this isn't a good pattern for use by Python libraries.
On Sun, 2021-02-14 at 16:29 +0100, Adrian Freund wrote:
Here's another suggestion:
PEP 593 introduced the `Annotated` type annotation. This could be used to
annotate a TypeGuard like this:
`def is_str_list(val: List[object]) -> Annotated[bool,
TypeGuard(List[str])]`
I like Annotated better than not having it, for the sake of not losing the
return type.  But I still feel like this limits things too much and
disconnects the information about which parameter(s) are transformed. It
also doesn't solve the problem of why the guard _must_ be tied to the
return value. Clearly sometimes it is desirable to do that. But in many
other scenarios the act of not raising an exception is the narrowing
action, i.e. it should be declared as always happening.  Nothing in the
above annotation reads explicitly to me as saying that the return value
determines the type outcome.
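
A minimal sketch of the `Annotated` spelling discussed above, runnable on Python 3.9+. The `TypeGuardMarker` class is a hypothetical stand-in for the proposed `TypeGuard(...)` metadata object; no type checker acts on it. It only shows that the declared return type stays `bool` while the narrowed type travels along as metadata.

```python
from dataclasses import dataclass
from typing import Annotated, Any, List, get_args, get_type_hints


@dataclass(frozen=True)
class TypeGuardMarker:
    """Hypothetical metadata marker (an assumption, not part of PEP 647)."""

    narrowed: Any


def is_str_list(val: List[object]) -> Annotated[bool, TypeGuardMarker(List[str])]:
    """Return True if every element of val is a str."""
    return all(isinstance(x, str) for x in val)


# With include_extras=True the Annotated wrapper and its metadata are kept.
hints = get_type_hints(is_str_list, include_extras=True)
base, marker = get_args(hints["return"])

# Without include_extras a reader (or tool) simply sees the plain return type.
plain_return = get_type_hints(is_str_list)["return"]
```

Here `base` and `plain_return` are both `bool`, and `marker.narrowed` carries `List[str]`. Nothing in the signature says which parameter is narrowed, which is the disconnect described above.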
...
Note that I used ( ) instead of [ ] for the TypeGuard, as it is no longer
a type.
This should fulfill all four requirements, but is a lot more verbose and
therefore also longer.
It would also be extensible for other annotations.
For the most extensible approach both `-> TypeGuard(...)` and `->
Annotated[bool, TypeGuard(...)]` could be allowed, which would open the
path for future non-type-annotations, which could be used regardless of
whether the code is type-annotated.
--
Adrian
On February 14, 2021 2:20:14 PM GMT+01:00, Steven D'Aprano <
steve@pearwood.info> wrote:
On Sat, Feb 13, 2021 at 07:48:10PM -0000, Eric Traut wrote:
I think it's a reasonable criticism that it's not obvious that a
function annotated with a return type of `TypeGuard[x]` should return
a bool.
[...]
As Guido said, it's something that a developer can easily
look up if they are confused about what it means.
Yes, developers can use Bing and Google :-)
But it's not the fact that people have to look it up. It's the fact that
they need to know that this return annotation is not what it seems, but
a special magic value that needs to be looked up.
That's my objection: we're overloading the return annotation to be
something other than the return annotation, but only for this one
special value. (So far.) If you don't already know that it is special,
you won't know that you need to look it up to learn that it's special.
I'm open to alternative formulations that meet the following requirements:
1. It must be possible to express the type guard within the function
 signature. In other words, the implementation should not need to be
 present. This is important for compatibility with type stubs and to
 guarantee consistent behaviors between type checkers.
When you say "implementation", do you mean the body of the function?
Why is this a hard requirement? Stub files can contain function
bodies, usually `...` by convention, but alternatives are often useful,
such as docstrings, `raise NotImplementedError()` etc.
https://mypy.readthedocs.io/en/stable/stubs.html
I don't think that the need to support stub files implies that the type
guard must be in the function signature. Have I missed something?
2. It must be possible to annotate the input parameter types _and_ the
resulting (narrowed) type. It's not sufficient to annotate just one or
the other.
Naturally :-)
That's the whole point of a type guard, I agree that this is a truly
hard requirement.
3. It must be possible for a type checker to determine when narrowing
can be applied and when it cannot. This implies the need for a bool
response.
Do you mean a bool return type? Sorry Eric, sometimes the terminology
you use is not familiar to me and I have to guess what you mean.
4. It should not require changes to the grammar because that would
prevent this from being adopted in most code bases for many years.
Fair enough.
Mark, none of your suggestions meet these requirements.
Mark's suggestion to use a variable annotation in the body meets
requirements 2, 3, and 4. As I state above, I don't think that
requirement 1 needs to be a genuinely hard requirement: stub files can
include function bodies.
To be technically precise, stub functions **must** include function
bodies. It's just that by convention we typically use `...` as the
body.
Gregory, one of your suggestions meets these requirements:
```python
def is_str_list(val: Constrains[List[object]:List[str]]) -> bool:
    ...
```
That still misleadingly tells the reader (or naive code analysis
software) that parameter val is of type
Constrains[List[object]:List[str]]
whatever that object is, rather than what it *actually* is, namely
`List[object]`. I dislike code that misleads the reader.
Something like this wouldn't mislead the reader:

def is_strs(items: List[Any] narrows to List[str]) -> bool:
    ...

But that's unlikely desirable from an added syntax perspective.

To use an existing token as I suggested in the text of my earlier reply:

def is_strs(items: List[Any] -> List[str]) -> bool:
    ...

again, requires parsing updates but we're not introducing new soft keywords.

def is_strs(items: Narrows[List[Any] -> List[str]]) -> bool:
    ...

Even if those are written using existing syntax parsable today such as
  Narrows[List[Any], List[str]]
  Constrains[List[Any]:List[str]]
  Narrows(List[Any], List[str])
  ConstrainsTypeTo(List[Any], List[str])
  NarrowsTypeTo(List[Any], List[str])

Words like *Constrains* and *Narrows* are intentionally *verbs*.  TypeGuard
is a noun.  A verb conveys an action so it is reasonable to read those as
not being a type themselves but indicating that something is happening to
the types.  Adrian's suggestion of using () instead of [] on TypeGuard
helps convey the action.  This could also be used here (as in a couple of
those examples above) to make it even more obvious to the reader that
Narrows, Constrains, NarrowsTo, ConstrainsTo, ConstrainsTypeTo, (whatever
verb name we choose)... is not a type itself but a descriptor of the action
being taken on this type.

def is_strs(items: Constrains(List[str], List[int])) -> bool:
    ...

The return type is not consumed, hidden, or altered by these proposals.
The specific parameter(s) being acted upon are annotated with the action
verb.

This allows for more than just trivial signature single parameter bool
return functions to be used as type guards.

def validate_and_count(
    identifiers: ConstrainsTypeTo(List[Any], List[str]),
    results: ConstrainsTypeTo(List[Any], List[Optional[float]])
) -> int:
    """Validates all identifiers and results. Returns the number of
    non-None values."""

Clearly a made up example for illustration, but my point is more a question
of why restrict this feature to mere `f(a)->bool` functions?
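
A runnable sketch of the made-up example above. `ConstrainsTypeTo` is a hypothetical marker class defined here only so the annotations evaluate; no existing type checker understands it. The function raises on invalid input, so if it returns at all, both lists satisfy the narrower types.

```python
from typing import Any, List, Optional


class ConstrainsTypeTo:
    """Hypothetical marker: records a declared type and the narrowed type."""

    def __init__(self, declared: Any, narrowed: Any) -> None:
        self.declared = declared
        self.narrowed = narrowed


def validate_and_count(
    identifiers: ConstrainsTypeTo(List[Any], List[str]),
    results: ConstrainsTypeTo(List[Any], List[Optional[float]]),
) -> int:
    """Validate both lists; return the number of non-None results."""
    if not all(isinstance(i, str) for i in identifiers):
        raise TypeError("identifiers must all be str")
    if not all(r is None or isinstance(r, float) for r in results):
        raise TypeError("results must all be float or None")
    # Count the results that carry a value.
    return sum(r is not None for r in results)
```

The `int` return value stays meaningful in its own right, which a `TypeGuard` return annotation could not express.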

We could take this further and instead of merely offering a type guard that
constrains a type to a narrower definition, offer the ability to annotate
functions that mutate the type of their mutable parameters. Today that
isn't supported at all, we require a return value for such an action. But
with this form of annotation one can easily imagine:

def sanitize_dates(dates: ConvertsTypeTo(List[str], List[datetime])) -> Set[str]:
    """Converts the date strings to datetime objects in place, extracts and
    returns any unparsable ones."""

Whether you think that function is good API design or not, functions that
do this sort of thing exist. Indicating things on the parameter gives the
ability for them to contribute to static analysis by conveying their action.
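
One way such a function might look, as a sketch only. `ConvertsTypeTo` is again a hypothetical marker with no checker support, and the `%Y-%m-%d` date format is an assumption made for illustration.

```python
from datetime import datetime
from typing import Any, List, Set


class ConvertsTypeTo:
    """Hypothetical marker: the argument's element type changes in place."""

    def __init__(self, before: Any, after: Any) -> None:
        self.before = before
        self.after = after


def sanitize_dates(dates: ConvertsTypeTo(List[str], List[datetime])) -> Set[str]:
    """Convert date strings to datetime objects in place.

    Unparsable strings are removed from the list and returned.
    """
    unparsable: Set[str] = set()
    converted: List[datetime] = []
    for text in dates:
        try:
            converted.append(datetime.strptime(text, "%Y-%m-%d"))
        except ValueError:
            unparsable.add(text)
    # Slice assignment mutates the caller's list instead of rebinding a local.
    dates[:] = converted
    return unparsable
```

After the call the caller's list holds only `datetime` objects, a change of type that today's annotations cannot describe.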

I'm still not sure how to convey when a type guarding, narrowing,
constraining, or converting action is tied to the return value of the
function or not. When indicating things solely on the arguments themselves,
that seems to be more likely to read as an un-tied non return value based
constraint.  Perhaps that's what we want.  We'd end up with this for the
"always, unless it raises" case, and an annotation on the return value for the
limited type guard "def f(a)->bool" situation.
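
For comparison, a sketch of how the "always, unless it raises" case can be written with today's tools: the function returns the same object under a narrower type, so the caller has to rebind it. The name `ensure_str_list` is hypothetical.

```python
from typing import Any, List, cast


def ensure_str_list(val: List[Any]) -> List[str]:
    """Raise unless every element is a str; otherwise return val, retyped."""
    if not all(isinstance(x, str) for x in val):
        raise TypeError("expected a list of str")
    # cast() is a no-op at runtime; it only informs the type checker.
    return cast(List[str], val)


raw: List[Any] = ["a", "b"]
names = ensure_str_list(raw)  # names is seen as List[str]; raw is unchanged
```

The narrowing here is attached to the return value rather than to the parameter, which is the limitation a parameter-level annotation would remove.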

*TL;DR* Does that take us full circle to:

def is_str_list(value: List[Any]) -> BoolTypeGuard[List[str]]:
    ...

for the PEP 647 being proposed that *only* covers f(x)->bool cases?  *(note
my simple addition of Bool to the name - the unaware reader can see that
and assume "eh, it's a boolean")*

and a potential follow-on PEP for more complicated situations of
constraining or mutating the types of potentially multiple parameters
regardless of return value as I've described above?

I *do like* how conservative PEP 647 is in what it supports despite
anything I complain about above.  The existing "User-defined type guards
apply narrowing only in the positive case (the if clause). The type is not
narrowed in the negative case." text... is +1 for a first implementation.
It's more a matter of spelling and readability for me, while opening my
mind to the idea of other type transformations that we don't have a way to
communicate that I've tried to brainstorm on here.
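
A short sketch of the positive-case-only narrowing quoted above, using the PEP's own `TypeGuard[...]` spelling. It assumes `TypeGuard` is importable from `typing` (Python 3.10+) or from `typing_extensions`; the `describe` function is hypothetical.

```python
from typing import List

try:
    from typing import TypeGuard  # Python 3.10+
except ImportError:
    from typing_extensions import TypeGuard  # assumed installed on older versions


def is_str_list(val: List[object]) -> TypeGuard[List[str]]:
    """Return True if every element of val is a str."""
    return all(isinstance(x, str) for x in val)


def describe(val: List[object]) -> str:
    if is_str_list(val):
        # Positive case: a checker treats val as List[str] here.
        return ", ".join(val)
    # Negative case: val is still List[object]; nothing is narrowed.
    return f"{len(val)} mixed items"
```

At runtime `is_str_list` returns an ordinary `bool`; the annotation alone does not say so, which is the readability concern raised in this thread.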

Capturing the scope and potential directions to explore for the future in
the PEP would be good.  You've got a little bit of that in the
Conditionally Applying TypeGuard Type and Narrowing Arbitrary Parameters
sections.  Perhaps add Rejected/Postponed ideas sections: One for
"Unconditional TypeGuard" to cover functions that raise if the guarding
logic fails?  And another for "Multiple Guarded Parameters" and maybe one
more for "Mutating Type Conversions"?

I'm not the one driving the need for this to be implemented.  I'm just
aiming to ensure we end up with a readable result rather than something that
makes readers' eyes glaze over, or that makes the language feel like it is
gatekeeping by making readers feel like impostors if they don't understand
type theory.

+0.3.  I'm coming around on the narrow (guarded?) existing PEP; it's mostly a
matter of spelling to convey the boolean return value.

-gps

Gregory P. Smith