[I added some blank lines to separate the PEP quotes from Kevin's responses.] On Mon, May 29, 2017 at 7:51 AM, Kevin Conway <kevinjacobconway@gmail.com> wrote:
From the PEP:
The problem with them is that a class has to be explicitly marked to support them, which is unpythonic and unlike what one would normally do in idiomatic dynamically typed Python code. The same problem appears with user-defined ABCs: they must be explicitly subclassed or registered.
Neither of these statements is entirely true. The semantics of `abc` allow for exactly the kind of detached interfaces this PEP is attempting to provide. `abc.ABCMeta` provides `__subclasshook__`, which allows a developer to override the default check of internal `abc` registry state with virtually any logic that determines the relationship of a class with the interface. The prior art I linked to earlier in the thread uses this feature to generically support `issubclass` and `isinstance` in such a way that the PEP's goal is achieved.
But that doesn't help a static type checker. You can't expect a static checker to understand the implementation of a particular `__subclasshook__`. In practice, except for a few "one-trick ponies" such as Hashable, existing ABCs rely on subclassing or registration to make isinstance() work, and for statically checking code that uses duck typing those aren't enough.
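For concreteness, here is roughly what such a "one-trick pony" looks like -- a sketch modeled on `collections.abc.Hashable`, not the actual stdlib source:

    from abc import ABCMeta, abstractmethod

    class Hashable(metaclass=ABCMeta):
        @abstractmethod
        def __hash__(self) -> int:
            ...

        @classmethod
        def __subclasshook__(cls, C):
            # Accept any class whose MRO defines __hash__ -- no
            # subclassing or registration required.
            if cls is Hashable:
                if any("__hash__" in B.__dict__ for B in C.__mro__):
                    return True
            return NotImplemented

This works at runtime because the hook only has to answer a yes/no question about one method's presence; a static checker would have to execute the hook to know what it means.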
The intention of this PEP is to solve all these problems by allowing users to write the above code without explicit base classes in the class definition
As I understand this goal, you want to take what some of us in the community have been building ourselves and make it canonical via the stdlib.
Not really. The goal is to suggest implementing a frequently requested feature in static checkers based on the type system laid out in PEP 484, namely checks for duck types. To support this, the type system from PEP 484 needs to be extended, and that's what PEP 544 is about.
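For readers following along, a minimal sketch of the kind of duck-type check PEP 544 enables (the names are illustrative; `SupportsClose` is also the PEP's own running example):

    from typing import Protocol

    class SupportsClose(Protocol):
        def close(self) -> None: ...

    class Resource:  # no base class, no registration
        def close(self) -> None:
            print("closed")

    def finalize(r: SupportsClose) -> None:
        r.close()

    finalize(Resource())  # accepted by a PEP 544 checker purely structurally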
What strikes me as odd is that the focus is on 3rd party type checkers first rather than introducing this as a feature of the language runtime and then updating the type checker contract to make use of it.
This seems to be a misunderstanding, or at least an attempt to change the agenda for PEP 544. The primary goal of the PEP is not to support runtime checking but static checking. This is not new -- PEP 484 and PEP 526 before it have also focused on features that are useful primarily for static checkers. (Also, a bit of history: PEP 484 intentionally focused on static checking support because there was widespread skepticism about the need for more runtime checking, but there was a subset of the community that was very interested in static checking.)
I see a mention of the `isinstance` check support in the postponed/rejected ideas, but the only rationale given for it being in that category is, generally, "there are edge cases". For example, the PEP lists this as an edge case:
The problem with this is instance checks could be unreliable, except for situations where there is a common signature convention such as Iterable
However, the sample given demonstrates precisely the expected behavior of checking whether a concrete class implements the protocol. It's unclear why this sample is given as a negative.
I assume we're talking about this example:

    class P(Protocol):
        def common_method_name(self, x: int) -> int: ...

    class X:
        <a bunch of methods>
        def common_method_name(self) -> None: ...  # Note different signature

    def do_stuff(o: Union[P, X]) -> int:
        if isinstance(o, P):
            return o.common_method_name(1)  # oops, what if it's an X instance?

The problem there is that the "state of the art" for runtime checking isinstance(o, P) boils down to hasattr(o, 'common_method_name'), while the type checker takes the method signatures into account, so it will consider X objects not to be instances of P. The other case given is:
Another potentially problematic case is assignment of attributes after instantiation
Can you elaborate on how type checkers would not encounter this same issue? If there is a solution to this problem for type checkers, would that same solution not work at runtime? Also, it seems odd to use a custom initialize function rather than `__init__`. I don't think it was intentional, but this makes it seem like a bit of a strawman that doesn't represent typical Python code.
Lots of code I've seen initializes variables in a separate function (usually called from `__init__`). Mypy, at least, considers instance variables assigned through `self` in all methods of a class to be potential instance variable declarations; otherwise a lot of code could not be type-checked. Again, the example is problematic given that the runtime check for isinstance(c, P) can't do better than hasattr(c, 'x'). (I think there's a typo in the PEP here: 'c1' should be 'c'.) Requiring an explicit class decorator to add isinstance support is a way to encourage developers to think about whether the runtime instance check will match the picture seen by the static checker before they turn the decorator on. It seems reasonable to me.
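For reference, here is a self-contained reconstruction of the kind of example under discussion, using `runtime_checkable`, the spelling the decorator eventually got in the `typing` module (the class and method names are illustrative):

    from typing import Protocol, runtime_checkable

    @runtime_checkable
    class P(Protocol):
        x: int

    class C:
        def initialize(self) -> None:
            self.x = 0

    c = C()
    print(isinstance(c, P))  # False: the runtime check amounts to hasattr(c, 'x')
    c.initialize()
    print(isinstance(c, P))  # True: the attribute now exists

The same object flips from "not a P" to "a P" purely as a side effect of calling a method, which is exactly why the PEP treats such instance checks as unreliable.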
Also, extensive use of ABCs might impose additional runtime costs.
I'd love to see some data around this. Given that it's a rationale for the PEP, I'd expect to see some numbers behind it. For example, is the memory cost of directly registering implementations with abc linear or worse? What is the runtime growth pattern of isinstance or issubclass when used with heavily or deeply registered abc graphs, and is it different from those calls on concrete class hierarchies? Does the cost affect anything more than the initial evaluation of the code, or, in the absence of isinstance/issubclass checks, does it continue to have an impact at runtime?
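As an illustration of the kind of measurement being asked for, here is a rough micro-benchmark sketch (not rigorous; the class names are made up):

    import timeit
    from abc import ABC

    class MyABC(ABC):
        pass

    class Registered:
        pass

    MyABC.register(Registered)  # virtual subclass via the abc registry

    class Plain:
        pass

    reg, plain = Registered(), Plain()

    # isinstance routed through ABCMeta.__instancecheck__ and its caches...
    print(timeit.timeit(lambda: isinstance(reg, MyABC), number=10**6))
    # ...versus an ordinary type check handled entirely in C.
    print(timeit.timeit(lambda: isinstance(plain, Plain), number=10**6))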
It's commonly known that ABCs are expensive (though if you want to do precise measurements you're welcome). Here's one data point: https://github.com/python/mypy/commit/1be4db7ac6e06a162355c3d5f7794d21b89a10... -- it's a one-line diff that removes `metaclass=ABCMeta` from one class, and the commit message reads:

    Make the AST classes not ABCs

    This results in roughly a 20% speedup on the non-parsing steps. Here
    are the timings I got from running mypy on itself:

    Before the change:

    3861.8ms (49.0%) SemanticallyAnalyzedFile
    2760.3ms (35.0%) UnprocessedFile
    1111.8ms (14.1%) ParsedFile
     142.8ms ( 1.8%) PartiallySemanticallyAnalyzedFile

    After the change:

    3086.1ms (45.1%) SemanticallyAnalyzedFile
    2665.1ms (39.0%) UnprocessedFile
     945.1ms (13.8%) ParsedFile
     139.6ms ( 2.0%) PartiallySemanticallyAnalyzedFile

--
--Guido van Rossum (python.org/~guido)