[Python-Dev] PEP 544: Protocols - second round

Tue May 30 18:58:05 EDT 2017

[I added some blank lines to separate the PEP quotes from Kevin's
responses.]

On Mon, May 29, 2017 at 7:51 AM, Kevin Conway <kevinjacobconway at gmail.com>
wrote:

> From the PEP:
> > The problem with them is that a class has to be explicitly marked to
> support them, which is unpythonic and unlike what one would normally do in
> idiomatic dynamically typed Python code.
> > The same problem appears with user-defined ABCs: they must be explicitly
> subclassed or registered.
>
> Neither of these statements is entirely true. The semantics of `abc`
> allow for exactly the kind of detached interfaces this PEP is attempting to
> provide. The `abc.ABCMeta` provides the `__subclasshook__` which allows a
> developer to override the default check of internal `abc` registry state
> with virtually any logic that determines the relationship of a class with
> the interface. The prior art I linked to earlier in the thread uses this
> feature to generically support `issubclass` and `isinstance` in such a way
> that the PEP's goal is achieved.
>

But that doesn't help a static type checker. You can't expect a static
checker to understand the implementation of a particular
`__subclasshook__`. In practice, except for a few "one-trick ponies" such
as Hashable, existing ABCs rely on subclassing or registration to make
isinstance() work, and for statically checking code that uses duck typing
those aren't enough.
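
For illustration, here is a minimal sketch of the kind of
`__subclasshook__` being discussed. The names `SupportsClose` and
`Resource` are hypothetical and appear in neither the PEP nor the linked
prior art. The hook makes issubclass() and isinstance() succeed at runtime
without subclassing or registration, but a static checker would have to
interpret the body of the hook to reach the same conclusion.

```python
from abc import ABCMeta


class SupportsClose(metaclass=ABCMeta):
    """Hypothetical ABC that accepts any class defining a close() method."""

    @classmethod
    def __subclasshook__(cls, C):
        # Arbitrary runtime logic: look for "close" anywhere in the MRO.
        if cls is SupportsClose and any("close" in B.__dict__ for B in C.__mro__):
            return True
        # Fall back to the normal subclass/registry check.
        return NotImplemented


class Resource:
    """Hypothetical class: not a subclass of SupportsClose, not registered."""

    def close(self) -> None:
        pass


is_sub = issubclass(Resource, SupportsClose)
is_inst = isinstance(Resource(), SupportsClose)
int_is_sub = issubclass(int, SupportsClose)
```

Here `is_sub` and `is_inst` are true purely because of the hook's logic,
which is opaque to a tool that never runs the code.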

> > The intention of this PEP is to solve all these problems by allowing
> users to write the above code without explicit base classes in the class
> definition
>
> As I understand this goal, you want to take what some of us in the
> community have been building ourselves and make it canonical via the
> stdlib.
>

Not really. The goal is to suggest implementing a frequently requested
feature in static checkers based on the type system laid out in PEP 484,
namely checks for duck types. To support this the type system from PEP 484
needs to be extended, and that's what PEP 544 is about.

> What strikes me as odd is that the focus is on 3rd party type checkers
> first rather than introducing this as a feature of the language runtime and
> then updating the type checker contract to make use of it.
>

This seems to be a misunderstanding, or at least an attempt to change the
agenda for PEP 544. The primary goal of the PEP is not to support runtime
checking but static checking. This is not new -- PEP 484 and PEP 526 before
it have also focused on features that are useful primarily for static
checkers. (Also, a bit of history: PEP 484 intentionally focused on static
checking support because there was widespread skepticism about the need for
more runtime checking, but there was a subset of the community that was
very interested in static checking.)

> I see a mention of the `isinstance` check support in the
> postponed/rejected ideas, but the only rationale given for it being in that
> category is, generally, "there are edge cases". For example, the PEP lists
> this as an edge case:
>
> >The problem with this is instance checks could be unreliable, except for
> situations where there is a common signature convention such as Iterable
>
> However, the sample given demonstrates precisely the expected behavior of
> checking if a concrete class implements the protocol. It's unclear why this
> sample is given as a negative.
>

I assume we're talking about this example:

  class P(Protocol):
      def common_method_name(self, x: int) -> int: ...

  class X:
      <a bunch of methods>
      def common_method_name(self) -> None: ... # Note different signature

  def do_stuff(o: Union[P, X]) -> int:
      if isinstance(o, P):
          return o.common_method_name(1)  # oops, what if it's an X instance?

The problem there is that the "state of the art" for checking
isinstance(o, P) at runtime boils down to hasattr(o, 'common_method_name'),
while the type checker takes the method signatures into account, so it will
consider X objects not to be instances of P.
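
To make the mismatch concrete, here is a minimal sketch. It assumes the
API as it later shipped in Python 3.8+ (`typing.Protocol` and
`typing.runtime_checkable`), which postdates this message and may differ
from the draft PEP. The "bunch of methods" placeholder is left out.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable  # assumption: the opt-in decorator as shipped in 3.8+
class P(Protocol):
    def common_method_name(self, x: int) -> int: ...


class X:
    def common_method_name(self) -> None:  # note different signature
        return None


x = X()

# The runtime check only looks for the attribute, so it succeeds.
looks_like_p = isinstance(x, P)

# Calling the method the way P promises fails for an X instance.
try:
    x.common_method_name(1)
    call_failed = False
except TypeError:
    call_failed = True
```

The runtime check accepts `x` even though the call that P's signature
allows raises TypeError, while a static checker comparing signatures
would not treat X as an implementation of P.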

> The other case given is:
>
> > Another potentially problematic case is assignment of attributes after
> instantiation
>
> Can you elaborate on how type checkers would not encounter this same
> issue? If there is a solution to this problem for type checkers, would that
> same solution not work at runtime? Also, it seems odd to use a custom
> initialize function rather than `__init__`. I don't think it was
> intentional, but this makes it seem like a bit of a strawman that doesn't
> represent typical Python code.
>

Lots of code I've seen initializes variables in a separate function
(usually called from `__init__`). Mypy, at least, considers instance
variables assigned through `self` in all methods of a class to be potential
instance variable declarations, otherwise a lot of code could not be
type-checked.

Again, the example is problematic given that the runtime check for
isinstance(c, P) can't do better than hasattr(c, 'x').  (I think there's a
typo in the PEP here, 'c1' should be 'c'.)
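
A minimal sketch of that case, modeled on the PEP's example as I
understand it and assuming the Python 3.8+ spelling
(`typing.Protocol`, `typing.runtime_checkable`) rather than the draft API:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable  # assumption: the opt-in decorator as shipped in 3.8+
class P(Protocol):
    x: int


class C:
    def initialize(self) -> None:
        # The attribute only exists after this method has been called.
        self.x = 0


c = C()
before = isinstance(c, P)  # attribute not assigned yet
c.initialize()
after = isinstance(c, P)   # attribute now present on the instance
```

The answer from the runtime check flips from `before` (false) to `after`
(true) for the same object, whereas a static checker that treats
assignments through `self` in any method as declarations sees C as
having `x` throughout.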

The need to use an explicit class decorator to add isinstance support is
used as a way to encourage developers to think about whether the runtime
instance check will match the picture as seen by the static checker, before
they turn on this decorator. It seems reasonable to me.

> > Also, extensive use of ABCs might impose additional runtime costs.
>
> I'd love to see some data around this. Given that it's a rationale for the
> PEP I'd expect to see some numbers behind it. For example, is memory cost
> of directly registering implementations to abc linear or worse? What is the
> runtime growth pattern of isinstance or issubclass when used with heavily
> registered or deeply registered abc graphs and is it different than those
> calls on concrete class hierarchies? Does the cost affect anything more
> than the initial evaluation of the code or, in the absence of
> isinstance/issubclass checks, does it continue to have an impact on the
> runtime?
>

It's commonly known that ABCs are expensive (though you're welcome to do
precise measurements if you want). Here's one data point:
https://github.com/python/mypy/commit/1be4db7ac6e06a162355c3d5f7794d21b89a1056
-- it's a one-line diff that removes `metaclass=ABCMeta` from one class,
and the commit message reads:

Make the AST classes not ABCs

This results in roughly a 20% speedup on the non-parsing steps.
Here are the timings I got from running mypy on itself:

Before the change:
3861.8ms (49.0%) SemanticallyAnalyzedFile
2760.3ms (35.0%) UnprocessedFile
1111.8ms (14.1%) ParsedFile
 142.8ms ( 1.8%) PartiallySemanticallyAnalyzedFile

After the change:
3086.1ms (45.1%) SemanticallyAnalyzedFile
2665.1ms (39.0%) UnprocessedFile
 945.1ms (13.8%) ParsedFile
 139.6ms ( 2.0%) PartiallySemanticallyAnalyzedFile
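
For anyone who wants to take their own measurements, here is a rough
sketch of a micro-benchmark using `timeit`. The class names and the
helper `time_isinstance` are hypothetical, and it only times isinstance()
calls; the commit message does not say which ABC-related cost dominated
in mypy, so this is a starting point rather than a reproduction of the
numbers above.

```python
import timeit
from abc import ABCMeta


class PlainNode:
    """Hypothetical base class with the default metaclass."""


class AbcNode(metaclass=ABCMeta):
    """Hypothetical base class using ABCMeta."""


class PlainChild(PlainNode):
    pass


class AbcChild(AbcNode):
    pass


def time_isinstance(obj, cls, number=100_000):
    """Return the total seconds taken by `number` isinstance() calls."""
    return timeit.timeit(lambda: isinstance(obj, cls), number=number)


# Subclass instances are used so the checks do not hit the exact-type
# shortcut and the ABCMeta version goes through __instancecheck__.
plain_seconds = time_isinstance(PlainChild(), PlainNode)
abc_seconds = time_isinstance(AbcChild(), AbcNode)

print(f"plain class: {plain_seconds:.4f}s")
print(f"ABCMeta:     {abc_seconds:.4f}s")
```

Results will vary by Python version and machine, so no expected numbers
are given here.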

-- 
--Guido van Rossum (python.org/~guido <http://python.org/%7Eguido>)