[I added some blank lines to separate the PEP quotes from Kevin's responses.] On Mon, May 29, 2017 at 7:51 AM, Kevin Conway <kevinjacobconway@gmail.com> wrote:
From the PEP:
The problem with them is that a class has to be explicitly marked to support them, which is unpythonic and unlike what one would normally do in idiomatic dynamically typed Python code. The same problem appears with user-defined ABCs: they must be explicitly subclassed or registered.
Neither of these statements is entirely true. The semantics of `abc` allow for exactly the kind of detached interfaces this PEP is attempting to provide. `abc.ABCMeta` provides `__subclasshook__`, which allows a developer to override the default check of internal `abc` registry state with virtually any logic that determines the relationship of a class with the interface. The prior art I linked to earlier in the thread uses this feature to generically support `issubclass` and `isinstance` in such a way that the PEP's goal is achieved.
But that doesn't help a static type checker. You can't expect a static checker to understand the implementation of a particular `__subclasshook__`. In practice, except for a few "one-trick ponies" such as Hashable, existing ABCs rely on subclassing or registration to make isinstance() work, and for statically checking code that uses duck typing those aren't enough.
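For concreteness, here is roughly what such a "one-trick pony" looks like -- a sketch modeled on `collections.abc.Hashable`, not the actual stdlib source:

    from abc import ABCMeta, abstractmethod

    class Hashable(metaclass=ABCMeta):
        @abstractmethod
        def __hash__(self) -> int:
            ...

        @classmethod
        def __subclasshook__(cls, C):
            # Accept any class whose MRO defines __hash__ -- no
            # subclassing or registration required.
            if cls is Hashable:
                if any("__hash__" in B.__dict__ for B in C.__mro__):
                    return True
            return NotImplemented

This works at runtime because the hook only has to answer a yes/no question about one method's presence; a static checker would have to execute the hook to know what it means.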
The intention of this PEP is to solve all these problems by allowing users to write the above code without explicit base classes in the class definition
As I understand this goal, you want to take what some of us in the community have been building ourselves and make it canonical via the stdlib.
Not really. The goal is to suggest implementing a frequently requested feature in static checkers based on the type system laid out in PEP 484, namely checks for duck types. To support this, the type system from PEP 484 needs to be extended, and that's what PEP 544 is about.
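For readers following along, a minimal sketch of the kind of duck-type check PEP 544 enables (the names are illustrative; `SupportsClose` is also the PEP's own running example):

    from typing import Protocol

    class SupportsClose(Protocol):
        def close(self) -> None: ...

    class Resource:  # no base class, no registration
        def close(self) -> None:
            print("closed")

    def finalize(r: SupportsClose) -> None:
        r.close()

    finalize(Resource())  # accepted by a PEP 544 checker purely structurally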
What strikes me as odd is that the focus is on 3rd party type checkers first rather than introducing this as a feature of the language runtime and then updating the type checker contract to make use of it.
This seems to be a misunderstanding, or at least an attempt to change the agenda for PEP 544. The primary goal of the PEP is not to support runtime checking but static checking. This is not new -- PEP 484 and PEP 526 before it have also focused on features that are useful primarily for static checkers. (Also, a bit of history: PEP 484 intentionally focused on static checking support because there was widespread skepticism about the need for more runtime checking, but there was a subset of the community that was very interested in static checking.)
I see a mention of the `isinstance` check support in the postponed/rejected ideas, but the only rationale given for it being in that category is, generally, "there are edge cases". For example, the PEP lists this as an edge case:
The problem with this is instance checks could be unreliable, except for situations where there is a common signature convention such as Iterable
However, the sample given demonstrates precisely the expected behavior of checking whether a concrete class implements the protocol. It's unclear why this sample is given as a negative.
I assume we're talking about this example:

    class P(Protocol):
        def common_method_name(self, x: int) -> int: ...

    class X:
        <a bunch of methods>
        def common_method_name(self) -> None: ...  # Note different signature

    def do_stuff(o: Union[P, X]) -> int:
        if isinstance(o, P):
            return o.common_method_name(1)  # oops, what if it's an X instance?

The problem there is that the "state of the art" for runtime checking isinstance(o, P) boils down to hasattr(o, 'common_method_name'), while the type checker takes the method signatures into account, so it will consider X objects not to be instances of P. The other case given is:
Another potentially problematic case is assignment of attributes after instantiation
Can you elaborate on how type checkers would not encounter this same issue? If there is a solution to this problem for type checkers, would that same solution not work at runtime? Also, it seems odd to use a custom initialize function rather than `__init__`. I don't think it was intentional, but this makes it seem like a bit of a strawman that doesn't represent typical Python code.
Lots of code I've seen initializes variables in a separate function (usually called from `__init__`). Mypy, at least, considers instance variables assigned through `self` in all methods of a class to be potential instance variable declarations; otherwise a lot of code could not be type-checked. Again, the example is problematic given that the runtime check for isinstance(c, P) can't do better than hasattr(c, 'x'). (I think there's a typo in the PEP here: 'c1' should be 'c'.) Requiring an explicit class decorator to add isinstance support is a way to encourage developers to think about whether the runtime instance check will match the picture seen by the static checker before they turn the decorator on. It seems reasonable to me.
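For reference, here is a self-contained reconstruction of the kind of example under discussion, using `runtime_checkable`, the spelling the decorator eventually got in the `typing` module (the class and method names are illustrative):

    from typing import Protocol, runtime_checkable

    @runtime_checkable
    class P(Protocol):
        x: int

    class C:
        def initialize(self) -> None:
            self.x = 0

    c = C()
    print(isinstance(c, P))  # False: the runtime check amounts to hasattr(c, 'x')
    c.initialize()
    print(isinstance(c, P))  # True: the attribute now exists

The same object flips from "not a P" to "a P" purely as a side effect of calling a method, which is exactly why the PEP treats such instance checks as unreliable.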
Also, extensive use of ABCs might impose additional runtime costs.
I'd love to see some data around this. Given that it's a rationale for the PEP, I'd expect to see some numbers behind it. For example, is the memory cost of directly registering implementations with abc linear or worse? What is the runtime growth pattern of isinstance or issubclass when used with heavily or deeply registered abc graphs, and is it different from those calls on concrete class hierarchies? Does the cost affect anything more than the initial evaluation of the code, or, in the absence of isinstance/issubclass checks, does it continue to have an impact at runtime?
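As an illustration of the kind of measurement being asked for, here is a rough micro-benchmark sketch (not rigorous; the class names are made up):

    import timeit
    from abc import ABC

    class MyABC(ABC):
        pass

    class Registered:
        pass

    MyABC.register(Registered)  # virtual subclass via the abc registry

    class Plain:
        pass

    reg, plain = Registered(), Plain()

    # isinstance routed through ABCMeta.__instancecheck__ and its caches...
    print(timeit.timeit(lambda: isinstance(reg, MyABC), number=10**6))
    # ...versus an ordinary type check handled entirely in C.
    print(timeit.timeit(lambda: isinstance(plain, Plain), number=10**6))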
It's commonly known that ABCs are expensive (though if you want to do precise measurements you're welcome). Here's one data point: https://github.com/python/mypy/commit/1be4db7ac6e06a162355c3d5f7794d21b89a10... -- it's a one-line diff that removes `metaclass=ABCMeta` from one class, and the commit message reads:

    Make the AST classes not ABCs

    This results in roughly a 20% speedup on the non-parsing steps. Here
    are the timings I got from running mypy on itself:

    Before the change:

    3861.8ms (49.0%) SemanticallyAnalyzedFile
    2760.3ms (35.0%) UnprocessedFile
    1111.8ms (14.1%) ParsedFile
     142.8ms ( 1.8%) PartiallySemanticallyAnalyzedFile

    After the change:

    3086.1ms (45.1%) SemanticallyAnalyzedFile
    2665.1ms (39.0%) UnprocessedFile
     945.1ms (13.8%) ParsedFile
     139.6ms ( 2.0%) PartiallySemanticallyAnalyzedFile

--
--Guido van Rossum (python.org/~guido)