On Wed, Sep 9, 2015 at 10:48 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
On Sep 9, 2015, at 21:34, Jukka Lehtosalo <jlehtosalo@gmail.com> wrote:
I'm not sure if I fully understand what you mean by implicit vs. explicit ABCs (and the static/runtime distinction). Could you define these terms and maybe give some examples of each?

I just gave examples just one paragraph above.

A (runtime) implicit ABC is something that uses a __subclasshook__ (usually implementing a structural check). So, for instance, any type that implements __iter__ is-a Iterable, e.g., according to isinstance or issubclass or @singledispatch, because that's what Iterable.__subclasshook__ checks for.

A (runtime) explicit ABC is something that isn't implicit, like Sequence: no hook, so nothing is-a Sequence unless it either inherits the ABC or registers with it.

You're proposing a parallel but separate distinction at static typing time. Any ABC that's a Protocol is checked based on a structural check; otherwise, it's checked based on inheritance.

In my proposal I actually suggest that protocols shouldn't support isinstance or issubclass (these operations should raise an exception) by default. A protocol is free to override the default exception-raising __subclasshook__ to implement a structural check, and a static type checker would allow isinstance and issubclass for protocols that do this. I'll need to explain this idea in more detail, as clearly the current explanation is too easy to misundertand.

Here's a concrete example:

class X(Protocol):
    def f(self): ...

class A:
    def f(self): print('f')

if isinstance(A(), X): ...   # Raise an exception, because no __subclasshook__ override in X

Previously I toyed with the idea of having a default implementation of __subclasshook__ that actually does a structural check, but I'm no longer sure if that would be desirable, as it's difficult to come up with an implementation that does the right thing in all reasonable cases. For example, consider a structural type like this that people might want to use to work around the current limitations of Callable (it doesn't support keyword arguments, for example):

class MyCallable(Protocol):
    def __call__(self, x, y): ...

(This example has some other potential issues that I'm hand-waving away for now.)

Now how would the default isinstance work? Preferably it should only accept callables that are compatible with the signature, but doing that check is pretty difficult for arbitrary functions and should probably be out of scope for the typing module. Just checking whether __call__ exists would be too general, as the programmer probably expects that he's able to call the method with the specific arguments the type suggests. Also, sometimes checking the argument names would be a good thing to do, but sometimes any names (as long the the number of arguments is compatible) would be fine.


This means it's now possible to create supertypes that are implicit at runtime but explicit at static typing time (which might occasionally be useful), or vice-versa (which I can't imagine why you'd ever want).

As I showed above, you wouldn't get the latter unless you really try very hard (consenting adults and all). 
 

Besides the obvious negatives in having two not-quite-compatible and very-different-looking ways of expressing the same concept, this is going to lead to people wanting to know why their type checker is complaining about perfectly good code ("I tested that constant with isinstance, and it really is-a Spammable, and the type checker is inferring its type properly, and yet I get an error passing it to a function that wants a Spammable") or allowing blatantly invalid code ("I annotated my function to only take Spammable arguments, but someone is passing something that calls the fallback implementation of my singledispatch function instead of the Spammable overload").

I agree that having the default nominal/explicit isinstance semantics for a protocol type would be a very bad idea.
 

Maybe the solution is to expand your proposal a little: make Protocol automatically create a __subclasshook__ (which you listed as an optional idea in the proposal), and also change all of the existing stdlib implicit ABCs to Protocols and scrap their manual hooks, and also update the relevant documentation (e.g., the abc module and the data model section on __subclasshook__) to recommend using Protocol instead of implementing a manual hook if the only thing you want is structural subtyping. Of course the backward compatibility isn't perfect (unless you want to manually munge up collections.abc when typing is imported), and people using legacy third-party code might need to add stubs (although that seems necessary anyway). But for most people, everything should just work as people expect. A type is either structurally typed or explicitly (via inheritance or registration) types, both at static typing time and a runtime, and that's always expressed by the name Protocol. (But for the rare cases when you really need a type check that's looser at runtime, you can still write a manual hook to handle that.)


Yeah, this would be nice, but as I argued above, implementing a generic __subclasshook__ is actually quite tricky.

Jukka