On Wed, Sep 9, 2015 at 10:48 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
On Sep 9, 2015, at 21:34, Jukka Lehtosalo <jlehtosalo@gmail.com> wrote:
I'm not sure if I fully understand what you mean by implicit vs. explicit
ABCs (and the static/runtime distinction). Could you define these terms and
maybe give some examples of each?
I just gave examples just one paragraph above.
A (runtime) implicit ABC is something that uses a __subclasshook__
(usually implementing a structural check). So, for instance, any type that
implements __iter__ is-a Iterable, e.g., according to isinstance or
issubclass or @singledispatch, because that's what
Iterable.__subclasshook__ checks for.
A (runtime) explicit ABC is something that isn't implicit, like Sequence:
no hook, so nothing is-a Sequence unless it either inherits the ABC or
registers with it.
You're proposing a parallel but separate distinction at static typing
time. Any ABC that's a Protocol is checked based on a structural check;
otherwise, it's checked based on inheritance.
In my proposal I actually suggest that protocols shouldn't support
isinstance or issubclass (these operations should raise an exception) by
default. A protocol is free to override the default exception-raising
__subclasshook__ to implement a structural check, and a static type checker
would allow isinstance and issubclass for protocols that do this. I'll need
to explain this idea in more detail, as clearly the current explanation is
too easy to misundertand.
Here's a concrete example:
class X(Protocol):
def f(self): ...
class A:
def f(self): print('f')
if isinstance(A(), X): ... # Raise an exception, because no
__subclasshook__ override in X
Previously I toyed with the idea of having a default implementation of
__subclasshook__ that actually does a structural check, but I'm no longer
sure if that would be desirable, as it's difficult to come up with an
implementation that does the right thing in all reasonable cases. For
example, consider a structural type like this that people might want to use
to work around the current limitations of Callable (it doesn't support
keyword arguments, for example):
class MyCallable(Protocol):
def __call__(self, x, y): ...
(This example has some other potential issues that I'm hand-waving away for
now.)
Now how would the default isinstance work? Preferably it should only accept
callables that are compatible with the signature, but doing that check is
pretty difficult for arbitrary functions and should probably be out of
scope for the typing module. Just checking whether __call__ exists would be
too general, as the programmer probably expects that he's able to call the
method with the specific arguments the type suggests. Also, sometimes
checking the argument names would be a good thing to do, but sometimes any
names (as long the the number of arguments is compatible) would be fine.
This means it's now possible to create supertypes that are implicit at
runtime but explicit at static typing time (which might occasionally be
useful), or vice-versa (which I can't imagine why you'd ever want).
As I showed above, you wouldn't get the latter unless you really try very
hard (consenting adults and all).
Besides the obvious negatives in having two not-quite-compatible and
very-different-looking ways of expressing the same concept, this is going
to lead to people wanting to know why their type checker is complaining
about perfectly good code ("I tested that constant with isinstance, and it
really is-a Spammable, and the type checker is inferring its type properly,
and yet I get an error passing it to a function that wants a Spammable") or
allowing blatantly invalid code ("I annotated my function to only take
Spammable arguments, but someone is passing something that calls the
fallback implementation of my singledispatch function instead of the
Spammable overload").
I agree that having the default nominal/explicit isinstance semantics for a
protocol type would be a very bad idea.
Maybe the solution is to expand your proposal a little: make Protocol
automatically create a __subclasshook__ (which you listed as an optional
idea in the proposal), and also change all of the existing stdlib implicit
ABCs to Protocols and scrap their manual hooks, and also update the
relevant documentation (e.g., the abc module and the data model section on
__subclasshook__) to recommend using Protocol instead of implementing a
manual hook if the only thing you want is structural subtyping. Of course
the backward compatibility isn't perfect (unless you want to manually munge
up collections.abc when typing is imported), and people using legacy
third-party code might need to add stubs (although that seems necessary
anyway). But for most people, everything should just work as people expect.
A type is either structurally typed or explicitly (via inheritance or
registration) types, both at static typing time and a runtime, and that's
always expressed by the name Protocol. (But for the rare cases when you
really need a type check that's looser at runtime, you can still write a
manual hook to handle that.)
Yeah, this would be nice, but as I argued above, implementing a generic
__subclasshook__ is actually quite tricky.
Jukka