[Python-ideas] Structural type checking for PEP 484

Jukka Lehtosalo jlehtosalo at gmail.com
Fri Sep 11 07:01:38 CEST 2015


On Wed, Sep 9, 2015 at 10:48 PM, Andrew Barnert <abarnert at yahoo.com> wrote:

> On Sep 9, 2015, at 21:34, Jukka Lehtosalo <jlehtosalo at gmail.com> wrote:
>
> I'm not sure if I fully understand what you mean by implicit vs. explicit
> ABCs (and the static/runtime distinction). Could you define these terms and
> maybe give some examples of each?
>
>
> I gave examples just one paragraph above.
>
> A (runtime) implicit ABC is something that uses a __subclasshook__
> (usually implementing a structural check). So, for instance, any type that
> implements __iter__ is-a Iterable, e.g., according to isinstance or
> issubclass or @singledispatch, because that's what
> Iterable.__subclasshook__ checks for.
>
> A (runtime) explicit ABC is something that isn't implicit, like Sequence:
> no hook, so nothing is-a Sequence unless it either inherits the ABC or
> registers with it.
>
> You're proposing a parallel but separate distinction at static typing
> time. Any ABC that's a Protocol is checked based on a structural check;
> otherwise, it's checked based on inheritance.
>
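
To make the runtime distinction in the quoted text concrete, here is a
small sketch using the existing collections.abc classes. Bag and Row are
made-up names for illustration only.

```python
from collections.abc import Iterable, Sequence

class Bag:
    # Defines __iter__ only; never inherits from or registers with Iterable.
    def __iter__(self):
        return iter(())

class Row:
    # Structurally sequence-like, but Sequence has no __subclasshook__.
    def __len__(self):
        return 0

    def __getitem__(self, index):
        raise IndexError(index)

# Implicit ABC: Iterable.__subclasshook__ finds __iter__ and accepts Bag.
implicit_match = isinstance(Bag(), Iterable)

# Explicit ABC: Row is rejected until it is registered (or inherits).
before_register = isinstance(Row(), Sequence)
Sequence.register(Row)
after_register = isinstance(Row(), Sequence)
```

Here implicit_match and after_register are True, while before_register is
False.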

In my proposal I actually suggest that protocols shouldn't support
isinstance or issubclass (these operations should raise an exception) by
default. A protocol is free to override the default exception-raising
__subclasshook__ to implement a structural check, and a static type checker
would allow isinstance and issubclass for protocols that do this. I'll need
to explain this idea in more detail, as clearly the current explanation is
too easy to misunderstand.

Here's a concrete example:

class X(Protocol):
    def f(self): ...

class A:
    def f(self): print('f')

if isinstance(A(), X): ...   # Raises an exception: X does not override __subclasshook__
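
For contrast, a rough sketch of what opting in to a structural check might
look like. The proposed Protocol base does not exist yet, so this uses a
plain abc.ABC as a stand-in (that substitution is my assumption, not part
of the proposal), and the names X, A and B are only illustrative.

```python
from abc import ABC

class X(ABC):
    # Stand-in for a protocol class; ABC is used here only as an assumption.
    def f(self): ...

    @classmethod
    def __subclasshook__(cls, C):
        # Hand-written structural check: accept any class that defines
        # an attribute named f somewhere in its MRO.
        if cls is X:
            if any('f' in B.__dict__ for B in C.__mro__):
                return True
        return NotImplemented

class A:
    def f(self): print('f')

class B:
    pass

a_matches = isinstance(A(), X)   # True: A defines f
b_matches = isinstance(B(), X)   # False: B has no f
```

Note that this hook only looks at the attribute name. It says nothing
about whether f has a compatible signature.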

Previously I toyed with the idea of having a default implementation of
__subclasshook__ that actually does a structural check, but I'm no longer
sure if that would be desirable, as it's difficult to come up with an
implementation that does the right thing in all reasonable cases. For
example, consider a structural type like this that people might want to use
to work around the current limitations of Callable (it doesn't support
keyword arguments, for example):

class MyCallable(Protocol):
    def __call__(self, x, y): ...

(This example has some other potential issues that I'm hand-waving away for
now.)

Now how would the default isinstance work? Preferably it should only accept
callables that are compatible with the signature, but doing that check is
pretty difficult for arbitrary functions and should probably be out of
scope for the typing module. Just checking whether __call__ exists would be
too general, as the programmer probably expects that he's able to call the
method with the specific arguments the type suggests. Also, sometimes
checking the argument names would be a good thing to do, but sometimes any
names (as long as the number of arguments is compatible) would be fine.
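
A rough sketch of why no single default is obviously right, using
inspect.signature from the standard library. The helper call_would_bind and
the three sample functions are hypothetical names made up for this
illustration; this is not a proposed implementation.

```python
import inspect

def call_would_bind(obj, *args, **kwargs):
    # Hypothetical helper: would obj(*args, **kwargs) pass argument binding?
    try:
        inspect.signature(obj).bind(*args, **kwargs)
    except (TypeError, ValueError):
        # TypeError: not callable, or the arguments do not fit.
        # ValueError: no introspectable signature (some builtins), which
        # this sketch treats as a mismatch even though the call might work.
        return False
    return True

def same_names(x, y): ...
def other_names(a, b): ...
def one_arg(x): ...

# Checking only for __call__ accepts all three functions.
all_have_call = all(hasattr(fn, '__call__')
                    for fn in (same_names, other_names, one_arg))

# A positional check accepts other_names; a keyword check rejects it.
positional_ok = call_would_bind(other_names, 1, 2)
keyword_ok = call_would_bind(other_names, x=1, y=2)
too_few_params = call_would_bind(one_arg, 1, 2)
```

Whether other_names should count as a MyCallable depends on whether callers
pass x and y by keyword, which the runtime check has no way of knowing.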


> This means it's now possible to create supertypes that are implicit at
> runtime but explicit at static typing time (which might occasionally be
> useful), or vice-versa (which I can't imagine why you'd ever want).
>

As I showed above, you wouldn't get the latter unless you really try very
hard (consenting adults and all).


>
> Besides the obvious negatives in having two not-quite-compatible and
> very-different-looking ways of expressing the same concept, this is going
> to lead to people wanting to know why their type checker is complaining
> about perfectly good code ("I tested that constant with isinstance, and it
> really is-a Spammable, and the type checker is inferring its type properly,
> and yet I get an error passing it to a function that wants a Spammable") or
> allowing blatantly invalid code ("I annotated my function to only take
> Spammable arguments, but someone is passing something that calls the
> fallback implementation of my singledispatch function instead of the
> Spammable overload").
>

I agree that having the default nominal/explicit isinstance semantics for a
protocol type would be a very bad idea.


>
> Maybe the solution is to expand your proposal a little: make Protocol
> automatically create a __subclasshook__ (which you listed as an optional
> idea in the proposal), and also change all of the existing stdlib implicit
> ABCs to Protocols and scrap their manual hooks, and also update the
> relevant documentation (e.g., the abc module and the data model section on
> __subclasshook__) to recommend using Protocol instead of implementing a
> manual hook if the only thing you want is structural subtyping. Of course
> the backward compatibility isn't perfect (unless you want to manually munge
> up collections.abc when typing is imported), and people using legacy
> third-party code might need to add stubs (although that seems necessary
> anyway). But for most people, everything should just work as people expect.
> A type is either structurally typed or explicitly (via inheritance or
> registration) typed, both at static typing time and at runtime, and that's
> always expressed by the name Protocol. (But for the rare cases when you
> really need a type check that's looser at runtime, you can still write a
> manual hook to handle that.)
>
>
Yeah, this would be nice, but as I argued above, implementing a generic
__subclasshook__ is actually quite tricky.

Jukka