[Python-ideas] Improving the expressivity of function annotations

Masklinn masklinn at masklinn.net
Mon Apr 4 10:31:11 CEST 2011


While the PEP 3107 annotations are pretty nice, and an interesting improvement for third-party static analysis of Python code, I think they fall short in three areas, which limits their usefulness. And even if third parties solved these issues independently, their solutions would likely not reach critical mass.

type parameters (generics)
==========================

Python simply has no syntax (at the moment) to express the idea of generics and generic containers. The simplest way (and one I find syntactically clean, YMMV) would be to define ``__getitem__`` on ``type`` instances (with e.g. ``list[int]`` yielding a list with some kind of ``generics`` attribute), but that is of course unavailable to third-party libraries as they should not override builtins. Plus it would have to be added on ``type`` itself to ensure all types have this property (and I am not sure how C types are defined in relation to ``type``).

The possibility to perform this call would probably need to be opt-in or opt-out, as generics don't make sense for all types (or maybe all types could accept an empty generic spec, e.g. ``int[]``, though I'm not sure that is even valid Python syntax).
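
As a rough sketch of the opt-in variant (not a proposal for the actual implementation), a third-party library could approximate this today with a metaclass. The names ``Parametrizable`` and ``List`` are made up for the illustration, as is the choice of returning a subclass that carries a ``generics`` attribute:

```python
class Parametrizable(type):
    """Hypothetical opt-in metaclass: classes using it accept ``cls[params]``."""

    def __getitem__(cls, params):
        # Normalise ``cls[int]`` and ``cls[str, int]`` to a tuple of parameters.
        if not isinstance(params, tuple):
            params = (params,)
        name = "%s[%s]" % (cls.__name__, ", ".join(p.__name__ for p in params))
        # Assumption: a parametrized type is a subclass of the original one,
        # with the type parameters stored in a ``generics`` attribute.
        return Parametrizable(name, (cls,), {"generics": params})


class List(list, metaclass=Parametrizable):
    """Hypothetical stand-in for ``list``, since builtins cannot be patched."""
    generics = ()


IntList = List[int]


def total(values: List[int]) -> int:
    return sum(values)
```

Nothing here checks the elements against ``generics``; the sketch only shows that the subscript syntax can carry the information to a static analysis tool.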

Type parameters greatly increase the expressivity of a type specification and make deep type checks possible, adding considerable static safety by avoiding implicit or explicit (unsafe) type casts.

sum types
=========

It is not uncommon for Python functions and methods to take or return a value from a set of types rather than from a single type. That is also (I would guess) one of the reasons why ``isinstance`` can take a tuple of types rather than being limited to a single type (as with JavaScript's or Java's ``instanceof``).

As previously, it would probably be easy to alter ``type`` to syntactically support this construct: defining ``__or__`` on ``type`` as yielding some kind of TypeSet (or Types) instance, which would be a set of all OR'ed types (with the right subclass hooks set up so ``isinstance`` works correctly).

It would result in something like:

    Sum = (A | B | C)
    assert issubclass(A, Sum)
    assert issubclass(B, Sum)
    assert issubclass(C, Sum)
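
Since ``type`` itself cannot be altered from the outside, the following is only a sketch of the idea using a metaclass. ``TypeSet`` is the name floated above; ``Summable`` and the classes ``A``, ``B`` and ``C`` are hypothetical, and the ``__instancecheck__``/``__subclasscheck__`` hooks are one possible way to make ``isinstance`` and ``issubclass`` cooperate:

```python
class TypeSet:
    """Hypothetical set of OR'ed types."""

    def __init__(self, *types):
        self.types = frozenset(types)

    def __or__(self, other):
        # Merge with another TypeSet, or add a single type to this one.
        other_types = other.types if isinstance(other, TypeSet) else (other,)
        return TypeSet(*self.types, *other_types)

    def __instancecheck__(self, obj):
        # isinstance(obj, type_set) holds if obj is an instance of any member.
        return isinstance(obj, tuple(self.types))

    def __subclasscheck__(self, cls):
        # issubclass(cls, type_set) holds if cls derives from any member.
        return issubclass(cls, tuple(self.types))


class Summable(type):
    """Hypothetical metaclass standing in for an ``__or__`` on ``type``."""

    def __or__(cls, other):
        return TypeSet(cls) | other


class A(metaclass=Summable): pass
class B(metaclass=Summable): pass
class C(metaclass=Summable): pass


Sum = (A | B | C)
assert issubclass(A, Sum)
assert isinstance(B(), Sum)
assert not isinstance(3, Sum)
```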

structural types
================

The third issue is also the biggest by far: statically defining and checking "duck types". A small portion of it is resolved by ABCs (especially if tools include special support for them), but not all of it by a long shot. ABCs check for structure, so they are a convenient label for *some* "duck" types, but their meaning isn't even well defined (e.g. does the ``item: Mapping`` signature mean that ``item`` implements the abstract methods of ``Mapping``, or does it *also* have to implement the mixin methods, even if ``Mapping`` was not actually mixed into it?).

Anyway, this is where "structural types" come in: defining a type not by its name but by its shape (a set of methods and properties, and their signatures).

Again, this can be handled by defining (enough) ABCs, but it soon becomes tiresome, as ABCs are still pretty verbose to write, and I don't doubt there would soon be about 5 million versions of ``Readable`` (requiring a single no-args method ``read`` producing bytes). As a result, the ability to define a type as a set of methods (and their signatures) without having to give it an explicit name would, I think, be great.
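
To make the idea of an anonymous structural type a bit more concrete, here is a very rough runtime approximation. ``Shape`` and ``load`` are hypothetical names, and the sketch only checks that the named methods exist and are callable; it does not look at signatures at all, which is precisely the part that would need real design work:

```python
import io


class Shape:
    """Hypothetical anonymous structural type: a set of required method names."""

    def __init__(self, *methods):
        self.methods = methods

    def __instancecheck__(self, obj):
        # An object matches if every required name is a callable attribute.
        # Assumption: signatures are ignored in this sketch.
        return all(callable(getattr(obj, m, None)) for m in self.methods)


# An unnamed "readable" type used directly in an annotation.
def load(source: Shape("read")) -> bytes:
    return source.read()


assert isinstance(io.BytesIO(b"data"), Shape("read"))
assert not isinstance(42, Shape("read"))
```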

On the other hand, this is *the* case for which I have no syntactical idea, and definitely not a simple one that works by merely augmenting existing types with new properties (without introducing actual new syntax). The OCaml syntax ``< method: (*input) -> output; method2: (*input) -> output; property: output >`` definitely does not fit.

Your thoughts?

