[Python-3000] Parameter types and multiple method dispatch

Thu Jul 20 10:08:26 CEST 2006

I've been thinking about the proposed "parameter types" feature of 
Python 3000, and there are two aspects of the current proposal that I 
find somewhat troubling.

Let me first review what the proposal is. The idea is that a new syntax 
would be added to allow specification of a "function signature" - 
essentially a way to annotate individual parameters and have those 
annotations be accessible via the function object.

In addition, one of the key design points is that these annotations 
would have no fixed meaning, but would instead simply be objects that 
are interpreted in various contexts, including function decorators for 
which the annotations might have special meaning.

I'm going to go out on a limb here and say that both of these concepts 
are less useful than one would think, and are likely to create a host of 
problems. I suspect that the proposed design will lead to difficulty of 
interpretation, and conflicts between different but overlapping goals 
within the design of a program.

The first difficulty has to do with the way in which the arguments are 
represented, specifically the notion that each annotation is associated 
with a specific formal parameter.

A Python function declaration is a kind of mapping between input 
arguments and formal parameters. As we know from reading the Python 
language reference, this mapping is in many instances a non-trivial 
transformation.

For example, if I have a function:

    def X( a=2, b=3, *c, **d ):
       ...

and I call it with:

    X( b=100, f=9 )

We end up with the values a=2, b=100, c=() and d={'f': 9}. However, 
those values can only be accessed from within the function body - there 
is no way for a decorator or any other external code to know what those 
values are. Even if the decorator 'wraps' the function, it only knows 
what actual parameters were passed - it can't *directly* know the value 
of a, b, c, and d, although it can attempt to *emulate* the mapping by 
reimplementing the parameter assignment algorithm.
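
To make the difference concrete, here is a minimal sketch. The name 
`wrapper` and the return values are only illustrative; the point is 
what each side gets to see.

```python
def X(a=2, b=3, *c, **d):
    # Inside the body, the arguments are already mapped to formal parameters.
    return a, b, c, d


def wrapper(*args, **kwargs):
    # A generic wrapper sees only the raw actual arguments.
    seen = (args, kwargs)
    return seen, X(*args, **kwargs)


seen, mapped = wrapper(b=100, f=9)
print(seen)    # what the wrapper saw
print(mapped)  # what the function body saw
```

The wrapper has no idea that a ended up as 2, or that f landed in d, 
unless it works that out for itself.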

Now, normally decorators don't care about any of this - typically one 
writes a wrapper that simply has the same function signature as the 
method it wraps, and the code that calls the inner function is written 
by the programmer to reverse the transformation, so that the wrapped 
function can receive the same arguments.

That technique is only useful, however, if the decorator is 
custom-written for functions that have a specific signature. Truly 
generic wrappers typically don't examine the arguments at all, they 
simply pass *args and **kwargs without looking at them too closely.

However, in order to use type annotations, one needs to associate the 
actual arguments with the formal parameters - because the 
annotations are keyed by the formal parameter they are attached to. One 
way to do this is to simulate the mapping of arguments to formal 
parameters within the decorator. Once you know which formal parameter a 
given value is assigned to, you can then retrieve the annotation for 
that parameter and apply it to the value.
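
As a rough sketch of what such a decorator might look like: this 
assumes a Python in which annotations are reachable from the function 
object and in which the standard `inspect.signature` helper is available 
to redo the mapping. The decorator name `check_types` is hypothetical, 
and treating each annotation as a type for `isinstance` is just one 
possible interpretation.

```python
import functools
import inspect


def check_types(func):
    """Hypothetical decorator: treats each annotation as a type to check."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Repeat the interpreter's argument-to-parameter mapping by hand.
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        for name, value in bound.arguments.items():
            param = sig.parameters[name]
            if param.annotation is inspect.Parameter.empty:
                continue
            # *c and **d collect several values under one annotation.
            if param.kind is inspect.Parameter.VAR_POSITIONAL:
                values = value
            elif param.kind is inspect.Parameter.VAR_KEYWORD:
                values = value.values()
            else:
                values = [value]
            for item in values:
                if not isinstance(item, param.annotation):
                    raise TypeError(
                        f"{name}: expected {param.annotation.__name__}, "
                        f"got {type(item).__name__}"
                    )
        return func(*args, **kwargs)

    return wrapper


@check_types
def X(a: int = 2, b: int = 3, *c: int, **d: int):
    return a, b, c, d
```

Note that the call to `func(*args, **kwargs)` at the end makes the 
interpreter perform the very same mapping a second time.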

However, there are a number of drawbacks to doing this - first, the 
mapping algorithm isn't really that simple in the general case. Worse, 
the assignment of arguments to formal parameters is a prime suspect in 
the relative 'slowness' of Python function calls compared to other 
language features - and in order for the decorator to figure out which 
argument goes with which formal parameter, the decorator is now forced 
to perform that same mapping *again*.

Worse still, if there is more than one decorator on the function, say 
adding multiple constraints, then each one has to perform this mapping step.
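
A small sketch of the repeated work, assuming a hypothetical 
`constrain` decorator that checks one named parameter against a 
predicate. The list `bind_calls` exists only to record how often the 
mapping is redone.

```python
import functools
import inspect

bind_calls = []  # records each time a decorator redoes the mapping


def constrain(param, predicate):
    """Hypothetical decorator factory: one constraint on one parameter."""
    def decorate(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Every stacked decorator has to bind the arguments itself.
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            bind_calls.append(param)
            if not predicate(bound.arguments[param]):
                raise ValueError(f"constraint on {param!r} failed")
            return func(*args, **kwargs)

        return wrapper
    return decorate


@constrain("a", lambda v: v >= 0)
@constrain("b", lambda v: v < 1000)
def X(a=2, b=3, *c, **d):
    return a, b, c, d


result = X(b=100, f=9)
print(bind_calls)  # one entry per decorator, for a single call
```

That is two explicit mappings for one call, on top of the one the 
interpreter performs when X itself is finally invoked.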

A different possibility is to attempt to do the mapping in reverse - 
that is, for a given formal parameter (and its associated type 
annotation), attempt to determine which input argument corresponds to 
that parameter.

The problem with this notion is that the parameter mapping algorithm is 
an iterative one - it would be difficult to come up with an algorithm 
that does Python parameter assignment *in reverse* and is guaranteed 
to be correct for all possible edge cases.

All of this is fairly trivial for the simple cases (i.e. a function with 
only positional arguments), but that assumes that the decorator has some 
foreknowledge of the kinds of functions that it will decorate. The 
problem is, you don't *need* type annotations for that kind of decorator 
- because if the decorator already has _a priori_ knowledge of the 
function, then usually it will know enough to know what the types of the 
arguments are supposed to be!

Type annotations are more likely to be useful for truly generic 
decorators which will take action based on an inspection of the 
annotations. But the problem is that these are exactly the kind of 
decorators for which the parameter mapping problem is most acute.

My second difficulty is with the notion that type annotations have no 
fixed meaning. The problem here is that, unlike decorators which are 
essentially functions, type annotations are more like data.

A function interface with no assigned meaning (such as a plug-in) can be 
quite useful because functions carry their own 'meaning' with them, 
buried inside the function. Data, on the other hand, requires some 
external agency to interpret it in order to convey meaning.

By saying that annotations have no fixed meaning, what we're really 
saying is that there's no "standard interpretation" of these 
annotations, and that's where IMHO the trouble lies. If there's no 
standard interpretation, then everyone is going to interpret them 
differently. A function with more than one independently-written 
decorator is going to have problems, with each decorator trying to pick 
out the parts of the annotations that are meant for it and not for the 
others.
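
Here is a sketch of the kind of clash I have in mind, assuming the 
annotations are exposed as a dict on the function object. Both 
consumers, `as_docs` and `as_types`, are hypothetical; each one assumes 
that every annotation was written for it.

```python
def resize(width: "pixels", height: int):
    return width, height


def as_docs(func):
    # Interpretation 1: every annotation is a human-readable note.
    return {name: str(note) for name, note in func.__annotations__.items()}


def as_types(func):
    # Interpretation 2: every annotation is a type to check against.
    usable = {}
    for name, note in func.__annotations__.items():
        if not isinstance(note, type):
            raise TypeError(f"{name}: {note!r} is not a type")
        usable[name] = note
    return usable


print(as_docs(resize))  # documents 'height' with the repr of a class
try:
    as_types(resize)
except TypeError as exc:
    print(exc)          # rejects 'width', which was meant as a note
```

Neither consumer is wrong by its own rules, and neither can tell which 
annotations were intended for the other.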

It seems to me that a better solution is to allow the *decorators* to 
have function signatures. For example:

    @overload(int,int)
    def add(a,b):
       ...

The advantage of this is twofold:

First, it means that each decorator can have its own separate signature, 
independent of any other. That means there's no longer any conflict 
between different decorators attempting to apply different 
interpretations to the same data.

Second, although the mapping problem hasn't entirely gone away, it has 
been reduced somewhat, because now the type annotations are 
referring to the *wrapper function's* arguments, not the arguments of 
the wrapped function. This means that the wrapper is now on the 
'inside', and can see all of the variables in their mapped configuration.

Of course, there is still the problem of passing the arguments to the 
inner function - but that's no worse than the problem that decorators 
have today.
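
A minimal sketch of how such a decorator might work, assuming a 
hypothetical `overload` that dispatches on positional argument types 
only. The `_registry` table and the `dispatcher` function are 
illustrative names, and the second (str, str) overload is added just to 
show the dispatch.

```python
_registry = {}  # (module, qualified name) -> list of (types, implementation)


def overload(*types):
    """Hypothetical decorator: the type signature belongs to the decorator."""
    def decorate(func):
        key = (func.__module__, func.__qualname__)
        table = _registry.setdefault(key, [])
        table.append((types, func))

        def dispatcher(*args):
            # The wrapper checks its *own* arguments, which it can see directly.
            for sig_types, impl in table:
                if len(args) == len(sig_types) and all(
                    isinstance(arg, t) for arg, t in zip(args, sig_types)
                ):
                    return impl(*args)
            raise TypeError(f"no matching overload for {func.__name__}")

        dispatcher.__name__ = func.__name__
        return dispatcher
    return decorate


@overload(int, int)
def add(a, b):
    return a + b


@overload(str, str)
def add(a, b):
    return a + b


print(add(1, 2))
print(add("py", "thon"))
```

Nothing here looks at the wrapped function's parameter list at all; the 
types are matched against the dispatcher's own positional arguments.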

The only drawback here is that the syntactical options for expressing 
type signatures as arguments to the decorators are somewhat limited. For 
example, '@overload(int,int,a=int)' doesn't express the notion that a is 
a keyword argument that should be constrained to an int value.

This is due more to a general characteristic of Python, which is that 
the syntactical and semantic phases of interpretation of the language 
are kept as separate as possible. Specifically, Python lacks the 
means to pass syntactical constructs as arguments to functions.

-- Talin