[Python-3000] Adaptation vs. Generic Functions
Phillip J. Eby
pje at telecommunity.com
Sat Apr 8 10:28:24 CEST 2006
At 10:57 AM 4/7/2006 -0700, Guido van Rossum wrote:
>Here I'm not so sure. Are you sure those results are unexpected? It
>seems you are saying that if I have a class C deriving from A and B
>(in that order) where both A and B implement a method foo(), and C
>doesn't, the call C().foo() would be ambiguous. But in fact the whole
>point of having a defined MRO is to foster the expectation that it
>will call A's foo(), because that one comes first in the MRO.
>
>So at least for the single-argument variant I would think that my
>algorithm gives results consistent with the definition of inheritance
>in new-style classes.
The ambiguity appears only in context, or rather, in the requirement for
context. If you use the argument's MRO, then you can't statically
determine the dominance hierarchy, because whether A or B dominates depends
on the argument you're actually testing at runtime! That makes it harder
to tell ahead of time whether you've defined the right methods.
In particular, note that if you have cases for A and for B, you can easily
be misled into thinking that one (let's say A) dominates the other because
you currently only have class C(A,B). And you can then conclude that your
list of cases is comprehensive and unambiguous -- until somebody creates
class D(B,A) and suddenly your rules are ambiguous again.
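To make that concrete, here's a minimal demonstration (the class names just
mirror the discussion; no dispatch machinery is involved):

    class A: pass
    class B: pass
    class C(A, B): pass   # MRO: C, A, B -- the A case looks dominant
    class D(B, A): pass   # MRO: D, B, A -- now the B case wins

    print([k.__name__ for k in C.__mro__])  # ['C', 'A', 'B', 'object']
    print([k.__name__ for k in D.__mro__])  # ['D', 'B', 'A', 'object']

With only C in the program, a case table with entries for A and B looks
unambiguous, because C's MRO always reaches A first; the moment D exists,
the same table produces a different winner, so A's dominance was never a
static property of the table.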
By the way, my rewrite of dominates() actually had a bug in it also, as did
your original. Yours used an 'is' to compare the tuple signatures where it
should've used ==, and mine checked individual types with 'is' instead of
comparing the entire tuple with ==. Instead of:
all(d is not s and issubclass(d,s) for d,s in zip(dom, sub))
I should've written:
dom != sub and all(issubclass(d,s) for d,s in zip(dom, sub))
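In context, with a quick check of the difference (dominates() here is just
the corrected expression wrapped in a function):

    def dominates(dom, sub):
        # dom is strictly more specific than sub
        return dom != sub and all(issubclass(d, s) for d, s in zip(dom, sub))

    class A: pass
    class B(A): pass

    assert dominates((B, A), (A, A))   # correct: (B, A) is more specific

    # the buggy per-element "is not" version rejects this case, because
    # the second elements are the very same class object:
    assert not all(d is not s and issubclass(d, s)
                   for d, s in zip((B, A), (A, A)))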
> > Unfortunately, this entire approach is subject to similar scale
> > issues as PyProtocols/RuleDispatch.
>
>Even with the cache I put in? The hairy algorithm doesn't get invoked
>more than once per actual signature (type tuple).
You've measured it now, but yes, my own measurements of similar techniques
in RuleDispatch showed the same effect, and as with your measurement, it is
the Python code to build the tuple, rather than the tuple allocation
itself, that produces the delays. The current version of RuleDispatch goes
to some lengths to avoid creating any temporary data structures (if
possible) for this reason.
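For reference, the cached type-tuple dispatch under discussion has roughly
this shape (registry and find_method here are stand-ins, not Guido's actual
prototype):

    registry = {}   # maps type tuples to implementations
    cache = {}

    def register(*types):
        def decorator(func):
            registry[types] = func
            return func
        return decorator

    def find_method(types):
        # stand-in for the "hairy algorithm"; exact match only
        return registry[types]

    def dispatch(*args):
        types = tuple(map(type, args))   # built in Python on *every* call
        try:
            func = cache[types]
        except KeyError:                 # the hairy lookup runs only once
            func = cache[types] = find_method(types)
        return func(*args)

    @register(int, int)
    def add_ints(x, y):
        return x + y

    print(dispatch(2, 3))   # 5

The per-call cost is dominated by that tuple(map(type, args)) line, which
is exactly the temporary-structure overhead mentioned above.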
>I like the idea of analyzing the methods for dominance relationships
>ahead of time. But I'm not sure why we should then resort to
>generating code. Shouldn't it be just as easy to create a data
>structure that can be walked by fixed code? (Just like a
>recursive-descent parser can be written as a fixed "machine" that
>interprets a data structure representing the parse information.)
That's what RuleDispatch does currently, but it's again considerably slower
than an AST-manipulation approach would be, and is also much slower when it
has to handle expressions other than "plain old arguments".
Calling it "generating code", however, is a misleading way of thinking
about it, in that it implies something much more complex than is
necessary. The data structure created to do dispatching is not much
different from an AST, and in fact RuleDispatch uses its own simple AST for
processing expressions anyway. The parts of RuleDispatch that process
expressions are effectively recapitulating the Python interpreter, so I
would much rather in a future version just take an AST right off of a
function, like this:
    @some_function.when(lambda: x<27 and isinstance(y,int))
    def twenty_seven_and_int(x,y):
        ...
to replace my current use of string parsing for these conditions. At that
point RuleDispatch would just basically be performing relatively simple AST
pruning and grafting, to take the conditional expression(s) and function
bodies and graft them into a larger dispatch tree. Since we already have
the mechanisms to manage such ASTs and generate bytecode, and since
alternative implementations are going to have to provide the same AST APIs
at some point, ISTM that this is the natural way to go for the long term,
and provides the most opportunities for customized dispatching techniques,
such as RuleDispatch's custom method combinations. (Which allow you to
control such things as whether there are "next methods" or not, among other
things.)
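As a rough modern-Python approximation of "taking an AST right off of a
function" (the inspect and ast modules here are stand-ins for the imagined
direct AST API):

    import ast, inspect, textwrap

    def get_function_ast(func):
        # recover the function's AST from its source; a real AST API
        # would hand this over without reparsing
        source = textwrap.dedent(inspect.getsource(func))
        return ast.parse(source).body[0]

    def condition(x, y):
        return x < 27 and isinstance(y, int)

    tree = get_function_ast(condition)
    print(ast.dump(tree.body[0].value))   # the BoolOp for "x < 27 and ..."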
Note that such AST prune-and-graft operations could also potentially
implement things like a 'next_method()' call, by substituting the
function's previous AST. This would also allow you to do a simple
monkeypatching style of overloading (i.e., just replacing the existing
function body and chaining) for the very simplest cases with only a handful
of methods.
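The very simplest chaining case can even be faked today with plain closures
instead of AST surgery; a hypothetical sketch (overload_with is an invented
name, not a RuleDispatch API):

    import functools

    def overload_with(previous):
        # replace a function, passing the old body to the new one as
        # its first argument (a poor man's next_method)
        def decorator(new):
            @functools.wraps(previous)
            def wrapper(*args, **kw):
                return new(previous, *args, **kw)
            return wrapper
        return decorator

    def greet(name):
        return "Hello, %s" % name

    @overload_with(greet)
    def greet(next_method, name):
        if name == "world":
            return "Hi there!"
        return next_method(name)

    print(greet("world"))   # Hi there!
    print(greet("Guido"))   # Hello, Guido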
And, setting a function's AST needn't immediately cause compilation; it
could be delayed until the function is actually called, thus nullifying any
cost besides the tree grafting itself, for functions that aren't used in a
particular program run. Granted, this definition-time cost might still be
higher than the definition-time cost of a more limited dispatch facility,
such as the "tuple of types" approach you've been prototyping.
In truth, my thoughts on AST munging are still essentially speculation;
they reflect my design for the next version of RuleDispatch, rather than an
implemented and tested design at this point.
I should also clarify that I think of dispatch from a slightly different
perspective, since RuleDispatch allows tests on arbitrary conditions and
expressions. As a result, there are constraints on ordering of tests,
because of things like a condition for "x>0 and y/x<27", where the second
expression would produce an error if it was evaluated out of
order. RuleDispatch already has the necessary machinery to manage these
issues and build a single dispatch tree for interpretation, however, and
it's actually not that complex: RuleDispatch in total is only about 4000
lines of Python, not including tests, but including all those blank lines I
put between screenfuls of code. :)
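The ordering constraint is easy to see in miniature (x and y here are just
example arguments):

    def matches(x, y):
        # "x>0 and y/x<27": short-circuiting enforces the safe order;
        # a dispatch tree has to preserve the same guarantee
        return x > 0 and y / x < 27

    print(matches(0, 5))    # False -- y/x is never evaluated
    print(matches(2, 10))   # True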
It does not reuse any of the stdlib compiler library, either, as I found it
too slow and too big for the relatively simple task of processing logical
expressions -- the expression processor is ~700 lines, or ~1100 if you
include the concrete syntax->abstract syntax converter. Some of this
machinery wouldn't be needed if there were a nice fast AST API and I could
pull the ASTs directly from functions instead of parsing them.
That leaves about 2900 lines of "meat": basically the index classes for
handling class lookups and other comparison operators, and the code to
build and interpret dispatch trees. The interpretation part would of
course go away under an AST regime, while the rest would stay about the same.
Naturally, if your goals are more modest, dealing only with actual
arguments, no expressions, no operators other than 'isinstance()', no
ability to apply boolean logic to the conditions (e.g. "isinstance(arg1,A)
and isinstance(arg1,B)", or perhaps "not isinstance(arg1,C)"!), then the
logic for an interpreter is indeed much simpler. In that case, you can
just use a series of dictionaries that each argument is checked against,
one at a time, and you can build those dictionaries in a straightforward
way at registration time.
In that approach, the lookup loop could look something like:
    reg = self.root_registry
    for arg in args:
        for cls in type(arg).__mro__:  # not exactly, but sort of
            if cls in reg:
                reg = reg[cls]
                break
        else:
            raise NoMatch  # (or fall back to default)
    # reg is now a function or a placeholder indicating ambiguity
You wouldn't actually use __mro__, in order to avoid the unstable dominance
hierarchy problem, but you get the gist. If you use RuleDispatch and
restrict what operations you use, you'll actually get almost exactly the
above; it's just that RuleDispatch may not evaluate the args in strictly
left-to-right order, depending on how "good" an argument is for
dispatching. (RuleDispatch evaluates the "goodness" of expressions (such
as arguments) by how well they fan out and flatten the dispatch tree, so it
may decide that the last argument is in fact a better thing to check first.)
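For completeness, a runnable sketch of that nested-dictionary scheme,
registration included (the names are illustrative, not from any actual
prototype):

    class NoMatch(Exception):
        pass

    root_registry = {}

    def register(signature, func):
        # one dict level per argument position, built at registration time
        reg = root_registry
        for cls in signature[:-1]:
            reg = reg.setdefault(cls, {})
        reg[signature[-1]] = func

    def lookup(*args):
        reg = root_registry
        for arg in args:
            for cls in type(arg).__mro__:   # same caveat as above
                if cls in reg:
                    reg = reg[cls]
                    break
            else:
                raise NoMatch
        return reg   # a function (or an ambiguity placeholder)

    class A: pass
    class B(A): pass

    register((A, A), lambda x, y: "A,A")
    register((B, A), lambda x, y: "B,A")

    print(lookup(B(), A())(B(), A()))   # B,A -- exact match on arg 1
    print(lookup(A(), B())(A(), B()))   # A,A -- B falls back to A via MRO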
This approach of course has more hash table lookups than your approach, so
these are all tradeoffs to be considered, and more importantly,
measured. But in general I tend to see this space through the lens of
"arbitrary boolean expressions" and thus may occasionally intuit wrong
answers for the more limited scope you're contemplating here. Actually, I
keep feeling like you're "cheating" because you can get away with so little
code -- and I keep having to remind myself that it's because there are so
many things that the cached type tuple approach cannot do. (The approach
which you are using, and which other folks wrote papers about, prior to the
Chambers & Chen paper on using dispatch trees to handle arbitrary predicates.)
Anyway, I just got home from a 6+ hour flight, and I'm probably not being
very coherent right now, so I'm going to go get some sleep. :)