[Python-Dev] perplexed by mro

Guido van Rossum guido@python.org
Thu, 03 Oct 2002 18:03:45 -0400


> I'm trying to wrap my head around the mro computation in 2.2.

Thanks for doing this!  You've stumbled upon a place where I'm no
expert, and where my intuition about the equivalence between
algorithms turned out to be wrong.

> First issue: PEP 253 and the descrintro document describe an
> algorithm, typeobject.c implements a rather different
> one. Descrintro mentions that.
> 
> The algorithm described by descrintro, PEP253:
> 
> "Using the classic lookup rule, construct the list of classes that
> would be searched, including duplicates. Now for each class that
> occurs in the list multiple times, remove all occurrences except for
> the last."
> 
> which, if I understand correctly, means : do a left-right dfs of the
> class hierarchy and then discard all but the last occurence of each
> class.
> 
> Now descrintro says that the result of this and the real algorithm
> in typeobject are the same, unless
> 
> "two given base classes occur in a different order in the
> inheritance list of two different derived classes, and those derived
> classes are both inherited by yet another class"
> 
> But here is an example where the criterium does not apply but still
> the algorithms' results diverge:
> 
> >>> cpl.ex5(globals())
> class a(object): pass
> class b(object): pass
> class c(object): pass
> class x(a): pass
> class y(a): pass
> class z(x,b,y,c): pass
> >>> cpl.names(z.__mro__) # just the class names, real 2.2 mro algo
> ['z', 'x', 'y', 'a', 'b', 'c', 'object']
> >>> cpl.names(cpl.keeplast(cpl.lrdfs(z))) # descrintro simple algo
> ['z', 'x', 'b', 'y', 'a', 'c', 'object']

I wanted to say that the naive algorithm is "better" than the 2.2 algo
here because it doesn't reverse the order of b and y given by z's
bases.  But let's have a look at an example.

Suppose there's a method m that's defined in object, and overridden by
class y as well as in b.  What is z.m?  The 2.2 algo says y.m, the
naive algorithm says b.m.  Hard to decide which is better.  Now let's
assume m is *also* overridden in a, so that y.m overrides a.m
overrides object.m.  Now if we searched b before y (naive algo),
it is arguably strange if the 'super' call chain went from y.m via b.m
to a.m.

But I'm not sure how to formalize this intuition.

> I was not able to find a new rule. My intuition is that when the
> result of the real mro has something to do with the dfs, it is sure
> from the beginning of it not the end, so the relation of the two
> algos is unclear.
> 
> In general I have an hard time trying to predict the result of the
> real mro algo.
> 
> Does it a have a list of constraints it enforces/preserves, of the
> qualities it has?

The book "Putting Metaclasses To Work" that I always quote about this
has the algorithm and the rules it tries to enforce.  I don't have the
book handy here so I can't quote it.  I'm pretty sure that I
implemented their merge algo correctly, EXCEPT that I don't raise an
error or what they call "serious order conflicts".  There are no
serious order conflicts in this example though.

> I have tried to compare it to the algorithms used by Dylan and CLOS
> and others, described in this paper
> 
> http://www.webcom.com/haahr/dylan/linearization-oopsla96.html
> 
> they are described in terms such as quality criteria and constraints
> they preserve.

Wish I could spend the time to study and understand all that.

> Let's consider this example:
> 
> >>> cpl.ex9(globals())
> class a(object): pass
> class b(object): pass
> class c(object): pass
> class d(object): pass
> class e(object): pass
> class k1(a,b,c): pass
> class k2(d,b,e): pass
> class k3(d,a): pass
> class z(k1,k2,k3): pass
> >>> cpl.names(z.__mro__)
> ['z', 'k1', 'k3', 'a', 'k2', 'd', 'b', 'c', 'e', 'object']

I think this one *does* have an order conflict, and in that case all
bets are off about the book's algo.  If you look at k1 and k2, clearly
d must come after a; yet k3 wants it after a.

> the thing that leaves me most perplexed is that k3 appears before
> k2. All other algortihms I know would put k2 before k3.  E.g. C3
> (see paper) ...
> 
> >>> cpl.names(cpl.c3_cpl(z))
> ['z', 'k1', 'k2', 'k3', 'd', 'a', 'b', 'c', 'e', 'object']
> 
> or the descrintro simple algo:
> 
> >>> cpl.names(cpl.keeplast(cpl.lrdfs(z)))
> ['z', 'k1', 'c', 'k2', 'b', 'e', 'k3', 'd', 'a', 'object']
> 
> This means that it does not preserve the local precedence lists
> (i.e. the inheritance list), it is not monotonic* because otherwise
> it would put d before a, it seems to preserve "some" aspects of the
> dfs traversal a,d vs d,a but still it puts k3 before k2.
> 
> The other hypothesis, is that in this case, the algorithm should
> fail, because it cannot merge the result of mro(k1) merged with
> mro(k2) with mro(k3), because the latter is inconsistent with what
> obtained so far?

Yes.  I've so far had a "don't do that then" attitude rather than a
"detect all errors" attitude here -- mostly because the order conflict
detection code is hairy to write in C.

> >>> l=cpl.names(k1.__mro__)
> >>> l
> ['k1', 'a', 'b', 'c', 'object']
> >>> r=cpl.names(k2.__mro__)
> >>> cpl.conservative_merge(l,r)
>  ...
> >>> l
> ['k1', 'a', 'k2', 'd', 'b', 'c', 'e', 'object']
> 
> vs.
> 
> >>> cpl.names(k3.__mro__)
> ['k3', 'd', 'a', 'object']
> 
> a<d is incosistent with d<a
> 
> But then this inconsistency rule is different/subtler than the one
> sketched in descrintro.
> 
> So can someone (Guido?) summarize the qualities of this rule? is
> there a paper that describes it? or should I buy the book?

Buy the book.  Last time I looked it was on sale (used) at Amazon.com.

> regards.
> 
> * the mro of a subclass is an extension without re-ordering of the
>   mros of the superclasses.

(I don't think your second post has anything else to which I need to
respond, right?)

--Guido van Rossum (home page: http://www.python.org/~guido/)