inheritance, multiple inheritance and the weaklist and instance dictionaries

Wed Feb 9 16:14:59 EST 2011

On 02/09/2011 02:42 PM, Carl Banks wrote:
> On Feb 9, 10:54 am, Rouslan Korneychuk<rousl... at msn.com>  wrote:
>> I'm working on a program that automatically generates C++ code for a
>> Python extension and I noticed a few limitations when using the weaklist
>> and instance dictionaries (tp_weaklistoffset and tp_dictoffset). This is
>> pertaining to the C API.
>>
>> I noticed that when using multiple inheritance, I need a common base
>> class or I get an "instance lay-out conflict" error (my program already
>> deals with the issue of having a varying layout), but the error also
>> happens when the derived classes have these extra dictionaries and the
>> common base doesn't. This doesn't seem like it should be a problem if
>> the offsets for these variables are explicitly specified in the derived
>> types.
>
> No, it is a problem.  It violates the Liskov substitution principle.
>
> Let me see if I understand your situation.  You have one base type
> with tp_dictoffset or tp_weaklistoffset set but no extra fields in the
> object structure, and another base type without tp_dictoffset or
> tp_weaklistoffset, but with extra fields, and you're trying to multiply-
> inherit from both?  This is the only case I can think of where the
> layout conflict would be caused by a type setting tp_dictoffset.

No, actually I have code that is roughly equivalent to the following 
pseudocode:

class _internal_class(object):
     __slots__ = ()

class BaseA(_internal_class):
     __slots__ = (some_data,...,__weaklist__,__dict__)

class BaseB(BaseA):
     __slots__ = (some_data,...,__weaklist__,__dict__)

class BaseC(_internal_class):
     __slots__ = (some_other_data,...,__weaklist__,__dict__)

class Derived(BaseB,BaseC):
     __slots__ = (combined_data,...,__weaklist__,__dict__)

Before adding the weaklist and instance dicts, this all worked fine. 
_internal_class was necessary to prevent the "lay-out conflict" error. 
But when I added them to every class except _internal_class, the error 
came back. When I also added the dicts to _internal_class, it worked 
again. (The classes are set up like this because the example is taken 
from my unit tests.)
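
For comparison, the same error message can be provoked in pure Python. The
sketch below is only a rough analogue and not the generated C++ code: every
class name in it is made up for illustration. It also assumes CPython, where
classes defined in Python are heap types and, as far as I can tell, a trailing
__dict__ and __weakref__ are tolerated when layouts are compared, so it shows
the conflict for ordinary data slots rather than the exact situation with
statically defined types.

```python
# Illustrative names only; none of these classes come from the generated code.


class Common:
    __slots__ = ()  # empty common base, adds nothing to the layout


class WithDictA(Common):
    # Only the instance dict and the weak reference list, no data slots.
    __slots__ = ("__dict__", "__weakref__")


class WithDictB(Common):
    __slots__ = ("__dict__", "__weakref__")


class DataA(Common):
    __slots__ = ("a",)  # one real data slot


class DataB(Common):
    __slots__ = ("b",)  # a different data slot competing for the same space


def try_combine(*bases):
    """Return None if the bases can be combined, else the TypeError message."""
    try:
        type("Combined", bases, {"__slots__": ()})
    except TypeError as exc:
        return str(exc)
    return None


# Bases that differ from Common only by __dict__/__weakref__.
print(try_combine(WithDictA, WithDictB))

# Bases that each add their own data slot.
print(try_combine(DataA, DataB))
```

The first call reports no error and the second reports the lay-out conflict,
which matches the idea that ordinary slots fix the layout while the two
special fields get more lenient treatment for Python-defined classes.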

> Even though this is a clear layout conflict, you think that if you set
> tp_dictoffset and tp_weaklistoffset appropriately in the derived type,
> it's ok if the dict and weakreflist appear in different locations in
> the subtype, right?
>
> Not in general.  A bunch of stuff can go wrong.  If any methods in the
> base type reference the object dict directly (rather than indirectly
> via tp_dictoffset), then the derived type will be broken when one of
> the base-type methods is called.  (This alone is enough to rule out
> allowing it in general.)  Even if you are careful to always use
> tp_dictoffset, a user might write a subtype in C that directly
> accesses it, not even stopping to consider that it might be used in
> MI.  It's not even certain that a derived type won't use the base
> type's tp_dictoffset.
>
> The algorithm to detect layout conflicts would require a terrible
> increase in complexity: there are some layouts that would "work" if
> you could ignore tp_dictoffset, and some that won't, and now you have
> a big mess trying to distinguish.
>
> Bottom line is, tp_dictoffset and tp_weaklistoffset should be treated
> as if they defined regular slots that affect layout, and it should be
> assumed that (like for all other slots) the offset does not change for
> subtypes.  There's a few important situations where tp_dictoffset is a
> tiny bit more flexible than a regular slot, but that's rare.
>
>
>> I want this program to be as flexible as possible, so could
>> someone tell me what exactly the rules are when it comes to these
>> dictionaries and inheritance? Also, I don't like the idea of having up
>> to four different classes (one for every combination of those two
>> variables) that do nothing except tell CPython that I know what I'm doing.
>
> I don't think there's any reasonable way to subvert Python's behavior
> on layout.
>
> If your base types are abstract, you might consider not setting
> tp_dictoffset and tp_weaklistoffset in the base (even if it has methods
> that reference the dict)--just be sure to set it in the first class
> that's concrete.
>
>
>> Also is it possible to have a class that doesn't have these dictionaries
>> derive from a class that does?
>
> Nope.  That would *really* violate the Liskov substitution principle.

This is why the code is automatically generated and is not meant to be 
extended by hand. It already does more complicated things. Even without 
inheritance, the layout of the classes can vary. Each Python class is a 
wrapper for a C++ class. The C++ class is kept in the PyObject by copy, 
by default. But it can also be stored by a pointer that is deleted when 
the PyObject is gone. It can be stored as a reference to a member of 
another exposed C++ class, along with a PyObject pointer to that class. 
And it can also be stored as an unmanaged reference (this is intended 
for static objects). On top of that, anything that isn't used is 
omitted, so that each class can be as small and efficient as possible. 
So I have already thrown internal consistency out the window.
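
The effect of omitting unused fields has a rough pure-Python analogue, which
may be handy for checking what a generated type ought to report. The sketch
below assumes CPython and uses two made-up classes, Slim and Full. It reads
the type attributes __basicsize__, __dictoffset__ and __weakrefoffset__, which
mirror tp_basicsize, tp_dictoffset and tp_weaklistoffset. The exact nonzero
values differ between CPython versions, so only zero versus nonzero is
treated as meaningful here.

```python
import weakref


class Slim:
    # Hypothetical type that omits both optional fields.
    __slots__ = ("value",)


class Full:
    # Hypothetical type that keeps the instance dict and weak reference list.
    __slots__ = ("value", "__dict__", "__weakref__")


def describe(cls):
    """Summarise the layout-related fields that CPython exposes on a type."""
    return {
        "basicsize": cls.__basicsize__,
        "has_dict": cls.__dictoffset__ != 0,
        "has_weaklist": cls.__weakrefoffset__ != 0,
    }


for cls in (Slim, Full):
    print(cls.__name__, describe(cls))

# Full instances accept arbitrary attributes and weak references.
full = Full()
full.extra = 1
ref = weakref.ref(full)

# Slim instances reject both, because the fields were never allocated.
slim = Slim()
try:
    slim.extra = 1
    slim_accepts_attrs = True
except AttributeError:
    slim_accepts_attrs = False

try:
    weakref.ref(slim)
    slim_accepts_weakrefs = True
except TypeError:
    slim_accepts_weakrefs = False

print(slim_accepts_attrs, slim_accepts_weakrefs)
```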

This all works, by the way. You can check it out at 
https://github.com/Rouslan/PyExpose (although I haven't uploaded the 
code for supporting the weaklist and instance dictionaries yet). There 
may be a few edge cases where it fails because inherited types have very 
different requirements, but I plan to write more test cases and work 
everything out.