[Python-Dev] Meta-reflections

Kevin Jacobs jacobs@penguin.theopalgroup.com
Mon, 18 Feb 2002 11:29:14 -0500 (EST)

Hello all,

I've been meta-reflecting a lot lately: reflecting on reflection.

My recent post on __slots__ not being picklable (and the resounding lack of
response to it) inspired me to try my hand at channeling Guido and reverse-
engineer some of the design decisions that went into the new-style class
system.  Unfortunately, the more I dug into the code, the more philosophical
my questions became.  So, I've written up some questions that lay bare
some of the basic design issues that I've been asking myself about and
that you should be aware of.

While there are several subtle issues I could raise, I do want some feedback
on some simple and fundamental ones first.  Please don't disqualify yourself
from commenting because you haven't read the code or used the new features
yet.  I've written my examples assuming only a basic and cursory
understanding of the new Python 2.2 features.

  [In this discussion I am only going to talk about native Python classes,
   not C-extension or native Python types (e.g., ints, lists, tuples,
   strings, cStringIO, etc.)]

  1) Should class instances explicitly/directly know all of their attributes?

     Before Python 2.2, all object instances contained a __dict__ attribute
     that mapped attribute names to their values.  This made pickling and
     some other reflection tasks fairly easy.


        class Foo:
          def __init__(self):
            self.a = 1
            self.b = 2

        class Bar(Foo):
          def __init__(self):
            Foo.__init__(self)   # needed so that 'a' and 'b' get set
            self.c = 3

        bar = Bar()
        print bar.__dict__
        > {'a': 1, 'c': 3, 'b': 2}

     I am aware that there are situations where this simple case does not
     hold (e.g., when implementing __setattr__ or __getattr__), but let's
     ignore those for now.  Rather, I will concentrate on how this classical
     Python idiom interacts with the new slots mechanism.  Here is the above
     example using slots:


        class Foo(object):
          __slots__ = ['a','b']
          def __init__(self):
            self.a = 1
            self.b = 2

        class Bar(Foo):
          __slots__ = ['c']
          def __init__(self):
            Foo.__init__(self)   # mirrors the classic example above
            self.c = 3

        bar = Bar()
        print bar.__dict__
        > AttributeError: 'Bar' object has no attribute '__dict__'

     We can see that the class instance 'bar' has no __dict__ attribute.
     This is because the slots mechanism allocates space for attribute
     storage directly inside the object, and thus does not use (or need) a
     per-object instance dictionary to store attributes.  Of course, it is
     possible to request a per-instance dictionary by deriving a subclass
     that does not list any slots, e.g., continuing from the example above:

        class Baz(Bar):
          def __init__(self):
            self.d = 4
            self.e = 5

        baz = Baz()
        print baz.__dict__
        > {'e': 5, 'd': 4}

     We have now created a class that has __dict__, but it only contains the
     attributes not stored in slots!  So, should class instances explicitly
     know their attributes?  Or more precisely, should class instances
     always have a __dict__ attribute that contains their attributes?  Don't
     worry, this does not mean that we cannot also have slots, though it
     does have some other implications.  Keep reading...
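
     To see the whole mixed case in one place, here is a minimal sketch.
     It is written in Python 3 syntax, which is an assumption on my part
     (the examples above use Python 2.2), and the explicit base-class
     __init__ calls are added only so that every attribute gets a value.

```python
# Sketch only: Python 3 syntax, class names taken from the examples above.
class Foo(object):
    __slots__ = ['a', 'b']
    def __init__(self):
        self.a = 1
        self.b = 2

class Bar(Foo):
    __slots__ = ['c']
    def __init__(self):
        Foo.__init__(self)
        self.c = 3

class Baz(Bar):                      # no __slots__, so instances get a __dict__
    def __init__(self):
        Bar.__init__(self)
        self.d = 4
        self.e = 5

baz = Baz()
print(sorted(baz.__dict__))          # ['d', 'e']  (non-slot attributes only)
print(baz.a, baz.b, baz.c)           # 1 2 3       (slot values live elsewhere)
print(hasattr(Bar(), '__dict__'))    # False
```

     So code that reflects over __dict__ alone sees only part of the
     object's state.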

  2) Should attribute access follow the same resolution order rules as
     methods?

        class Foo(object):
          __slots__ = ['a']
          def __init__(self):
            self.a = 1

        class Bar(Foo):
          __slots__ = ('a',)
          def __init__(self):
            self.a = 2

        bar = Bar()
        print bar.a
        > 2
        print super(Bar,bar).a   # this doesn't actually work
        > 2 or 1?

    Don't worry -- this isn't a proposal and no, this doesn't actually work.
    However, the current implementation only narrowly escapes this trap:

        print bar.__class__.a.__get__(bar)
        > 2
        print bar.__class__.__base__.a.__get__(bar)
        > AttributeError: a

    Ok, let me explain what just happened.  Slots are implemented via the
    new descriptor interface.  In short, descriptor objects behave like
    properties and support __get__ and __set__ methods.  Each slot
    descriptor is told the offset within an object instance where its
    PyObject* lives, and it proxies get and set operations to that
    location.  So getting and setting slots involves:

        # print bar.a
        a_descr = bar.__class__.a
        print a_descr.__get__(bar)

        # bar.a = 1
        a_descr = bar.__class__.a
        a_descr.__set__(bar, 1)
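
    The equivalence between plain attribute access and the descriptor
    calls can be checked with a small sketch.  It uses Python 3 syntax (an
    assumption; this post is about Python 2.2), and the descriptor type
    name shown is what CPython reports.

```python
# Sketch only: Python 3 syntax, run against CPython.
class Foo(object):
    __slots__ = ['a']

foo = Foo()
a_descr = Foo.__dict__['a']      # the slot descriptor stored on the class

a_descr.__set__(foo, 1)          # same effect as: foo.a = 1
print(a_descr.__get__(foo))      # 1, the same value as foo.a
print(foo.a)                     # 1
print(type(a_descr).__name__)    # member_descriptor (on CPython)
```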

    So, above we get an attribute error when trying to access the 'a' slot
    defined by Foo, since Bar's __init__ only initialized the 'a' slot
    defined by Bar.  However, with a little ugliness you can do the
    following:

        # Get the descriptors for Foo.a and Bar.a
        a_foo_descr = bar.__class__.__base__.a
        a_bar_descr = bar.__class__.a

        # Initialize the slot defined by Foo, which Bar.a hides
        a_foo_descr.__set__(bar, 1)

        print bar.a
        > 2
        print a_foo_descr.__get__(bar)
        > 1
        print a_bar_descr.__get__(bar)
        > 2

    In other words, the namespace for slots is not really flat, although
    there is no simple way to access these hidden attributes since method
    resolution order rules are not invoked by default.
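
    The hidden slot can be demonstrated end to end with the sketch below.
    It uses Python 3 syntax (an assumption; the post targets Python 2.2),
    and the variable names foo_slot_initially_set, hidden and visible are
    mine, introduced only for illustration.

```python
# Sketch only: Python 3 syntax; shows two separate storage cells named 'a'.
class Foo(object):
    __slots__ = ['a']

class Bar(Foo):
    __slots__ = ('a',)           # shadows Foo's slot; both slots still exist
    def __init__(self):
        self.a = 2               # normal lookup finds Bar's descriptor first

bar = Bar()
a_foo_descr = Foo.__dict__['a']
a_bar_descr = Bar.__dict__['a']

# Foo's slot was never assigned, so reading it fails.
try:
    a_foo_descr.__get__(bar)
    foo_slot_initially_set = True
except AttributeError:
    foo_slot_initially_set = False

# Write to the hidden slot directly through Foo's descriptor.
a_foo_descr.__set__(bar, 1)
hidden = a_foo_descr.__get__(bar)
visible = a_bar_descr.__get__(bar)

print(foo_slot_initially_set, hidden, visible, bar.a)   # False 1 2 2
```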

  3) Should __slots__ be immutable?

     The __slots__ attribute of a new-style class lists all of the slots
     defined by that class.  It is represented as whatever sequence type
     was given when the class was declared:

       print Foo.__slots__
       > ['a']
       print Bar.__slots__
       > ('a',)

     This allows us to do things like:

       Foo.__slots__.append('b')
       foo = Foo()
       foo.b = 42
       > AttributeError: 'Foo' object has no attribute 'b'

     So modifying the slots does not do what one may expect.  This is
     because slot descriptors and the space for slots are only allocated
     when the classes are created (i.e., when they are inherited from
     'object', or from an object that descends from 'object').
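
     A minimal sketch of that behavior follows.  It is written in Python 3
     syntax (an assumption; this post discusses Python 2.2), and the
     variable name assigned is mine.

```python
# Sketch only: Python 3 syntax.
class Foo(object):
    __slots__ = ['a']

Foo.__slots__.append('b')        # mutates the list object, nothing else

foo = Foo()
try:
    foo.b = 42
    assigned = True
except AttributeError:
    assigned = False

# The list changed, but no descriptor or storage was created for 'b'.
print(Foo.__slots__, assigned, 'b' in Foo.__dict__)   # ['a', 'b'] False False
```

     The __slots__ value is read once at class creation, so a later change
     to the list leaves it out of step with the real layout.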

  4) Should __slots__ be flat?

     Bar.__slots__ only lists the slots specifically requested in Bar, even
     though it inherits from Foo, which has its own slots.  Which would be
     the preferable behavior?

       class Foo(object):
         __slots__ = ('a','b')
       class Bar(Foo):
         __slots__ = ('c','d')

       print Bar.__slots__
       > ('c','d')           # current behavior
       > ('a','b','c','d')   # alternate behavior
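
     The alternate, flat view can be computed today by walking the class
     hierarchy.  The helper all_slots below is hypothetical (it is not part
     of Python), and the sketch uses Python 3 syntax, which is an assumption
     since this post is about Python 2.2.

```python
# Sketch only: Python 3 syntax; all_slots is a hypothetical helper.
def all_slots(cls):
    """Collect slot names from every class in the MRO, base classes first."""
    names = []
    for klass in reversed(cls.__mro__):
        slots = klass.__dict__.get('__slots__', ())
        if isinstance(slots, str):       # a single string names one slot
            slots = (slots,)
        for name in slots:
            if name not in names:        # report a shadowed name only once
                names.append(name)
    return tuple(names)

class Foo(object):
    __slots__ = ('a', 'b')

class Bar(Foo):
    __slots__ = ('c', 'd')

print(Bar.__slots__)      # ('c', 'd')
print(all_slots(Bar))     # ('a', 'b', 'c', 'd')
```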

    Clearly, this issue goes back to the ideas addressed in question 1.  If
    slot descriptors are not stored in a per-instance dictionary, then
    the assumptions on how to do object reflection must change.  However,
    which version of the following code do you prefer to print all
    attributes of a given object:

      Old style or if descriptors are stored in obj.__dict__:

        if hasattr(obj,'__dict__'):
          print ''.join([ '%s=%s' % nameval for nameval in obj.__dict__.items() ])

      Currently in Python 2.2 (and still not quite correct):

        def print_slot_attrs(obj,cls=None):
          if not cls:
            cls = obj.__class__
          # 'descr' is the class-level entry; 'obj' stays the instance
          for name,descr in cls.__dict__.items():
            if str(type(descr)) == "<type 'member_descriptor'>":
              if hasattr(obj, name):
                print "%s=%s" % (name,getattr(obj, name))
          # Recurse into the base classes to find inherited slots
          for base in cls.__bases__:
            print_slot_attrs(obj,base)

        if hasattr(obj,'__dict__'):
          print [ '%s=%s' % nameval for nameval in obj.__dict__.items() ]

      Flat and immutable slot namespace:

        a  = [ '%s=%s' % nameval for nameval in obj.__dict__.items() ]
        a += [ '%s=%s' % (name,getattr(obj, name)) for name in obj.__slots__ \
                                         if hasattr(obj, name) ]
        print ''.join(a)

  So, which one of these do you want to support or explain to a new user?
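
  For comparison, here is a sketch of reflection under the current,
  non-flat behavior.  It uses Python 3 syntax and inspect.ismemberdescriptor,
  both assumptions on my part since neither matches the Python 2.2 code
  above, and describe is a hypothetical helper name.  It reads each slot
  through its own descriptor, so shadowed slots are reported per class.

```python
# Sketch only: Python 3 syntax; describe is a hypothetical helper.
import inspect

def describe(obj):
    """Return 'name=value' strings for slot attributes and __dict__ entries."""
    items = []
    for klass in type(obj).__mro__:
        for name, descr in vars(klass).items():
            if inspect.ismemberdescriptor(descr):
                try:
                    value = descr.__get__(obj)
                except AttributeError:   # slot exists but was never assigned
                    continue
                items.append('%s.%s=%r' % (klass.__name__, name, value))
    for name, value in getattr(obj, '__dict__', {}).items():
        items.append('%s=%r' % (name, value))
    return items

class Foo(object):
    __slots__ = ['a']

class Bar(Foo):
    __slots__ = ['b']

class Baz(Bar):
    pass

baz = Baz()
baz.a, baz.b, baz.c = 1, 2, 3
print(sorted(describe(baz)))    # ['Bar.b=2', 'Foo.a=1', 'c=3']
```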


Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com