[Python-Dev] Pickling instances of nested classes

Walter Dörwald walter at livinglogic.de
Tue Mar 29 23:41:16 CEST 2005


Currently instances of nested classes can't be pickled. For old style classes
unpickling fails to find the class:

>>> import cPickle
>>> class Foo:
...    class Bar:
...       pass
...
>>> cPickle.loads(cPickle.dumps(Foo.Bar()))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'module' object has no attribute 'Bar'

For new style classes, pickling itself fails:

>>> class Foo(object):
...    class Bar(object):
...       pass
...
>>> cPickle.loads(cPickle.dumps(Foo.Bar()))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
cPickle.PicklingError: Can't pickle <class '__main__.Bar'>: attribute lookup __main__.Bar failed

I think this should be fixed (see below for use cases). There's an old bug
report open for this (http://www.python.org/sf/633930).

Classes would need access to their fully qualified name, starting from the
module level (perhaps in a new class attribute __fullname__?), and the pickle
machinery would have to use this name instead of __name__ when storing class
names.
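
To make the lookup side concrete, here is a minimal sketch of resolving such a
dotted name with repeated getattr() calls. The helper resolve_fullname() and
the stand-in module "demo" are made up for illustration, and __fullname__ is
set by hand because nothing sets it automatically today:

```python
import types


def resolve_fullname(module, fullname):
    """Walk a dotted name such as "Foo.Bar" starting at the module object."""
    obj = module
    for part in fullname.split("."):
        obj = getattr(obj, part)
    return obj


class Foo(object):
    class Bar(object):
        pass


# Set by hand here; the idea is that the class statement would do this.
Foo.Bar.__fullname__ = "Foo.Bar"

# A stand-in module so the sketch does not depend on where it is run.
demo = types.ModuleType("demo")
demo.Foo = Foo

# Looking up the plain __name__ at module level fails, as in the tracebacks
# above, but the dotted name leads back to the nested class.
assert not hasattr(demo, Foo.Bar.__name__)
assert resolve_fullname(demo, Foo.Bar.__fullname__) is Foo.Bar
```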

The second part should be easy to implement. The first part is harder. It can't
be done by changing meta classes, because classes are created "inside out", i.e.
in:

class Foo:
   class Bar:
      class Baz:
         pass

when the Baz class is created, the Bar class hasn't been assigned a name yet.
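
The creation order can be observed with a small metaclass that records each
class as it is created. Recorder, Base and the list "created" are names made
up for this sketch. The nested classes pick up the metaclass by inheriting
from Base:

```python
created = []


class Recorder(type):
    """Metaclass that notes the name of every class it creates."""

    def __init__(cls, name, bases, namespace):
        super(Recorder, cls).__init__(name, bases, namespace)
        created.append(name)


# Create the base class by calling the metaclass directly.
Base = Recorder("Base", (object,), {})


class Foo(Base):
    class Bar(Base):
        class Baz(Base):
            pass


# The innermost class is finished first, the outermost last.
assert created == ["Base", "Baz", "Bar", "Foo"]
```

So when the metaclass sees Baz, neither Bar nor Foo exists yet, and all it is
given is the plain name "Baz".
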
Another problem is that it should only be done for classes *defined* inside
other classes, i.e. in

class Foo:
   pass

class Bar:
   Baz = Foo

Foo should remain unchanged.
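
A small sketch of why the alias case matters (Qux is an extra class added only
for contrast): the alias is the very same class object, so anything written to
it would also show up on the module-level Foo.

```python
class Foo(object):
    pass


class Bar(object):
    Baz = Foo  # only an alias: Foo was defined at module level

    class Qux(object):  # really defined inside Bar
        pass


# Both names refer to one and the same class object.
assert Bar.Baz is Foo
assert Bar.Baz.__name__ == "Foo"

# Only Qux was defined inside Bar, so only Qux should become "Bar.Qux".
assert Bar.Qux.__name__ == "Qux"
```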

For this I guess the parser would have to be changed to somehow keep a stack of
currently "open" class definitions, so the __fullname__ attribute (or dict
entry) can be set once a new class statement is encountered.


Of course the remaining interesting question is whether this change is worth it
and whether there are use cases for it. Possible use cases require that:

1) There's a collection of named objects in the scope of a class.
2) These objects are subclasses of other (nested or global) classes.
3) There are instances of those classes.

There are two use cases that immediately come to mind: XML and ORMs.

For XML: 1) Those classes are the element types and the nested classes
are the attributes. 2) Being able to define those attributes as separate
classes makes it possible to implement custom functionality (e.g. for
validation or for handling certain attribute types like URLs, colors etc.)
and 3) Those classes get instantiated when an XML tree is created or parsed.
A framework that does this (and my main motivation for writing this :)) is
XIST (see http://www.livinglogic.de/Python/xist/).

For the ORM case: Each top level class defines a table and the nested
classes are the fields, i.e. something like this:

class person(Table):
	class firstname(Varchar):
		"The person's first name"
		null = False
	class lastname(Varchar):
		"The person's last name"
		null = False
	class password(Varchar):
		"Login password"
		def validate(self, value):
			if not any(c.islower() for c in value) or \
			   not any(c.isupper() for c in value) or \
			   not any(c.isdigit() for c in value):
				raise InvalidField("password requires a mix of upper and lower"
				                   "case letters and digits")

Instances of these classes are the records read from the database. A framework
that does something similar to this (although AFAIK fields are not nested
classes) is SQLObject (http://sqlobject.org/).


So is this change wanted? useful? implementable with reasonable effort? Or
just not worth it?

Bye,
   Walter Dörwald




