Currently instances of nested classes can't be pickled. For old style classes unpickling fails to find the class:
>>> import cPickle
>>> class Foo:
...     class Bar:
...         pass
...
>>> cPickle.loads(cPickle.dumps(Foo.Bar()))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'module' object has no attribute 'Bar'
For new style classes, pickling itself fails:
>>> class Foo(object):
...     class Bar(object):
...         pass
...
>>> cPickle.loads(cPickle.dumps(Foo.Bar()))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
cPickle.PicklingError: Can't pickle <class '__main__.Bar'>: attribute lookup __main__.Bar failed
I think this should be fixed (see below for use cases). There's an old bug report open for this (http://www.python.org/sf/633930).
Classes would need access to their fully qualified name (starting from the module level) (perhaps in a new class attribute __fullname__?) and the pickle machinery would have to use this name when storing class names instead of __name__.
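As a rough illustration of the second part, here is a minimal sketch of how an unpickler could resolve a dotted, fully qualified name by walking attributes from the module downwards. It is written in Python 3 syntax; `resolve`, the module name `demo_module`, and the idea that the dotted string comes from a `__fullname__` attribute are assumptions made for the example, not existing pickle API.

```python
import functools
import sys
import types


def resolve(module_name, fullname):
    """Look up a dotted name such as 'Foo.Bar', starting at the module.

    `fullname` plays the role of the proposed (hypothetical) __fullname__.
    """
    module = sys.modules[module_name]
    return functools.reduce(getattr, fullname.split("."), module)


# Build a throwaway module so the sketch is self-contained.
demo = types.ModuleType("demo_module")  # hypothetical module name


class Foo:
    class Bar:
        pass


demo.Foo = Foo
sys.modules["demo_module"] = demo

# Looking up the bare __name__ at module level is what fails today.
bare_lookup_works = hasattr(demo, Foo.Bar.__name__)

# Looking up the dotted name succeeds.
found = resolve("demo_module", "Foo.Bar")
print(bare_lookup_works, found is Foo.Bar)
```

The lookup itself is the easy half; the sketch says nothing about how the class would learn its dotted name in the first place.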
The second part should be easy to implement. The first part is harder. It can't be done by changing metaclasses, because classes are created "inside out", i.e. in:
class Foo:
    class Bar:
        class Baz:
            pass
when the Baz class is created, the Bar class hasn't been assigned a name yet. Another problem is that it should only be done for classes *defined* inside other classes, i.e. in
class Foo:
    pass

class Bar:
    Baz = Foo
Foo should remain unchanged.
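Both problems can be observed with a small recording metaclass. This is only a sketch in Python 3 syntax; `Recorder`, `created`, `Plain` and `Holder` are names invented for the illustration. It logs each class at the moment it is built, together with the attributes in its body that are themselves classes.

```python
created = []


class Recorder(type):
    """Metaclass that logs each class as it is built."""

    def __new__(mcls, name, bases, namespace):
        # Names in the class body whose values are classes built by Recorder.
        inner = [key for key, value in namespace.items()
                 if isinstance(value, Recorder)]
        created.append((name, inner))
        return super().__new__(mcls, name, bases, namespace)


class Foo(metaclass=Recorder):
    class Bar(metaclass=Recorder):
        class Baz(metaclass=Recorder):
            pass


class Plain(metaclass=Recorder):
    pass


class Holder(metaclass=Recorder):
    Alias = Plain  # an alias, not a nested definition


for entry in created:
    print(entry)
```

Baz is logged before Bar, and Bar before Foo, so the innermost class is finished before its enclosing class exists. The entry for Holder lists Alias in exactly the same way the entry for Foo lists Bar, which is why a metaclass alone cannot tell an alias from a real nested definition.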
For this I guess the parser would have to be changed to somehow keep a stack of currently "open" class definitions, so the __fullname__ attribute (or dict entry) can be set once a new class statement is encountered.
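To make the "stack of open class definitions" idea concrete, here is a sketch that computes dotted names from source text using the standard `ast` module (Python 3). It is not a patch to the parser or compiler; `FullNameCollector` and the sample source are made up for the illustration, and the sketch makes no attempt to treat classes defined inside functions specially.

```python
import ast


class FullNameCollector(ast.NodeVisitor):
    """Collect dotted names of class statements, using a stack of open classes."""

    def __init__(self):
        self.open_classes = []  # class statements currently being visited
        self.fullnames = []

    def visit_ClassDef(self, node):
        self.open_classes.append(node.name)
        self.fullnames.append(".".join(self.open_classes))
        self.generic_visit(node)  # descend into the class body
        self.open_classes.pop()


source = """
class Foo:
    class Bar:
        class Baz:
            pass

class Other:
    Alias = Foo
"""

collector = FullNameCollector()
collector.visit(ast.parse(source))
print(collector.fullnames)
```

Because only class statements push onto the stack, the assignment `Alias = Foo` produces no new name, which matches the requirement that classes merely referenced inside another class stay unchanged.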
Of course the remaining interesting question is if this change is worth it and if there are use cases for it. Possible use cases require that:
1) There's a collection of named objects in the scope of a class.
2) These objects are subclasses of other (nested or global) classes.
3) There are instances of those classes.
There are two use cases that immediately come to mind: XML and ORMs.
For XML: 1) Those classes are the element types and the nested classes are the attributes. 2) Being able to define those attributes as separate classes makes it possible to implement custom functionality (e.g. for validation or for handling certain attribute types like URLs, colors etc.) and 3) Those classes get instantiated when an XML tree is created or parsed. A framework that does this (and my main motivation for writing this :)) is XIST (see http://www.livinglogic.de/Python/xist/).
For the ORM case: Each top level class defines a table and the nested classes are the fields, i.e. something like this:
class person(Table):
    class firstname(Varchar):
        "The person's first name"
        null = False

    class lastname(Varchar):
        "The person's last name"
        null = False

    class password(Varchar):
        "Login password"
        def validate(self, value):
            if not any(c.islower() for c in value) and \
               not any(c.isupper() for c in value) and \
               not any(c.isdigit() for c in value):
                raise InvalidField("password requires a mix of upper and lower"
                                   "case letters and digits")

Instances of these classes are the records read from the database. A framework that does something similar to this (although AFAIK fields are not nested classes) is SQLObject (http://sqlobject.org/).
So is this change wanted? useful? implementable with reasonable effort? Or just not worth it?
Bye, Walter Dörwald
Walter Dörwald wrote:
So is this change wanted? useful? implementable with reasonable effort? Or just not worth it?
I think it is just not worth it. This means I won't attempt to implement it. I think I originally defined the __module__ attribute for classes to support better pickling (and defined it to be a string to avoid cycles); we considered the nested classes case at the time and concluded "oh well, don't do that then".
Regards, Martin
Martin v. Löwis wrote:
Walter Dörwald wrote:
So is this change wanted? useful? implementable with reasonable effort? Or just not worth it?
I think it is just not worth it. This means I won't attempt to implement it. I think I originally defined the __module__ attribute for classes to support better pickling (and defined it to be a string to avoid cycles); we considered the nested classes case at the time and concluded "oh well, don't do that then".
It sounds like this change would have a far greater chance of getting into Python if a patch existed (I guess this is the case for all changes ;)).
Bye, Walter Dörwald
Walter Dörwald wrote:
For XML: 1) Those classes are the element types and the nested classes are the attributes. 2) Being able to define those attributes as separate classes makes it possible to implement custom functionality (e.g. for validation or for handling certain attribute types like URLs, colors etc.) and 3) Those classes get instantiated when an XML tree is created or parsed. A framework that does this (and my main motivation for writing this :)) is XIST (see http://www.livinglogic.de/Python/xist/).
For the ORM case: Each top level class defines a table and the nested classes are the fields, i.e. something like this:
class person(Table):
    class firstname(Varchar):
        "The person's first name"
        null = False

    class lastname(Varchar):
        "The person's last name"
        null = False

    class password(Varchar):
        "Login password"
        def validate(self, value):
            if not any(c.islower() for c in value) and \
               not any(c.isupper() for c in value) and \
               not any(c.isdigit() for c in value):
                raise InvalidField("password requires a mix of upper and lower"
                                   "case letters and digits")

Instances of these classes are the records read from the database. A framework that does something similar to this (although AFAIK fields are not nested classes) is SQLObject (http://sqlobject.org/).
So is this change wanted? useful? implementable with reasonable effort? Or just not worth it?
Notice that in these cases metaclasses are often involved, or easily could be. If pickling honored __reduce__ or __reduce_ex__ on metaclasses (which right now it doesn't, treating their instances as normal classes), one could roll one's own solution without burdening the language with pickling of nested classes in general. So I think it would make more sense to add support for honoring __reduce__/__reduce_ex__ on metaclasses.
Samuele Pedroni wrote:
Walter Dörwald wrote:
[Use cases for pickling instances of nested classes]
So is this change wanted? useful? implementable with reasonable effort? Or just not worth it?
Notice that in these cases metaclasses are often involved, or easily could be. If pickling honored __reduce__ or __reduce_ex__ on metaclasses (which right now it doesn't, treating their instances as normal classes), one could roll one's own solution without burdening the language with pickling of nested classes in general. So I think it would make more sense to add support for honoring __reduce__/__reduce_ex__ on metaclasses.
Sorry, I don't understand: In most cases it is possible to work around the nested classes problem by implementing custom pickling functionality (__getstate__/__setstate__/__reduce__/__reduce_ex__). But it is probably impossible to implement this once and for all in a common base class, because there's no way to find the real name of the nested class (or any other handle that makes it possible to retrieve the class from the module on unpickling).
And having the full name of the class available would certainly help in debugging.
Bye, Walter Dörwald
Walter Dörwald wrote:
Samuele Pedroni wrote:
Walter Dörwald wrote:
[Use cases for pickling instances of nested classes]
So is this change wanted? useful? implementable with reasonable effort? Or just not worth it?
Notice that in these cases metaclasses are often involved, or easily could be. If pickling honored __reduce__ or __reduce_ex__ on metaclasses (which right now it doesn't, treating their instances as normal classes), one could roll one's own solution without burdening the language with pickling of nested classes in general. So I think it would make more sense to add support for honoring __reduce__/__reduce_ex__ on metaclasses.
Sorry, I don't understand: In most cases it is possible to work around the nested classes problem by implementing custom pickling functionality (__getstate__/__setstate__/__reduce__/__reduce_ex__). But it is probably impossible to implement this once and for all in a common base class, because there's no way to find the real name of the nested class (or any other handle that makes it possible to retrieve the class from the module on unpickling).
And having the full name of the class available would certainly help in debugging.
that's probably the only plus point but the names would be confusing wrt modules vs. classes.
My point was that enabling reduce hooks at the metaclass level probably has other interesting applications and is far less complicated to implement than your proposal. It does not further complicate the notion of what happens at class creation time, it avoids the implementation costs (for all Python implementations) of your proposal, and it still allows fairly generic solutions to the problem at hand, because the solution can be formulated at the metaclass level.
If pickle.py is patched along these lines [*] (strawman impl, not much tested but test_pickle.py still passes, needs further work to support __reduce_ex__ and cPickle would need similar changes) then this example works:
class HierarchMeta(type):
    """metaclass such that inner classes know their outer class, with pickling support"""
    def __new__(cls, name, bases, dic):
        sub = [x for x in dic.values() if isinstance(x,HierarchMeta)]
        newtype = type.__new__(cls, name, bases, dic)
        for x in sub:
            x._outer_ = newtype
        return newtype

    def __reduce__(cls):
        if hasattr(cls, '_outer_'):
            return getattr, (cls._outer_, cls.__name__)
        else:
            return cls.__name__

# uses the HierarchMeta metaclass
class Elm:
    __metaclass__ = HierarchMeta

    def __init__(self, **stuff):
        self.__dict__.update(stuff)

    def __repr__(self):
        return "<%s %s>" % (self.__class__.__name__, self.__dict__)

# example
class X(Elm):
    class Y(Elm):
        pass

    class Z(Elm):
        pass

import pickle

x = X(a=1)
y = X.Y(b=2)
z = X.Z(c=3)

xs = pickle.dumps(x)
ys = pickle.dumps(y)
zs = pickle.dumps(z)

print pickle.loads(xs)
print pickle.loads(ys)
print pickle.loads(zs)

pedronis$ python2.4 example.py
<X {'a': 1}>
<Y {'b': 2}>
<Z {'c': 3}>
[*]:
--- pickle.py.orig      Wed Mar 30 20:37:14 2005
+++ pickle.py   Thu Mar 31 21:09:41 2005
@@ -298,12 +298,19 @@
             issc = issubclass(t, TypeType)
         except TypeError: # t is not a class (old Boost; see SF #502085)
             issc = 0
+        reduce = None
         if issc:
-            self.save_global(obj)
-            return
+            for x in t.__mro__:
+                if x is not object and '__reduce__' in x.__dict__:
+                    reduce = x.__dict__['__reduce__']
+                    break
+            else:
+                self.save_global(obj)
+                return

         # Check copy_reg.dispatch_table
-        reduce = dispatch_table.get(t)
+        if not reduce:
+            reduce = dispatch_table.get(t)
         if reduce:
             rv = reduce(obj)
         else:
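For readers who want to see what the `(getattr, (outer, name))` reduce value buys, here is a small sketch that applies the same idea outside of pickle. It uses Python 3 metaclass syntax rather than the `__metaclass__` attribute from the example above, and `reduce_class` and `rebuild` are hypothetical helpers standing in for what the patched pickler and the unpickler would do; they are not pickle API. Unlike the original, the sketch checks the class's own `__dict__` for `_outer_`, so a subclass does not inherit the marker.

```python
class HierarchMeta(type):
    """Inner classes built with this metaclass learn their outer class."""

    def __new__(mcls, name, bases, dic):
        newtype = super().__new__(mcls, name, bases, dic)
        for value in dic.values():
            if isinstance(value, HierarchMeta):
                value._outer_ = newtype
        return newtype


def reduce_class(cls):
    """Hypothetical stand-in for the metaclass-level __reduce__."""
    if "_outer_" in vars(cls):
        # Nested class: "fetch attribute <name> from the outer class".
        return getattr, (cls._outer_, cls.__name__)
    # Top-level class: a plain name, looked up in the module.
    return cls.__name__


def rebuild(reduced, namespace):
    """Hypothetical stand-in for the unpickling side."""
    if isinstance(reduced, str):
        return namespace[reduced]
    func, args = reduced
    return func(*args)


class Elm(metaclass=HierarchMeta):
    pass


class X(Elm):
    class Y(Elm):
        pass


module_namespace = {"Elm": Elm, "X": X}  # plays the role of the module
reduced_y = reduce_class(X.Y)
print(reduced_y[1][1], rebuild(reduced_y, module_namespace) is X.Y)
print(reduce_class(X))
```

A real pickler would reduce the outer class recursively in the same way, so arbitrarily deep nesting bottoms out at a module-level name.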
Samuele Pedroni wrote:
[...] And having the full name of the class available would certainly help in debugging.
that's probably the only plus point but the names would be confusing wrt modules vs. classes.
You'd probably need a different separator in repr. XIST does this:
>>> from ll.xist.ns import html
>>> html.a.Attrs.href
<attribute class ll.xist.ns.html:a.Attrs.href at 0x8319284>
My point was that enabling reduce hooks at the metaclass level probably has other interesting applications and is far less complicated to implement than your proposal. It does not further complicate the notion of what happens at class creation time, it avoids the implementation costs (for all Python implementations) of your proposal, and it still allows fairly generic solutions to the problem at hand, because the solution can be formulated at the metaclass level.
Pickling classes like objects (i.e. by using the pickling methods in their (meta-)classes) solves only the second part of the problem: Finding the nested classes in the module on unpickling. The other problem is to add additional info to the inner class, which gets pickled and makes it findable on unpickling.
If pickle.py is patched along these lines [*] (strawman impl, not much tested but test_pickle.py still passes, needs further work to support __reduce_ex__ and cPickle would need similar changes) then this example works:
class HierarchMeta(type):
    """metaclass such that inner classes know their outer class, with pickling support"""
    def __new__(cls, name, bases, dic):
        sub = [x for x in dic.values() if isinstance(x,HierarchMeta)]
I did something similar to this in XIST, but the problem with this approach is that in:
class Foo(Elm):
    pass

class Bar(Elm):
    Baz = Foo
the class Foo will get its _outer_ set to Bar although it shouldn't.
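The effect is easy to reproduce. The following sketch restates the naive metaclass in Python 3 syntax (an assumption; the thread itself uses Python 2.4) and shows that the alias is enough to mark Foo as nested, even though a lookup of "Foo" on Bar could never succeed.

```python
class HierarchMeta(type):
    """Naive version: every class-valued attribute is treated as nested."""

    def __new__(mcls, name, bases, dic):
        newtype = super().__new__(mcls, name, bases, dic)
        for value in dic.values():
            if isinstance(value, HierarchMeta):
                value._outer_ = newtype
        return newtype


class Elm(metaclass=HierarchMeta):
    pass


class Foo(Elm):
    pass


class Bar(Elm):
    Baz = Foo  # alias only; Foo is defined at module level


wrongly_marked = Foo._outer_ is Bar
# A reduce value of (getattr, (Bar, 'Foo')) could not be resolved later,
# because Bar only has an attribute called 'Baz'.
resolvable = hasattr(Bar, Foo.__name__)
print(wrongly_marked, resolvable)
```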
[...]
    def __reduce__(cls):
        if hasattr(cls, '_outer_'):
            return getattr, (cls._outer_, cls.__name__)
        else:
            return cls.__name__
I like this approach: Instead of hardcoding how references to classes are pickled (pickle the __name__), delegate it to the metaclass.
BTW, if classes and functions are picklable, why aren't modules?
>>> import urllib, cPickle
>>> cPickle.dumps(urllib.URLopener)
'curllib\nURLopener\np1\n.'
>>> cPickle.dumps(urllib.splitport)
'curllib\nsplitport\np1\n.'
>>> cPickle.dumps(urllib)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.4/copy_reg.py", line 69, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle module objects
We'd just have to pickle the module name.
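A sketch of what "just pickle the module name" could look like, using a per-pickler dispatch table in Python 3 (where `copy_reg` is spelled `copyreg`). `reduce_module` and `ModulePickler` are names made up for this example; the approach assumes the module can simply be imported again by name on the unpickling side.

```python
import copyreg
import importlib
import io
import pickle
import types
import urllib


def reduce_module(module):
    """Reduce a module to 'import it again by name'."""
    return importlib.import_module, (module.__name__,)


class ModulePickler(pickle.Pickler):
    # Per-pickler dispatch table, so the global copyreg table stays untouched.
    dispatch_table = copyreg.dispatch_table.copy()
    dispatch_table[types.ModuleType] = reduce_module


buffer = io.BytesIO()
ModulePickler(buffer).dump(urllib)

# The stream only contains a reference to importlib.import_module plus the
# module name, so a plain unpickler can load it.
restored = pickle.loads(buffer.getvalue())
print(restored is urllib)
```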
Bye, Walter Dörwald
Walter Dörwald wrote:
Samuele Pedroni wrote:
[...] And having the full name of the class available would certainly help in debugging.
that's probably the only plus point but the names would be confusing wrt modules vs. classes.
You'd probably need a different separator in repr. XIST does this:
>>> from ll.xist.ns import html
>>> html.a.Attrs.href
<attribute class ll.xist.ns.html:a.Attrs.href at 0x8319284>
My point was that enabling reduce hooks at the metaclass level probably has other interesting applications and is far less complicated to implement than your proposal. It does not further complicate the notion of what happens at class creation time, it avoids the implementation costs (for all Python implementations) of your proposal, and it still allows fairly generic solutions to the problem at hand, because the solution can be formulated at the metaclass level.
Pickling classes like objects (i.e. by using the pickling methods in their (meta-)classes) solves only the second part of the problem: Finding the nested classes in the module on unpickling. The other problem is to add additional info to the inner class, which gets pickled and makes it findable on unpickling.
If pickle.py is patched along these lines [*] (strawman impl, not much tested but test_pickle.py still passes, needs further work to support __reduce_ex__ and cPickle would need similar changes) then this example works:
class HierarchMeta(type):
    """metaclass such that inner classes know their outer class, with pickling support"""
    def __new__(cls, name, bases, dic):
        sub = [x for x in dic.values() if isinstance(x,HierarchMeta)]
I did something similar to this in XIST, but the problem with this approach is that in:
class Foo(Elm):
    pass

class Bar(Elm):
    Baz = Foo
the class Foo will get its _outer_ set to Bar although it shouldn't.
this should approximate that behavior better: [not tested]
import sys
....
    def __new__(cls, name, bases, dic):
        sub = [x for x in dic.values() if isinstance(x,HierarchMeta)]
        newtype = type.__new__(cls, name, bases, dic)
        for x in sub:
            if not hasattr(x, '_outer_') and getattr(sys.modules.get(x.__module__), x.__name__, None) is not x:
                x._outer_ = newtype
        return newtype
.....
we don't set _outer_ if a way to pickle the class is already there
Samuele Pedroni wrote:
[...]
this should approximate that behavior better: [not tested]
import sys
....
    def __new__(cls, name, bases, dic):
        sub = [x for x in dic.values() if isinstance(x,HierarchMeta)]
        newtype = type.__new__(cls, name, bases, dic)
        for x in sub:
            if not hasattr(x, '_outer_') and getattr(sys.modules.get(x.__module__), x.__name__, None) is not x:
                x._outer_ = newtype
        return newtype
.....
we don't set _outer_ if a way to pickle the class is already there
This doesn't fix
class Foo:
    class Bar:
        pass

class Baz:
    Bar = Foo.Bar
but this should be a simple fix.
Bye, Walter Dörwald