[lxml-dev] Pickling objectified trees
Hi,

the other day I had to pickle objectified trees. I just thought to share my findings.

Pickling is about serialization. IMHO the natural serialization of an objectified tree is its XML representation. So the following basically does that:

--------------------------
import copy_reg

import lxml.etree
import lxml.objectify


def treeFactory(state):
    """Un-Pickle factory."""
    return lxml.objectify.fromstring(state)

copy_reg.constructor(treeFactory)


def reduceObjectifiedElement(object):
    """Reduce function for lxml.objectify trees.

    See http://docs.python.org/lib/pickle-protocol.html for details.
    """
    return (treeFactory, (lxml.etree.tostring(object), ))

copy_reg.pickle(lxml.objectify.ObjectifiedElement,
                reduceObjectifiedElement, treeFactory)
-----------------------------------------

You might consider just registering the reduce function in lxml itself. Shouldn't hurt, should it.

-- 
Christian Zagrodnick
gocept gmbh & co. kg · forsterstrasse 29 · 06112 halle/saale
www.gocept.com · fon. +49 345 12298894 · fax. +49 345 12298891
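For readers on Python 3, where `copy_reg` was renamed to `copyreg`, the recipe above might look roughly like the sketch below. It is an adaptation, not the code that went into lxml: it assumes `lxml.objectify.fromstring` can serve directly as the unpickle constructor (so no separate factory is needed), and the name `reduce_objectified_element` is made up for illustration. Since the reply in this thread says the idea was applied to trunk, registering it yourself may well be redundant on current releases.

```python
import copyreg
import pickle

from lxml import etree, objectify


def reduce_objectified_element(element):
    """Reduce function (hypothetical name): serialize the element as XML."""
    # The unpickler will call objectify.fromstring(<serialized bytes>).
    return (objectify.fromstring, (etree.tostring(element),))


# Register the reduce function for elements whose exact type is
# ObjectifiedElement (copyreg dispatches on the exact type).
copyreg.pickle(objectify.ObjectifiedElement, reduce_objectified_element)

root = objectify.fromstring("<root><a>1</a></root>")
restored = pickle.loads(pickle.dumps(root))
```

The round trip goes through the XML text, so `restored` is a freshly parsed copy rather than the original object.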
Hi, Christian Zagrodnick wrote:
the other day I had to pickle objectified trees. I just thought to share my findings.
You might consider just registering the reduce function in lxml itself.
Interesting. Sure, why not? Objectify is totally about data classes after all.

Applied to the trunk (with small changes).

Thanks,
Stefan
Hi! On 2007-02-25 15:06:00 +0100, Stefan Behnel <stefan_ml@behnel.de> said:
Christian Zagrodnick wrote:
the other day I had to pickle objectified trees. I just thought to share my findings.
You might consider just registering the reduce function in lxml itself.
Interesting. Sure, why not? Objectify is totally about data classes after all.
Applied to the trunk (with small changes).
I found a may-be-considered-a-bug. The script below shows that when pickling the <cp/> node, the processing instruction is omitted. When trying to pickle the root tree, an error is raised.

The problem is that we pickle the `xml` object and get it back with all its descendants. But after unpickling, the whole tree is not as it was before. So that's actually a bug.

I guess the best would be to

a) always serialize the roottree on pickle and
b) remember which part of the tree actually was pickled so on unpickle this exact object can be restored.

I got to play with this a bit before I can deliver some useful code, though.

Regards,
Christian

----------------
import pickle

import lxml.etree
import lxml.objectify

xml = lxml.objectify.fromstring('<cp/><?foo?>')
print pickle.dumps(xml)
print pickle.dumps(xml.getroottree())
--------------------

-----------------
clxml.objectify
fromstring
p0
(S'<cp/>'
p1
tp2
Rp3
.
Traceback (most recent call last):
  File "pi.py", line 8, in ?
    print pickle.dumps(xml.getroottree())
  File "/Users/zagy/development/python/lib/python2.4/pickle.py", line 1386, in dumps
    Pickler(file, protocol, bin).dump(obj)
  File "/Users/zagy/development/python/lib/python2.4/pickle.py", line 231, in dump
    self.save(obj)
  File "/Users/zagy/development/python/lib/python2.4/pickle.py", line 313, in save
    rv = reduce(self.proto)
  File "/Users/zagy/development/python/lib/python2.4/copy_reg.py", line 69, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle _ElementTree objects
--------------------------------------

-- 
Christian Zagrodnick · cz@gocept.com
gocept gmbh & co. kg · forsterstraße 29 · 06112 halle (saale) · germany
http://gocept.com · tel +49 345 1229889 4 · fax +49 345 1229889 1
Zope and Plone consulting and development
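Points a) and b) can be sketched in Python 3 roughly as follows. This is only an illustration of the idea, not code from lxml: the helper names `dumps_with_root` and `loads_with_root` are hypothetical, and using `getpath()` to remember the pickled element is an assumption about how "which part of the tree" could be recorded. It pickles a plain tuple instead of hooking into the pickle protocol.

```python
import pickle

from lxml import etree, objectify


def dumps_with_root(element):
    """Pickle the whole root tree plus the XPath of `element` inside it."""
    tree = element.getroottree()
    # Serializing the ElementTree (not the element) keeps top-level
    # processing instructions and comments that sit next to the root.
    xml_bytes = etree.tostring(tree)
    # Remember where the pickled element lives in the tree.
    path = tree.getpath(element)
    return pickle.dumps((xml_bytes, path))


def loads_with_root(data):
    """Rebuild the tree and return the element that was originally pickled."""
    xml_bytes, path = pickle.loads(data)
    root = objectify.fromstring(xml_bytes)
    # Look the element up again by its recorded path.
    return root.getroottree().xpath(path)[0]


root = objectify.fromstring("<cp><a>1</a><b>2</b></cp><?foo?>")
restored_b = loads_with_root(dumps_with_root(root.b))
```

Here `restored_b` is the `<b>` element of a freshly parsed tree, so its parent and the trailing processing instruction are available again through `restored_b.getroottree()`.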
On 2008-06-09 09:50:54 +0200, Christian Zagrodnick <cz@gocept.com> said:
Hi!
On 2007-02-25 15:06:00 +0100, Stefan Behnel <stefan_ml@behnel.de> said:
Christian Zagrodnick wrote:
the other day I had to pickle objectified trees. I just thought to share my findings.
You might consider just registering the reduce function in lxml itself.
Interesting. Sure, why not? Objectify is totally about data classes after all.
Applied to the trunk (with small changes).
I found a may-be-considered-a-bug. The script below shows that when
pickling the <cp/> node, the processing instruction is omitted.
When trying to pickle the root tree, an error is raised.
The problem is that we pickle the `xml` object and get it back with
all its descendants. But after unpickling, the whole tree is not as it
was before. So that's actually a bug.
I guess the best would be to
a) always serialize the roottree on pickle and
b) remember which part of the tree actually was pickled so on unpickle
this exact object can be restored.
I got to play with this a bit before I can deliver some useful code, though.
Well for pickling the root node all it really takes is to use the getroottree() for serializing. The fromstring method returns the right object anyway.

-- 
Christian Zagrodnick · cz@gocept.com
gocept gmbh & co. kg · forsterstraße 29 · 06112 halle (saale) · germany
http://gocept.com · tel +49 345 1229889 4 · fax +49 345 1229889 1
Zope and Plone consulting and development
Hi, Christian Zagrodnick wrote:
Well for pickling the root node all it really takes is to use the getroottree() for serializing. The fromstring method returns the right object anyway.
Not quite. Pickling works nicely through tostring(), but the unpickle process must return an ElementTree, and there isn't currently a straightforward unpickle function that takes a string and returns an ElementTree. I'll see how to fix that.

Stefan
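One way such a string-to-ElementTree function could look is sketched below in Python 3. It is an assumption about a possible shape, not the fix that was eventually made in lxml: the helper name `tree_from_bytes` is hypothetical, and it simply feeds the serialized bytes to `lxml.objectify.parse()`, which returns an ElementTree instead of the root element.

```python
import io

from lxml import etree, objectify


def tree_from_bytes(xml_bytes):
    """Parse serialized XML and return an ElementTree (hypothetical helper)."""
    # objectify.parse() reads from a file-like object and returns the
    # tree wrapper, so top-level siblings of the root stay reachable.
    return objectify.parse(io.BytesIO(xml_bytes))


original = objectify.fromstring("<cp/><?foo?>").getroottree()
restored = tree_from_bytes(etree.tostring(original))
```

A reduce function for trees could then return `(tree_from_bytes, (etree.tostring(tree),))`, provided the helper lives in an importable module so that pickle can find it by name.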
participants (2)
- Christian Zagrodnick
- Stefan Behnel