Dear all, dear Holger,

Thank you for your pointer. It made me move away from the idea that pickle was the issue. Reading the docs further, now that pickle is ruled out as the cause, I discovered that iter(node) does not behave the same way for the two element types:

| for node in lxml.Element            | iterates over the children of said Element |
| for node in lxml.ObjectifiedElement | iterates over the siblings of said Element |

Knowing that, I was able to fix my implementation for the objectified ones. This seems like an odd difference between the two, and if someone from the team could explain it, I would be eager to know more.

Anyway, thanks for your help.

Best,
Thibault

On 09/19/2016 01:35 PM, Holger Joukl wrote:
Hi,
I have found some severe discrepancies between an objectified tree and its pickle-dumped and -loaded cache. It might have been reported somewhere, but I could not find it. [...] To show the bug in a limited, readable form, I made a repo here: https://github.com/PonteIneptique/test-lxml. It runs on the latest lxml (3.6.4) on Python 3.5 (I did not test on Python 2.7). When I loop over the same tree in three different forms (etree.parse, objectify, pickled objectify), the first and the second are all right, but the third shows differences really quickly (at the node level). I would love to know whether this is "expected" behavior (i.e. no focus has been put on checking that pickling works, which I would totally understand) or an unknown bug.

IMHO you're using the wrong parser. lxml.objectify uses its own dedicated XML parser to support its specialized element lookup.
See lxml.objectify.pyx: https://github.com/lxml/lxml/blob/master/src/lxml/lxml.objectify.pyx#L1735-L...
So you'd need to use X = objectify.makeparser() instead of X = XMLParser, and then pass this parser to objectify.parse(), etree.parse(), ... for the comparison.
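A minimal sketch of that suggestion, assuming a small made-up XML document (the `XML` string and the variable names are illustrative and do not come from the test repo). It is meant to show that the parser passed in, rather than the `parse()` function called, appears to decide which element class the tree is built from:

```python
from io import BytesIO

from lxml import etree, objectify

# Hypothetical sample document, only for illustration.
XML = b"<root><a>1</a><a>2</a><b>3</b></root>"

plain_parser = etree.XMLParser()           # standard lxml.etree parser
objectify_parser = objectify.makeparser()  # parser with objectify's element lookup

# etree.parse() with a plain parser: plain elements.
tree_plain = etree.parse(BytesIO(XML), plain_parser)
# objectify.parse() with a plain parser: still plain elements.
tree_mixed = objectify.parse(BytesIO(XML), plain_parser)
# etree.parse() with the objectify parser: objectified elements.
tree_objectified = etree.parse(BytesIO(XML), objectify_parser)

for name, tree in [
    ("plain", tree_plain),
    ("mixed", tree_mixed),
    ("objectified", tree_objectified),
]:
    print(name, type(tree.getroot()))
```

Using the same `objectify.makeparser()` parser for every tree in a comparison keeps all of them in the "objectified" world, so differences that remain are not caused by mixed element classes.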
I haven't looked too closely but I think pickle support uses objectify's default parser, so only Tree3 is actually an "objectified" tree:
>>> type(Tree1.getroot())
<type 'lxml.etree._Element'>
>>> type(Tree2.getroot())
<type 'lxml.etree._Element'>
>>> type(Tree3.getroot())
<type 'lxml.objectify.ObjectifiedElement'>

In other words, you wouldn't want to mix "standard" and "objectified" lxml trees. It's usually a bad idea.
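The practical consequence of mixing the two element classes is the iteration difference described at the top of this thread. The following is a rough sketch on a made-up document (all names are illustrative): a plain element iterates over its children, while an objectified element iterates over itself and its following siblings with the same tag. It also round-trips an objectified root through pickle, which should come back objectified.

```python
import pickle

from lxml import etree, objectify

# Hypothetical sample document, only for illustration.
XML = b"<root><a>1</a><a>2</a><b>3</b></root>"

plain_root = etree.fromstring(XML)
obj_root = objectify.fromstring(XML)

# Plain element: iteration walks the children.
plain_tags = [child.tag for child in plain_root]

# Objectified element: iteration yields the element itself plus
# same-tag siblings; a root without a parent yields only itself.
obj_tags = [node.tag for node in obj_root]
a_texts = [node.text for node in obj_root.a]

# Children of an objectified element are reached explicitly.
obj_child_tags = [child.tag for child in obj_root.iterchildren()]

# Pickle round trip of an objectified element.
restored = pickle.loads(pickle.dumps(obj_root))

print(plain_tags)      # ['a', 'a', 'b']
print(obj_tags)        # ['root']
print(a_texts)         # ['1', '2']
print(obj_child_tags)  # ['a', 'a', 'b']
print(type(restored))
```

Code that has to handle both kinds of tree can call `iterchildren()` instead of relying on `for node in element`, since that method walks the children in both cases.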
Holger
_________________________________________________________________
Mailing list for the lxml Python XML toolkit - http://lxml.de/
lxml@lxml.de
https://mailman-mail5.webfaction.com/listinfo/lxml