Dear all, dear Holger,

Thank you for your pointer. It moved me away from the idea that pickle was the issue. Reading the docs more closely, now that pickle is no longer the suspect, I discovered that iter(node) does not behave the same way for the two element types:

for node in lxml.etree.Element iterates over the children of said Element
for node in lxml.objectify.ObjectifiedElement iterates over said Element and its siblings with the same tag
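
To make the difference concrete, here is a minimal sketch. The sample document and the variable names are made up for illustration and are not taken from my repo:

```python
from lxml import etree, objectify

# Hypothetical sample document, used only to illustrate the point.
XML = "<root><a>1</a><a>2</a><b>3</b></root>"

plain_root = etree.fromstring(XML)
obj_root = objectify.fromstring(XML)

# Plain etree: iterating over an element yields its children.
plain_children = [child.tag for child in plain_root]

# objectify: iterating over an element yields that element and its
# siblings with the same tag, not its children.
same_tag_siblings = [el.text for el in obj_root.a]

# The root has no parent, so iterating over it only yields itself.
root_iteration = [el.tag for el in obj_root]

# To walk the children of an objectified element, ask for them explicitly.
obj_children = [child.tag for child in obj_root.iterchildren()]

print(plain_children, same_tag_siblings, root_iteration, obj_children)
```

So code that relies on "for child in node" has to use iterchildren() (or getchildren()) to work on both kinds of tree.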

Knowing that, I was able to fix my implementation for the objectified ones. This seems like an odd difference between the two, and if someone from the team could explain it, I would be eager to know more.

Anyway, thanks for your help

Best

Thibault


On 09/19/2016 01:35 PM, Holger Joukl wrote:
Hi,

I have found some severe discrepancies between an objectified tree and
its pickle-dump and -loaded cache. It might be reported somewhere but I
could not find it.
[...]
To show the bug in a limited, readable form, I made a repo
here: https://github.com/PonteIneptique/test-lxml. It runs on the
latest lxml, 3.6.4, on Python 3.5 (I did not test on Python 2.7). When I
loop over the same tree in three different forms (etree.parse,
objectify, pickled objectify), the first and the second are alright but
the third shows differences really quickly (at the node level). I would love
to know if this is "expected" behavior (i.e. no focus has been put on
checking that pickling works, which I would totally understand) or if it is
an unknown bug.
IMHO you're using the wrong parser. lxml.objectify uses its
own dedicated XML parser to support its specialized element lookup.

See lxml.objectify.pyx:
https://github.com/lxml/lxml/blob/master/src/lxml/lxml.objectify.pyx#L1735-L1744


So you'd need to use X = objectify.makeparser() instead of X = XMLParser,
and then pass that parser to objectify.parse(), etree.parse(), ... for the comparison.
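
A rough sketch of what that looks like, assuming a small in-memory document instead of the files in your repo (the XML and the variable names are only illustrative):

```python
import io

from lxml import etree, objectify

# Hypothetical sample document standing in for the real input file.
XML = b"<root><a>1</a><a>2</a></root>"

# Default etree parser: the tree is made of plain etree elements.
plain_tree = etree.parse(io.BytesIO(XML))

# objectify.makeparser() returns an XML parser configured with
# objectify's element lookup, so etree.parse() builds an objectified tree.
obj_parser = objectify.makeparser()
tree_via_etree = etree.parse(io.BytesIO(XML), obj_parser)

# objectify.parse() uses objectify's own default parser.
tree_via_objectify = objectify.parse(io.BytesIO(XML))

for name, tree in [("plain", plain_tree),
                   ("etree + makeparser", tree_via_etree),
                   ("objectify.parse", tree_via_objectify)]:
    root = tree.getroot()
    print(name, isinstance(root, objectify.ObjectifiedElement))
```

With the same parser on every code path, the trees you compare are all of the same kind.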

I haven't looked too closely, but I think pickle support uses objectify's
default parser, so only Tree3 is actually an "objectified" tree:


>>> type(Tree1.getroot())
<type 'lxml.etree._Element'>
>>> type(Tree2.getroot())
<type 'lxml.etree._Element'>
>>> type(Tree3.getroot())
<type 'lxml.objectify.ObjectifiedElement'>
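
The pickle side can be checked the same way. This is only a sketch on a made-up document, and it relies on my assumption above that unpickling goes through objectify's default parser:

```python
import pickle

from lxml import objectify

# Hypothetical sample document, not the one from the repo.
root = objectify.fromstring("<root><a>1</a><a>2</a></root>")

# Round-trip the objectified element through pickle.
restored = pickle.loads(pickle.dumps(root))

# The restored element should again be an objectified element,
# with the same content as the original.
restored_is_objectified = isinstance(restored, objectify.ObjectifiedElement)
restored_values = [el.text for el in restored.a]

print(type(restored), restored_values)
```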

          
In other words, you wouldn't want to mix "standard" and "objectified" lxml
trees. It's usually a bad idea.

Holger

