[lxml-dev] type of custom objects in XML-tree disappears

Thanks for your fast response. With your help I was able to make my implementation work, but on the design side I'm not quite satisfied with the solution.

You wrote "You do not have to call set_element_class_lookup() each time as it sticks with the parser." ... but that's exactly what I want to do: I need a tree with objects of different(!) classes. And I want to put additional data in those objects.

Yes, I read this one on http://codespeak.net/lxml/element_classes.html: "There is one thing to know up front. Element classes must not have a constructor, neither must there be any internal state (except for the data stored in the underlying XML tree). Element instances are created and garbage collected at need, so there is no way to predict when and how often a constructor would be called."

So to me it seems that lxml is not designed to manage objects of different classes in an XML tree. With a hack (calling setElementClassLookup() before each creation of an element) I'm able to create such trees, and for some test cases it seems to work fine, despite the nasty things I reported. But it's not quite satisfying:
- it's not free of side effects when I change the default setElementClassLookup ...
- I'm afraid of running into garbage collection bugs, like the one I reported.

Finally, my question: is it possible for lxml to support that feature officially? For example, by providing an explicit factory call like etree.createElement(class)?

With best regards,
Markus

Hi,

Markus Hillebrand wrote:
> You wrote "You do not have to call set_element_class_lookup() each time as it sticks with the parser." ... but that's exactly what I want to do: I need a tree with objects of different(!) classes.
That's perfectly fine, and there are ways to do that. You can write your own lookup scheme based on XML attributes, namespace/tag, some general element information or even full-fledged tree traversal:

http://codespeak.net/lxml/dev/element_classes.html

What will /not/ work is merging elements from different trees into a tree that has a different lookup scheme and then having them reappear in the new tree with their original class, *except* if you keep Python references to each object, which prevents them from being garbage collected and thus prevents the lookup from being re-evaluated on access. In that case, however, you have to take care that tree modifications are reflected in the cache.
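As a rough illustration of an attribute-based lookup scheme, here is a minimal sketch. The element names, the "kind" attribute and the Circle/Square classes are made up for this example; only AttributeBasedElementClassLookup, ElementBase and set_element_class_lookup() come from lxml.

```python
from lxml import etree


class Circle(etree.ElementBase):
    # No constructor and no instance state: everything is read from the XML.
    def describe(self):
        return "circle r=" + self.get("r")


class Square(etree.ElementBase):
    def describe(self):
        return "square side=" + self.get("side")


# Map the value of the (hypothetical) "kind" attribute to an element class.
# Elements without a matching "kind" value fall back to the default class.
lookup = etree.AttributeBasedElementClassLookup(
    "kind", {"circle": Circle, "square": Square})

# The lookup is registered once and sticks with this parser.
parser = etree.XMLParser()
parser.set_element_class_lookup(lookup)

root = etree.fromstring(
    '<shapes><s kind="circle" r="2"/><s kind="square" side="3"/></shapes>',
    parser)

# Each child is wrapped in the class selected by its "kind" attribute.
descriptions = [child.describe() for child in root]
```

Because the class is derived from data in the tree, it is re-derived correctly whenever a proxy is recreated after garbage collection.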
> And I want to put additional data in those objects.
You can do that as long as it is reflected in the underlying XML (e.g. through attributes in a separate namespace). lxml.objectify does this for type annotations, for example. You can /not/ do that if you want to keep the state in the Python objects - again, with the exception of keeping the Python objects alive.
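One way this might look in practice is sketched below: the extra data is exposed as a Python property but stored in a namespaced attribute. The namespace URI, the "priority" attribute and the Task class are assumptions for illustration, not part of lxml.

```python
from lxml import etree

# Hypothetical namespace used only to hold the extra per-element data.
META_NS = "http://example.org/meta"
PRIORITY = "{%s}priority" % META_NS


class Task(etree.ElementBase):
    # The property reads and writes the XML attribute, so nothing is
    # kept in the Python proxy object itself.
    @property
    def priority(self):
        return int(self.get(PRIORITY, "0"))

    @priority.setter
    def priority(self, value):
        self.set(PRIORITY, str(value))


# Use Task for every element created by this parser.
parser = etree.XMLParser()
parser.set_element_class_lookup(etree.ElementDefaultClassLookup(element=Task))

root = etree.fromstring("<tasks><task/></tasks>", parser)
root[0].priority = 5            # stored in the tree as a namespaced attribute
restored = root[0].priority     # read back from the tree, possibly via a new proxy

# The data also survives serialisation and re-parsing.
reparsed = etree.fromstring(etree.tostring(root), parser)
```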
> So to me it seems that lxml is not designed to manage objects of different classes in an XML tree.
It totally is, it just depends on how sophisticated your lookup scheme is.
> With a hack (calling setElementClassLookup() before each creation of an element)
You are assuming here that you can keep state in the Element objects, which in this case means: their Python type.
> I'm able to create such trees, and for some test cases it seems to work fine, despite the nasty things I reported. But it's not quite satisfying:
> - it's not free of side effects when I change the default setElementClassLookup ...
Which is discouraged anyway, but helpful in some I-know-what-I'm-doing cases where you are sure you're the only one to play with this.
> Finally, my question: is it possible for lxml to support that feature officially? For example, by providing an explicit factory call like etree.createElement(class)?
No, lxml will not keep state in its Element proxies. But again, objectify uses something similar: it determines the Python type of an element value (string, int, ...) and stores it as a namespaced attribute. When it has to determine the Element class to use for such an element, it uses that information in the class lookup. When serialising, you can choose to either keep these attributes in or to "deannotate()" the tree first.

Stefan
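The objectify behaviour described above can be observed with a short sketch like the following. The element names "answer" and "title" are arbitrary choices for this example.

```python
from lxml import etree, objectify

root = objectify.Element("root")
root.answer = 42        # objectify records the Python type as an attribute
root.title = "lxml"

# The type annotation lives in the tree as a namespaced attribute.
pytype_before = root.answer.get(objectify.PYTYPE_ATTRIBUTE)
annotated_xml = etree.tostring(root)

# Strip the annotations before serialising, if they are not wanted.
objectify.deannotate(root, cleanup_namespaces=True)
pytype_after = root.answer.get(objectify.PYTYPE_ATTRIBUTE)
plain_xml = etree.tostring(root)
```

After deannotate(), objectify no longer has the stored annotation to rely on and infers the type from the element text on access.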
participants (2): Markus Hillebrand, Stefan Behnel