Stefan Behnel <stefan_ml@behnel.de> writes:
Dieter Maurer, 23.06.2012 12:02: ...
I propose that future `lxml` versions should include a public `safe_release` function for such purposes.
Maybe a new "removeNodeFromDocument()" API function could first check for proxies, and then either deallocate or fix up the tree to be stand-alone.
That would be ideal.
...
Another, but less serious problem: some `libxmlsec` functions replace a node inside the tree (e.g. a node is replaced by an `EncryptedData` node representing the node in an encrypted form). It would be nice if I could "retarget" an `lxml` proxy referencing the replaced node to point to the replacing node. This way, `lxml` objects with references to the proxy would see the new state rather than the confusing picture resulting from the proxy now referring to an unlinked node. ... Of course, the "retarget"ing is not trivial. It is not sufficient to give the proxy a new "_c_node"; its class, too, might need to be adapted. This would be possible as long as the two classes had the same "C" layout for their objects. Is `lxml` supposed to support proxy classes with differing "C" layout? (I expect "yes" as the answer.)
From the POV of lxml the proxy is just a reference to an object of type (or subtype of) _Element. The problem is that the user most likely holds another reference to it
This means one cannot replace the proxy object by a new one, but one could change the proxy object's content (e.g. set a new "_c_node", set a new "__class__"). As I understood, "lxml" ensures that there is at most one proxy for any given "c_node" (by putting a proxy reference into the "_private" of the "c_node"). Thereby, changing the proxy content changes all "views" of the "lxml" application on the respective "c_node".
and there is no way we can exchange the object (or even its class) that that reference points to. These things are a lot less trivial at the C level than in Python (and even there they can have surprising side effects).
I am not sure that I understand your argument (though I fully appreciate your reluctance to provide a public API). In my case, I am not inside a complicated `lxml` context where `lxml` code could hold direct references to internal attributes of the proxy I want to retarget. The only such references are in my binding function -- and of course, I must ensure that they do not get confused.
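To make the retargeting idea concrete, here is a toy sketch in pure Python. It is not `lxml` code: `CNode`, `Proxy`, `get_proxy` and `retarget` are hypothetical names invented for this illustration, and the weak back-reference stored in `_private` only mimics what I understand `lxml` to do at the C level. It shows why changing the content of the single proxy changes every "view" the application holds, and it says nothing about the harder question of adapting the proxy's class.

```python
import weakref


class CNode:
    """Hypothetical stand-in for a libxml2 node (not the real C struct)."""

    def __init__(self, name):
        self.name = name
        self._private = None  # weak back-reference to the proxy, if any


class Proxy:
    """Hypothetical stand-in for an lxml element proxy."""

    def __init__(self, c_node):
        self._c_node = c_node

    @property
    def tag(self):
        return self._c_node.name


def get_proxy(c_node):
    """Return the one live proxy for c_node, creating it if needed."""
    ref = c_node._private
    proxy = ref() if ref is not None else None
    if proxy is None:
        proxy = Proxy(c_node)
        c_node._private = weakref.ref(proxy)
    return proxy


def retarget(proxy, new_c_node):
    """Point an existing proxy at another node, in place.

    Assumes new_c_node has no proxy of its own yet.
    """
    proxy._c_node._private = None      # old node no longer owns a proxy
    proxy._c_node = new_c_node
    new_c_node._private = weakref.ref(proxy)


plain = CNode("Data")
encrypted = CNode("EncryptedData")

view_a = get_proxy(plain)
view_b = get_proxy(plain)      # second lookup yields the same object
same_proxy = view_a is view_b

retarget(view_a, encrypted)    # every holder of the proxy sees the change
tag_after = view_b.tag
```

In this model, the binding function is the only place that touches `_c_node` directly, which is the situation I describe above.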
For the moment, I will tell the user of my `libxmlsec` binding: forget any `lxml` reference into an encrypted or decrypted document, including a reference to its root tree, and always rebuild references from the operation's return value.
Basically, what this means is that Elements that the user holds a reference to won't change during the transformation but may no longer be at their original place afterwards.
The worst behaviour I have observed:

    doc = parse(StringIO("<?...><!-- ... --><Envelope>...</Envelope>"))
    encrypt(..., doc.getroot())
    print tostring(doc)
    <Envelope>...</Envelope>

That means that encrypting the root node of an "_ElementTree" has stripped this tree of its processing instruction and its comment. I understand why this happens, but from a user perspective, it can be really surprising.
Perfectly reasonable if you ask me, because changing the tree is the whole point of doing that transformation. The same happens in XInclude, for example. Or even just when you change the tag name of an Element. None of those cases replaces the implementation of an Element that the user holds. After all, he or she could still need the original Element for some reason.
As the example above shows, he sees neither the original nor the new element.