[lxml-dev] [PATCH] element parent access through Element.getparent()

Hi! Here's a little patch that adds a getparent() method to elements. It is missing from the ElementTree API, but libxml (obviously) provides it, so its trivial to expose it to the public API. Hoping for integration, Stefan

Stefan Behnel wrote:
Hi, I second this request! Having access to the parent node is very useful if you're replacing nodes from a tree, as you have to go up to the parent before you can replace the child. The patch below is the same patch, but it provides the parent as a property. This approach is consistent with the already existing properties such a .tag and .attrib. It is however inconsistent with .getroot() and .getchildren(). Martijn please take your pick ;-) Regards, Geert Index: src/lxml/etree.pyx =================================================================== --- src/lxml/etree.pyx (revision 18695) +++ src/lxml/etree.pyx (working copy) @@ -493,6 +493,12 @@ c_ns = self._getNs(ns) tree.xmlSetNs(self._c_node, c_ns) + property parent: + def __get__(self): + if self._c_node.parent.type == tree.XML_DOCUMENT_NODE: + return + return _elementFactory(self._doc, self._c_node.parent) + # not in ElementTree, read-only property prefix: def __get__(self):

Geert Jansen wrote:
You may notice that the methods getchildren() and getroot() return nodes, while the properties tag and attrib return properties of the element itself. I think the decision is obvious. Stefan

Geert Jansen wrote:
Hm, I missed this discussion. I submitted a ``parent`` property as I think this is a bit easier to work with when programming, but the consistency argument makes sense too. I'll ponder it over and may end up changing the trunk... Regards, Martijn

On Thu, 2005-11-03 at 19:05 +0100, Martijn Faassen wrote:
And my two cents. I think that get*() functions are so un-pythonic, that the decision should be obvious. I think that the real need in functions is when result is variable over time or when parameter is needed or some heavy calculations are included. .parent is almost constant and neither it need additional parameters, nor it heavy in calculations. also I think that getchildren() and getroot() should be properties too, but these are for ElementTree compatibility as I understand.

Andrey Tatarinov wrote:
Why, because their name tells you what they do? Explicit is better than implicit. Getter functions are not bad by definition. They tell you that there is something to get that is not changeable in the same way.
There actually is a bit of work involved. You may argue that there is more work involved in calculating the attrib property, but again, that's what the ElementTree-API gives you.
also I think that getchildren() and getroot() should be properties too, but these are for ElementTree compatibility as I understand.
So, just to get you wrong. What you say is: The ElementTree API is un-pythonic and getchildren(), getroot() and getparent() should be replaced by properties if it wasn't for compatibility. Ok, but that's the API people have used for a while and if compatibility is the goal, it must be kept the way it is. I admit one thing: The current problems with compatibility (related to the fact that ElementTree is a document and that Elements are not free between documents) is so big that current ElementTree code is hard to port to lxml anyway. So backward compatibility is currently not much of an issue. Ok, for a sum-up. I agree that a property whould do here, but the parent property is not currently settable (which is what getparent() explicitly tells you). If it was made settable (and maybe deletable), I'd perfectly second the proposal of making it a property. That could also help in getting around some of the weird results of moving elements between documents: If you set the parent property, it's set as a property of the Element. No other parents allowed. :) I don't know if it is a good idea to actually make it settable, but that's to see. If someone can bring up evidence that making it settable makes sense, I'll start supporting the parent property. Stefan

Stefan Behnel wrote:
True, a getter function makes it _more_ explicit that something is not writable. But I disagree on the converse, i.e. that a property should imply something is writable . That's why Python supports separate getter, setter and delete functions for them, any of which can be missing.
As you mention, there are already quite some areas where lxml is not a drop-in replacement for elementtree (e.g. when moving nodes between documents). Therefore, any project moving from elementree to lxml should take a careful look at their code anyway. In my view, compatiblity is nice but should be second to general usability and ease-of-use. Also, if 100% compatiblity is the goal, then as the project moves along and more people start using it there will be the support burden of living up to this expectation. The wonderful thing about elementree is its data model. Whether something is called 'parent' or 'getparent()' is in my view immaterial to this. Regards, Geert

Stefan Behnel wrote:
I agree; thanks for the patch (there are other patches doing the same also around, but hadn't gotten around to it yet). Please add a few unit tests to your patch though -- adding a new method or attribute on the tree require tests. Thanks again! Regards, Martijn

Martijn Faassen schrieb:
Sorry, I only just found the ETreeOnlyTestCase class. The attached patch now implements the test there (which I think is correct as ElementTree does not provide a getparent()). I will also add test cases for the new APIs that I wrote as soon as someone considers them worth integrating. Note, however, that I'm already using them in an application, so the major bugs should be gone by now. Stefan

Stefan Behnel wrote:
Hi, I second this request! Having access to the parent node is very useful if you're replacing nodes from a tree, as you have to go up to the parent before you can replace the child. The patch below is the same patch, but it provides the parent as a property. This approach is consistent with the already existing properties such a .tag and .attrib. It is however inconsistent with .getroot() and .getchildren(). Martijn please take your pick ;-) Regards, Geert Index: src/lxml/etree.pyx =================================================================== --- src/lxml/etree.pyx (revision 18695) +++ src/lxml/etree.pyx (working copy) @@ -493,6 +493,12 @@ c_ns = self._getNs(ns) tree.xmlSetNs(self._c_node, c_ns) + property parent: + def __get__(self): + if self._c_node.parent.type == tree.XML_DOCUMENT_NODE: + return + return _elementFactory(self._doc, self._c_node.parent) + # not in ElementTree, read-only property prefix: def __get__(self):

Geert Jansen wrote:
You may notice that the methods getchildren() and getroot() return nodes, while the properties tag and attrib return properties of the element itself. I think the decision is obvious. Stefan

Geert Jansen wrote:
Hm, I missed this discussion. I submitted a ``parent`` property as I think this is a bit easier to work with when programming, but the consistency argument makes sense too. I'll ponder it over and may end up changing the trunk... Regards, Martijn

On Thu, 2005-11-03 at 19:05 +0100, Martijn Faassen wrote:
And my two cents. I think that get*() functions are so un-pythonic, that the decision should be obvious. I think that the real need in functions is when result is variable over time or when parameter is needed or some heavy calculations are included. .parent is almost constant and neither it need additional parameters, nor it heavy in calculations. also I think that getchildren() and getroot() should be properties too, but these are for ElementTree compatibility as I understand.

Andrey Tatarinov wrote:
Why, because their name tells you what they do? Explicit is better than implicit. Getter functions are not bad by definition. They tell you that there is something to get that is not changeable in the same way.
There actually is a bit of work involved. You may argue that there is more work involved in calculating the attrib property, but again, that's what the ElementTree-API gives you.
also I think that getchildren() and getroot() should be properties too, but these are for ElementTree compatibility as I understand.
So, just to get you wrong. What you say is: The ElementTree API is un-pythonic and getchildren(), getroot() and getparent() should be replaced by properties if it wasn't for compatibility. Ok, but that's the API people have used for a while and if compatibility is the goal, it must be kept the way it is. I admit one thing: The current problems with compatibility (related to the fact that ElementTree is a document and that Elements are not free between documents) is so big that current ElementTree code is hard to port to lxml anyway. So backward compatibility is currently not much of an issue. Ok, for a sum-up. I agree that a property whould do here, but the parent property is not currently settable (which is what getparent() explicitly tells you). If it was made settable (and maybe deletable), I'd perfectly second the proposal of making it a property. That could also help in getting around some of the weird results of moving elements between documents: If you set the parent property, it's set as a property of the Element. No other parents allowed. :) I don't know if it is a good idea to actually make it settable, but that's to see. If someone can bring up evidence that making it settable makes sense, I'll start supporting the parent property. Stefan

Stefan Behnel wrote:
True, a getter function makes it _more_ explicit that something is not writable. But I disagree on the converse, i.e. that a property should imply something is writable . That's why Python supports separate getter, setter and delete functions for them, any of which can be missing.
As you mention, there are already quite some areas where lxml is not a drop-in replacement for elementtree (e.g. when moving nodes between documents). Therefore, any project moving from elementree to lxml should take a careful look at their code anyway. In my view, compatiblity is nice but should be second to general usability and ease-of-use. Also, if 100% compatiblity is the goal, then as the project moves along and more people start using it there will be the support burden of living up to this expectation. The wonderful thing about elementree is its data model. Whether something is called 'parent' or 'getparent()' is in my view immaterial to this. Regards, Geert

Stefan Behnel wrote:
I agree; thanks for the patch (there are other patches doing the same also around, but hadn't gotten around to it yet). Please add a few unit tests to your patch though -- adding a new method or attribute on the tree require tests. Thanks again! Regards, Martijn

Martijn Faassen schrieb:
Sorry, I only just found the ETreeOnlyTestCase class. The attached patch now implements the test there (which I think is correct as ElementTree does not provide a getparent()). I will also add test cases for the new APIs that I wrote as soon as someone considers them worth integrating. Note, however, that I'm already using them in an application, so the major bugs should be gone by now. Stefan
participants (4)
-
Andrey Tatarinov
-
Geert Jansen
-
Martijn Faassen
-
Stefan Behnel