[lxml-dev] advantages of libxml
Hi!

Here are several features I miss in lxml:
- ability to get the namespace of a node without parsing the .tag property (which is concatenated on the library side, so double the amount of meaningless work is done)
- ability to get the short name of a node
- ability to get the parent node without .xpath('../')[0], which seems overkill to me
- ability to get the absolute xpath of a node

I may be wrong, and some of the things I want may be inconsistent with lxml's design and usage patterns; correct me in that case. Anyway, lxml is the best XML-processing library for Python, thanks for your work!
Slou wrote:
> Here are several features I miss in lxml:
> - ability to get namespace of the node without parsing .tag property
> (which is concatenated on library side, so double amount of meaningless work is done)
There's a 'prefix' property now in the trunk version. It should be easy enough to add a namespace URI property as well.
> - ability to get short name of the node
This would be the local name, right? Shouldn't be too hard to add a localname property as well.
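To make the request concrete, here is a minimal sketch of the string parsing that a namespace URI property and a localname property would make unnecessary. It uses only the standard library's xml.etree.ElementTree, which stores tags in the same '{uri}local' form; the helper name split_tag and the example.com namespace are made up for illustration and are not part of lxml.

```python
import xml.etree.ElementTree as ET


def split_tag(tag):
    """Split a '{uri}local' tag into (uri, local).

    Hypothetical helper: uri is None when the tag carries no namespace.
    Assumes tag is a string (ordinary elements only).
    """
    if tag.startswith("{"):
        uri, _, local = tag[1:].partition("}")
        return uri, local
    return None, tag


root = ET.fromstring('<a:doc xmlns:a="http://example.com/ns"><plain/></a:doc>')
namespaced = split_tag(root.tag)     # tag is '{http://example.com/ns}doc'
unqualified = split_tag(root[0].tag)  # tag is 'plain'
```

Every caller that wants only the local name has to repeat this split, even though the library already had both parts before joining them.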
> - ability to get parent node without .xpath('../')[0] which seems overkill to me
Yeah, though I'm a bit wary of extending the ElementTree API so fundamentally. Since we have the ability with libxml2, we should exploit it though, I guess.
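For comparison, this is roughly what the plain ElementTree API forces on a caller who needs parents, sketched with the standard library only. Elements there keep no back-reference, so the usual workaround is a child-to-parent dictionary; the name build_parent_map is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_parent_map(root):
    """Map every element below root to its parent (hypothetical helper)."""
    # Iterating an element yields its direct children.
    return {child: parent for parent in root.iter() for child in parent}


root = ET.fromstring("<doc><sec><p/></sec></doc>")
parents = build_parent_map(root)
p = root[0][0]
parent_of_p = parents[p]  # the <sec> element
```

The map has to be rebuilt whenever the tree changes, which is the cost a parent accessor backed by libxml2's own parent pointer would avoid.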
> - ability to get absolute xpath of node
This one I need to think about; I believe libxml2 has a facility for this, but research would need to be done. If you can find out the API in libxml2 and submit a patch, that'd be great!
Thanks for the feedback! I'll consider implementing your suggestions. It shouldn't be too hard to implement a bunch of read-only properties for this. I'm wary of making them writable, as that might involve DOM-like complexity, but read-only should be simple.

Getting the xpath expression of a node requires some more puzzling, though; let's talk about this more.

Regards,
Martijn
Thanks for your response )
I guess ElementTree does not have it because, by implementation, one element can be a subelement of several different parents. libxml2 strictly forbids that. Maybe we can add .find('..') as an alternative? (Although I would not need that functionality if we had the ability to get the absolute xpath.)
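As a point of reference for the .find('..') idea, the sketch below shows how the '..' step behaves in the path language of today's standard-library xml.etree.ElementTree. This is an illustration of that module only, not of lxml at the time of this thread: '..' works inside a path evaluated from an ancestor, but cannot climb above the element find() was called on.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<doc><sec><p/></sec></doc>")
p = root[0][0]

# From the root, '..' can step back up from a matched descendant.
via_root = root.find(".//p/..")

# From the element itself there is nothing above it to reach,
# so the lookup yields None rather than the real parent.
via_self = p.find("..")
```

So a path-based parent lookup still needs a handle on some ancestor, which is why a direct parent accessor is the more convenient form of the request.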
As far as I know, there is no way to do it with a single API function call. But I need it badly and will try to implement it anyway. Maybe you can give me some tips on it?
Hi, On Tue, 2005-07-05 at 14:26 +0400, Slou wrote:
Something like this (pseudo code, attrs not handled):

    xpath = ""
    pos = 1
    while node != NULL and node != document:
        if node.prev == NULL:
            # first sibling reached: emit the step, then move up to the parent
            if xpath != "":
                xpath = "/" + xpath
            xpath = "node()[" + str(pos) + "]" + xpath
            node = node.parent
            pos = 1
        else:
            # one more preceding sibling
            pos = pos + 1
            node = node.prev
    if xpath != "":
        xpath = "/" + xpath

Regards,
Kasimier
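The pseudo code above can be turned into something runnable along the following lines. This is only a sketch under stated assumptions, not the libxml2 or lxml implementation: it uses the standard library's xml.etree.ElementTree, which exposes neither parent nor previous-sibling links, so it builds a parent map first. It also swaps the node()[pos] steps for tag[pos] steps, where pos counts siblings with the same tag, because ElementTree does not represent text as separate nodes. The function name absolute_path is hypothetical.

```python
import xml.etree.ElementTree as ET


def absolute_path(root, target):
    """Return a path such as /doc[1]/sec[2] for target.

    Hypothetical helper; returns None if target is not inside root's tree.
    """
    # ElementTree elements have no parent or sibling pointers,
    # so build a child -> parent map once.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    if target is not root and target not in parent_of:
        return None

    steps = []
    node = target
    while node is not None:
        parent = parent_of.get(node)
        if parent is None:
            pos = 1  # the root element is the only element at its level
        else:
            # 1-based position among siblings that share this tag
            same_tag = [sib for sib in parent if sib.tag == node.tag]
            pos = same_tag.index(node) + 1
        steps.append("%s[%d]" % (node.tag, pos))
        node = parent
    return "/" + "/".join(reversed(steps))


root = ET.fromstring("<doc><sec><p/></sec><sec><p/><p/></sec></doc>")
path = absolute_path(root, root[1][1])
```

As in the pseudo code, attributes are not handled, and namespaced tags would appear in the path in their '{uri}local' form.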
participants (4)
- Kasimier Buchcik
- Martijn Faassen
- Olivier Grisel
- Slou