Hi,

Höke, Christof wrote:
From: Ian Bicking [mailto:firstname.lastname@example.org]
::first-letter is hard because it doesn't match any object in lxml. If it returned a string like "A", it would be very much out of context (e.g., no parent pointer), and it would be hard to do anything useful with it. To make it useful, I think it would require some new stringish object that also looked nodeish (e.g., had a .getparent() method).
I considered that a while ago, as it would also be interesting for XPath in general. However, we currently use fast Python string creation functions to serve the API level. At the time, I concluded that replacing them with the instantiation of a custom string object would almost certainly slow things down and complicate them, just to serve a rather special use case. Although I might want to take another look at that today...
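To make the idea of a "stringish object that also looks nodeish" a bit more concrete, here is a minimal pure-Python sketch. It is only an illustration under assumptions: it uses the standard library's xml.etree.ElementTree instead of lxml, and the names ParentedStr and first_letter are hypothetical, not part of any lxml API.

```python
import xml.etree.ElementTree as ET


class ParentedStr(str):
    """Hypothetical str subtype that remembers the element it came from."""

    def __new__(cls, value, parent):
        self = super().__new__(cls, value)
        # The strong reference keeps the parent element alive for as long
        # as the string result exists.
        self._parent = parent
        return self

    def getparent(self):
        """Return the element this string was taken from."""
        return self._parent


def first_letter(element):
    """Hypothetical helper: first non-whitespace character of element.text.

    Returns None when the element has no text content.
    """
    text = (element.text or "").lstrip()
    if not text:
        return None
    return ParentedStr(text[0], element)


root = ET.fromstring("<p>  A paragraph of text.</p>")
letter = first_letter(root)
```

The result still behaves like a plain string (comparison, slicing, formatting), but it carries the extra reference to its parent, which is exactly the per-result overhead that a simple string creation avoids.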
I had a discussion about that starting over at the XML-SIG list last summer.
http://permalink.gmane.org/gmane.comp.python.lxml.devel/2763
That was the first time I heard about DOM ranges and when I dug into that a little deeper, I almost ran away screaming. IMVHO, that's an insane and horribly complicated spec.
:first-letter should actually be element.text I guess (which would be a string in lxml currently?)
I don't really know the lxml API but would it be possible to define a subtype for element.text for this case? But you are right, a more general approach would certainly be better.
You could define a special string (and unicode) subtype for the result of an XPath expression, which is determined independently of the Python API level. The freedom is right there. However, it would mean you have to search for and (in the worst case) instantiate the parent Element to make sure it won't go away while the string result exists. That's some overhead compared to a simple string creation. As I said, I might reconsider that, but I'm not very confident.

Stefan