Steve Howe wrote:
Friday, May 26, 2006, 5:00:08 AM, you wrote:
IMHO, the only two reasons why these three functions are there are
1) they are ET compatible 2) they are simple
The question of whether implementing findall() through xpath() would be a good idea has come up a few times. It was generally agreed (and demonstrated in code) that this would too easily break ET compatibility, which was not considered worth it.
OK, so the reason is compatibility. Two points:
1) Shouldn't it be clearly documented ?
Well, regarding documentation, lxml has (unofficially) always said: "we let Fredrik write the documentation, and only if we must (or want to) do it differently, we document it ourselves." ElementTree's find*() methods are documented, so all we add is "lxml supports full XPath expressions through the xpath() function".
2) Since xpath() supports a superset of the expressions findall() does, isn't the compatibility ensured ?
No, it's not a superset at all. findall() uses '{namespace}tag' notation, which is absolutely invalid in XPath. lxml has an ETXPath class that allows you to do this, but calling that for the general XPath case is just overhead, as we would still be trying to extract namespaces from it instead of passing it straight into libxml2's parser.
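To illustrate the notation mismatch described above, here is a minimal sketch, assuming a current lxml installation. The namespace URI and the sample document are made up for illustration; only findall(), xpath(), ETXPath and the XPathError exception class come from lxml itself.

```python
from lxml import etree

NS = "http://example.com/ns"  # hypothetical namespace, for illustration only
root = etree.fromstring(
    '<root xmlns:a="%s"><a:item>1</a:item><item>2</item></root>' % NS
)

# ElementTree-style path: the namespace is written inline as {namespace}tag.
et_hits = root.findall("{%s}item" % NS)

# XPath: the namespace is referenced through a prefix mapping instead.
xp_hits = root.xpath("a:item", namespaces={"a": NS})

# The same {namespace}tag string is not a valid XPath expression.
try:
    root.xpath("{%s}item" % NS)
    clark_ok_in_xpath = True
except etree.XPathError:
    clark_ok_in_xpath = False

# ETXPath accepts {namespace}tag notation inside an XPath expression.
etx_hits = etree.ETXPath("{%s}item" % NS)(root)
```

All three successful lookups should select the same namespaced element, while the plain xpath() call with the '{namespace}tag' string is expected to fail. That is why findall() cannot simply be forwarded to xpath() without an extra rewriting step.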
It makes no sense to cripple etree's findall() in order to support only what ET's findall() does.
It wouldn't make sense if it weren't for compatibility. Currently, you can exchange code between lxml, ElementTree and cElementTree with relatively little extra consideration, and I mean in all directions. Making more functions incompatible (without convincing reasons) is just asking for trouble ("hey, lxml didn't raise an exception on this expression!!").

The reasons for leaving it as is are:
1) it works
2) it is 100% compatible now and trivial to keep compatible
3) it is not trivial to reimplement without breaking compatibility
4) changing it makes things slower, as it requires parsing the expression twice (once in lxml, once in libxml2), and evaluation is not any faster

The reasons to change it are:
1) it supports different expressions than xpath(), which is documented (although perhaps not clearly so) and is the reason why there is an xpath() method

Honestly, unless there are good reasons to change it, I'm absolutely +1 for keeping the current state.

Stefan