
Hi Steve,

Steve Howe wrote:
Go ahead, try, using KCachegrind is pure fun! :)
Do you have any results (or impressions) on this?
I didn't check, but I don't think it suffers so much from Python performance. As Fredrik said, cElementTree builds Python objects on the way in, so all you should see when /accessing/ data is Python's call overhead rather than any substantial calculations.

I think that's totally the right optimization, but it is difficult to do something similar in lxml, since we also get entire trees from the parser. It wouldn't be a good idea to traverse them to build Python objects - we don't even know if they would be used.

All we could do is cache Python objects once they were built. The Proxy mechanism would be the right place to keep references to text and tag objects. Also, you could change the current way Python element proxies are deallocated to keep them alive as long as any of them is really used. But that's non-trivial.

Anyway, to make me implement that, I would really have to be convinced that it's worth it - and I absolutely don't see enough of a speed-up behind these optimizations to encourage such a huge effort. Especially the text and tag properties are bound by call overhead, not by object creation time.

Stefan
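
A minimal sketch of the caching idea described above, in plain Python. It is not lxml's actual implementation: `FakeCNode` and `CachingProxy` are hypothetical names made up for illustration, and the real proxies are written against libxml2 nodes rather than Python objects. The sketch only shows the shape of the trade-off: the Python string is built on first access and reused afterwards, but every access still pays for a property call.

```python
class FakeCNode:
    """Hypothetical stand-in for a C-level tree node that stores raw bytes."""

    def __init__(self, raw_tag, raw_text):
        self.raw_tag = raw_tag
        self.raw_text = raw_text


class CachingProxy:
    """Hypothetical element proxy that caches the Python objects it builds."""

    _UNSET = object()  # sentinel, because None is a valid value for text

    def __init__(self, node):
        self._node = node
        self._tag = self._UNSET
        self._text = self._UNSET
        self.conversions = 0  # counts how often a Python string was built

    @property
    def tag(self):
        if self._tag is self._UNSET:
            # Build the Python string only on first access.
            self._tag = self._node.raw_tag.decode("utf-8")
            self.conversions += 1
        return self._tag

    @property
    def text(self):
        if self._text is self._UNSET:
            raw = self._node.raw_text
            self._text = None if raw is None else raw.decode("utf-8")
            self.conversions += 1
        return self._text


proxy = CachingProxy(FakeCNode(b"root", b"hello"))
first = proxy.text   # builds and caches the string
second = proxy.text  # returns the cached object, no new conversion
```

The second access skips the conversion but still goes through the property call, which is the overhead the text and tag properties are said to be bound by. The cache also only helps while the proxy itself stays alive, which is why the deallocation scheme would have to change as well.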
participants (1)
-
Stefan Behnel