[lxml-dev] nscleanup branch merged: better namespace handling in lxml

Hi,
I finally found the time to take a second look at the nscleanup branch. I found that the reason for one of the tests failing was not even related to the changes on the branch, so I just merged it into the trunk. So, now lxml has its own implementation for namespace cleanup when moving elements between trees. The main problem that this is meant to solve is the redundant redeclaration of namespaces that already exist in the target tree. This should now be avoided. I also expect it to be faster than the previous version - although I haven't done the benchmarks yet to prove it.
So, please, everyone who had problems with this kind of bug in the past: please check if this problem is gone for your application. And everyone else who wants to help out: please check out the current trunk, build it and test it with your application to see if it still works as expected. This is a change in a rather critical place, so I'd like to have it tested before releasing it to the masses. I'm planning to release a beta version of 1.3 soon, so that it becomes easier to test. But I'd be happy to have some feedback on this beforehand.
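A minimal sketch of the kind of case this targets, assuming a current lxml install. The namespace URI and the element names are made up for illustration; only the etree calls are real API.

```python
from lxml import etree

NS = "http://example.com/ns"  # hypothetical namespace URI

# Target tree: the namespace is already declared on its root.
target = etree.fromstring('<a:root xmlns:a="%s"/>' % NS)

# Separate tree whose child element uses the same namespace.
source = etree.fromstring('<a:other xmlns:a="%s"><a:child/></a:other>' % NS)
child = source[0]

# Appending moves the element out of the source tree and into the target tree.
target.append(child)

serialized = etree.tostring(target).decode()
print(serialized)
```

If the cleanup works as described, the moved child should reuse the declaration on the target root, so the namespace URI should show up only once in the serialized output.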
Have fun, Stefan

Hi again,
Stefan Behnel wrote:
> I finally found the time to take a second look at the nscleanup branch. [...] I also expect it to be faster than the previous version - although I haven't done the benchmarks yet to prove it.
I did some now. It looks like most benchmarks for objectify get faster compared to 1.2, between 5% and 30% on my machine. That's because objectify suffers a lot from document merging, as assigning elements to other elements' attributes does exactly that. Note that 1.2 is somewhat slower than 1.1.2 in a couple of places. In total, the new version is more or less as fast as 1.1.2 was, sometimes faster, sometimes slower.
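For readers less familiar with objectify, a small sketch of the operation meant here, assuming a current lxml install. The document contents are invented for illustration; the point is only that a plain attribute assignment pulls an element from one document into another.

```python
from lxml import objectify

# Two independent documents (contents are made up for illustration).
root = objectify.fromstring("<root/>")
other = objectify.fromstring("<doc><item>42</item></doc>")

# Assigning an element from another document to an attribute of root
# places it into root's tree, which is where the merging cost comes from.
root.item = other.item

print(root.item.text)
```

Code that builds trees this way does one such merge per assignment, which is presumably why objectify-heavy benchmarks are the ones that react most to the change.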
The etree benchmark results are less interesting. I just ran the document merging benchmarks and there is not much of a difference to see here. The results are all rather close across the three versions.
Another thing that surprised me: it doesn't seem to make that big a difference whether threading support is compiled in or not. Some benchmarks get faster if it is disabled (meaning: no locking etc.), but most of them stay about the same. So, this can make a difference in certain situations, but it's not enough to consider disabling it by default or something.
While I was at it, I also added a few more checks for the migrated namespace references. The redundant ones are now freed when moving elements between documents. I can't tell if this was the case before (I believe they were just kept on the copied element), but it definitely works now.
So, I'm quite happy with the results so far. There may still be some room left for optimisations, but it's not too urgent, it seems. And namespace handling definitely has much better semantics now.
Have fun, Stefan

Hey Stefan
On 2007-02-25 10:38:47 +0100, Stefan Behnel stefan_ml@behnel.de said:
> Stefan Behnel wrote:
>> I finally found the time to take a second look at the nscleanup branch. [...] I also expect it to be faster than the previous version - although I haven't done the benchmarks yet to prove it.
> I did some now. It looks like most benchmarks for objectify get faster compared to 1.2, between 5% and 30% on my machine. That's because objectify suffers a lot from document merging, as assigning elements to other elements' attributes does exactly that. Note that 1.2 is somewhat slower than 1.1.2 in a couple of places. In total, the new version is more or less as fast as 1.1.2 was, sometimes faster, sometimes slower.
> [...]
> So, I'm quite happy with the results so far. There may still be some room left for optimisations, but it's not too urgent, it seems. And namespace handling definitely has much better semantics now.
Wow, that was quick :)
Thanks for the integration. My case works like a charm now (makeelement and append or insert).
*anxiously waiting for the release* :)
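A rough sketch of the makeelement-plus-insert case mentioned above, assuming a current lxml install. The namespace URI, prefix and tag names are hypothetical and not taken from the actual application.

```python
from lxml import etree

NS = "http://example.com/ns"  # hypothetical namespace URI

root = etree.fromstring('<p:root xmlns:p="%s"><p:first/></p:root>' % NS)

# makeelement() builds a detached element that carries its own declaration
# of the namespace until it is placed into a tree.
new = root.makeelement("{%s}item" % NS, nsmap={"p": NS})

# Inserting it under root makes that declaration redundant.
root.insert(0, new)

serialized = etree.tostring(root).decode()
print(serialized)
```

With the namespace cleanup, the inserted element should pick up the declaration from the root instead of repeating it.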
participants (2): Christian Zagrodnick, Stefan Behnel