Hi there,
Inspired by discussions with Vic and through browsing the vlibxml2 code,
I've implemented a bit of memory management functionality which, after a
lot of manual debugging, seems to be doing the right thing. So far...
A new addition to the lxml trunk is nodereg, and associated testing
stuff (noderegtest.pyx and test_nodereg.py). nodereg is a system for
registering Python-level node proxies, plus some base classes for the
document and node objects in a typical libxml2 tree wrapper.
The nodereg module functionality can be used to make sure that memory
(in particular libxml2 tree nodes) gets collected when it is possible,
and not before. :) This sounds easy, but it is surprisingly tricky.
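To give a rough idea of what "registering proxies" means, here is a minimal pure-Python sketch. It is not the actual nodereg code: the names NodeRegistry, get_proxy, has_proxy and Proxy are made up for illustration, and an integer stands in for the address of a libxml2 node. The registry hands out one proxy per node and only holds weak references, so a proxy drops out as soon as nothing else uses it.

```python
import weakref


class Proxy:
    """Hypothetical Python-level proxy for a C-level tree node."""

    def __init__(self, node_id):
        # node_id stands in for the address of the underlying C node.
        self.node_id = node_id


class NodeRegistry:
    """Hypothetical registry of live proxies, keyed by node identity."""

    def __init__(self):
        # Weak values: the registry itself never keeps a proxy alive.
        self._proxies = weakref.WeakValueDictionary()

    def get_proxy(self, node_id, factory=Proxy):
        """Return the existing proxy for node_id, or create and register one."""
        proxy = self._proxies.get(node_id)
        if proxy is None:
            proxy = factory(node_id)
            self._proxies[node_id] = proxy
        return proxy

    def has_proxy(self, node_id):
        """True while some Python code still holds a proxy for node_id."""
        return node_id in self._proxies
```

As long as has_proxy() is true for a node, the wrapper must not free it. Once the last proxy is gone the entry disappears by itself, and the underlying node becomes a candidate for freeing.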
Next:
* Look into hooking in libxml2's memory debugging functionality for
  testing. Investigate Vic's code in that area/get Vic's advice.
* Start rewriting etree, dom, or vlibxml2 to use nodereg. This will
  likely further evolve nodereg.
* Add more functionality to nodereg. One thing that currently is not
  handled is attribute nodes, for instance.
* Optimize nodereg. The strategy currently employed requires, in the
  worst case, a lot of full-tree walks to determine whether a node in the
  tree can be safely garbage collected. We need to come up with some
  smart algorithm/data structure to avoid this having to happen too often.
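To illustrate why the walks get expensive, here is a small sketch of the kind of check involved. This is an assumption about the general shape of the problem, not the nodereg implementation: Node, iter_subtree and can_free are hypothetical names, and a plain set of names stands in for the registry of live proxies. A subtree can only be freed if no node anywhere inside it still has a proxy, which in the worst case means visiting every node.

```python
class Node:
    """Hypothetical stand-in for a C-level tree node."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def iter_subtree(node):
    """Yield node and all of its descendants (iterative depth-first walk)."""
    stack = [node]
    while stack:
        current = stack.pop()
        yield current
        stack.extend(current.children)


def can_free(root, live):
    """True if no node in the subtree under root has a live proxy.

    live is a set of node names that still have a Python proxy.
    The cost is linear in the size of the subtree on every call.
    """
    return not any(n.name in live for n in iter_subtree(root))
```

Every time a proxy goes away, a check like this may have to run again, which is where a smarter data structure (for instance, per-subtree counts of live proxies) could help.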
Another thing I would really like to do is investigate adding weakref
support to Pyrex. Right now I first had to jump through a bit of a hoop
to make it work. Later on I spent a long time debugging an obscure
case where a refcount remained on an object even though the only thing
still pointing to it was a WeakValueDictionary. I finally traced the
extra reference to Pyrex. I'm not clear why, but somehow the base class
got involved (which was not weak-referenceable as defined by Pyrex).
This managed to trick the object into keeping a reference when it
shouldn't, so it was never deallocated.
Being able to just say 'this class can be weakreferenced' in Pyrex
should make this go away.
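As a plain-Python analogy only (this is not Pyrex code, and the class names are made up): a class that defines __slots__ has no room for weak references unless it asks for it explicitly, which is roughly the kind of opt-in declaration I'd like extension types to have.

```python
import weakref


class Plain:
    # No '__weakref__' slot: instances cannot be weakly referenced.
    __slots__ = ("value",)


class Referenceable:
    # Listing '__weakref__' opts the class in to weak references.
    __slots__ = ("value", "__weakref__")


def is_weakreferenceable(obj):
    """Return True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True
```

Here is_weakreferenceable(Plain()) is false and is_weakreferenceable(Referenceable()) is true, so only the second class can be stored in a WeakValueDictionary.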
Regards,
Martijn