[lxml-dev] memory management strategies
Hmm... I've got too many email addresses and the lxml mailing list held up my last email.
Let's try that again.
I've released the first embryonic version of the libxml2 wrapper that I've been working on. I'm not feeling very witty so I called it vlibxml2.
Short blurb: http://www.crankycoder.com/archives/2004/10/vlibxml2_first.html
The tarball: http://www.crankycoder.com/archives/vlibxml2-0.0.1.tar.gz
There's a bunch of stuff that still needs work, but in general, I think the code is not so embarrassing that I have to hide it anymore.
I've tried to maintain API compatibility with libxml2 - with the exception that I do garbage collection for you. I chose not to use ElementTree as the baseline, mainly because I don't want to have to think too much as I add in more features from libxml2's C API.
Adding abstractions like ElementTree on top of a properly garbage collected libxml2 should be easy anyway - I'm more concerned with having a complete XML feature set.
How to use it:
Easy - it's basically a really really stripped down libxml2.
    node = vlibxml2.newNode('foo')
    node.setProp('bar', '123')
    batz = vlibxml2.newNode('batz')
    node.addChild(batz)

    assert node.children() == batz
    assert batz.parent() == node
    assert node.prop('bar') == '123'
    assert node.xpathEval('//batz') == batz
That's pretty much everything for now.
Stuff that will get fixed in the immediate future (in order):
- get a proper memory leak detector into vlibxml2. I'm planning on porting the libxml2 memory leak detector over. Currently, in DEBUG mode, I just assert that the garbage collector never frees the same pointer twice. This can give a false positive if you free a node, allocate a new node and then free the new node, and the new node happens to have the same address as the old one. Anyway - I really want to fix it as it's the only way I'll really feel comfortable with the memory management.
- currently all xmlNode objects belong to some xmlDoc object. Creating a new xmlNode object implicitly creates a root xmlDoc object. This is not the same behavior as libxml2, so I should probably change it. Plus it's a double malloc/free, which is never a really good thing.
- add namespace support. I want to be able to generate and parse SOAP, XML-RPC and RSS/Atom. That means I need to support xmlNamespace stuff.
- add xmlReader support. I actually have a real need to parse XML files of ~100MB. I'd like to be able to handle close to 500MB of data if possible. It should at least be possible to do something like that - and that means having an alternative to the DOM way of doing things. Getting this stuff to work may be finicky as it probably involves playing with the global interpreter lock, since libxml2 will need to make callbacks to Pyrex code. Luckily - there's been some example code posted to the Pyrex mailing list recently so I'm not too intimidated by this.
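To make the false positive in the double-free check from the first item concrete, here is a minimal sketch in plain Python. It is not vlibxml2 code: FreeTracker, on_alloc, on_free and flags_error are hypothetical names, and addresses are just integers. It assumes the wrapper can observe every allocation as well as every free. Under that assumption, forgetting an address when the allocator hands it out again lets the reuse case pass, while a genuine double free still trips the assertion.

```python
class FreeTracker:
    """Debug-only record of addresses that have been freed (hypothetical)."""

    def __init__(self, track_allocs):
        # track_allocs=False mimics a check that only looks at frees.
        self.track_allocs = track_allocs
        self.freed = set()

    def on_alloc(self, addr):
        # An allocator may legitimately reuse an address, so a fresh
        # allocation clears any earlier "freed" record for it.
        if self.track_allocs:
            self.freed.discard(addr)

    def on_free(self, addr):
        if addr in self.freed:
            raise AssertionError("pointer 0x%x freed twice" % addr)
        self.freed.add(addr)


def flags_error(tracker, events):
    """Replay (kind, addr) events; return True if the tracker asserts."""
    try:
        for kind, addr in events:
            if kind == "alloc":
                tracker.on_alloc(addr)
            else:
                tracker.on_free(addr)
    except AssertionError:
        return True
    return False


# Free a node, allocate a new node at the same address, free the new node.
REUSE = [("alloc", 0x1000), ("free", 0x1000),
         ("alloc", 0x1000), ("free", 0x1000)]

# A genuine bug: the same node is freed twice with no allocation in between.
DOUBLE_FREE = [("alloc", 0x1000), ("free", 0x1000), ("free", 0x1000)]
```

Replaying REUSE through FreeTracker(track_allocs=False) reports an error even though nothing is wrong, which is the false positive described above. With track_allocs=True the same sequence passes, and DOUBLE_FREE is still caught either way.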
==The sort of bad news==
This version is being released under the GPL. I'm not-so-slowly turning into a free software nerd from an open source nerd. This may make some people mad - I'm open to changing the license, but for now I'm most comfortable with the GPL. If I hear a good argument, I'd switch to the PSF Python license, but for now - it's GPL.
I've decided to keep all the code in my own SVN repository - I need to sort out some security issues before I start exposing the repository to other people. I've already got Trac installed (http://trac.edgewall.com) for bug tracking and wiki documentation. Again - it's just a matter of setting up security and splitting up projects in Trac, but it's coming.
--- Don't be humble ... you're not that great. -- Golda Meir
On 15-Oct-04, at 10:09 AM, Fred Drake wrote:
On Fri, 15 Oct 2004 04:14:07 +0200, Martijn Faassen firstname.lastname@example.org wrote:
If you want checkin rights for lxml, check with Philipp (see mail to Victor).
Got those already, just haven't had any time to work on them.
-- Fred L. Drake, Jr. <fdrake at gmail.com> Zope Corporation