Hi everybody,
I'm trying to figure out strategies to manage memory automatically in
lxml in the context of libxml2 trees. My thinking so far, which is still
incomplete, and a description of the problem are written up in this document:
http://codespeak.net/svn/lxml/trunk/doc/memorymanagement.txt
It's quite possible that I'm making things more complicated than
necessary, as Vic said. I'm quite curious to hear Vic's and other
people's thinking about this.
The basic concept is that libxml2 maintains a tree representing XML.
This tree can be manipulated by a host of C functions. lxml wraps this
tree in two ways now. The wrapping is done by proxies written in Pyrex
that typically stand in for a node in the libxml2 tree and expose
information on it.
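To make the problem concrete, here is a rough pure-Python toy of what can go wrong. It is only a sketch: `CNode`, `free_tree` and `ElementProxy` are hypothetical names invented for illustration, not libxml2 or lxml code. In real C the stale access would be a read of freed memory and likely a crash, not a tidy exception.

```python
class CNode:
    """Toy stand-in for a node in the C tree (hypothetical, not the real struct)."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.freed = False  # simulates the memory having been released


def free_tree(node):
    """Mimic a C-level call that frees a whole subtree in one go."""
    for child in node.children:
        free_tree(child)
    node.freed = True


class ElementProxy:
    """Toy proxy that stands in for one node and exposes information on it."""

    def __init__(self, c_node):
        self._c_node = c_node

    @property
    def tag(self):
        # In C there is no such check; this only makes the failure visible.
        if self._c_node.freed:
            raise RuntimeError("proxy points at a freed node")
        return self._c_node.name


root = CNode("root")
child = CNode("child")
root.children.append(child)

proxy = ElementProxy(child)
tag_before = proxy.tag  # fine while the tree is alive

free_tree(root)  # the tree goes away, but the proxy is still around

try:
    proxy.tag
    dangling = False
except RuntimeError:
    dangling = True  # the proxy outlived the node it wraps
```

The question is who decides when `free_tree` may run: the tree does not know about its proxies, and the proxies can be kept alive by Python code for as long as it likes.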
One wrapper, the one I started first, aims to provide an ElementTree API
to the libxml2 tree. ElementTree is a pythonic and lightweight way to
manipulate XML written by Fredrik Lundh, and lxml just tries to mimic
that API as much as possible. The wrapping is currently incomplete,
mostly due to the thorny memory management issues.
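For those who haven't seen the ElementTree API, a small sketch of its flavour follows. It assumes the `xml.etree.ElementTree` module as shipped in the Python standard library, not lxml itself, so it only shows the API lxml tries to mimic.

```python
import xml.etree.ElementTree as ET

# Build a tiny tree: elements behave much like lists of their children.
root = ET.Element("root")
child = ET.SubElement(root, "child")
child.text = "hello"
child.set("id", "1")

# Iterating over an element yields its direct children.
tags = [el.tag for el in root]

# Serialize the tree back to XML (bytes by default).
xml_bytes = ET.tostring(root)
```

Every `Element` handed out here is the kind of object that, in lxml, has to be a proxy for a node owned by libxml2.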
The other wrapper, which I started later, provides a classical W3C DOM
API. It's pretty good in the read-only department already, but as soon
as I started considering a writable DOM the memory management issues came
up again in full force.
Help, please!
Regards,
Martijn