We are writing an ETL tool that uses lxml to parse a large number of XML
files. The problem we are having is that lxml uses a considerable amount of
memory, which it doesn't release. I've already disabled caching of IDs. I've read
in the archives of this list that lxml also caches a lot of other strings.
Caching by itself isn't a problem, but the cache isn't cleared once lxml is
"done", and that is a problem: after the XML files have been imported, the
other stages in the ETL pipeline also need memory.
Is there a way to clear this cache from client code?
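
For reference, here is a minimal sketch of the kind of parser setup I mean by
disabling ID caching. It is not our actual code: the function name
parse_without_id_cache and the sample document are made up for illustration,
and it assumes an lxml version whose XMLParser accepts the collect_ids option.

```python
from io import BytesIO

from lxml import etree


def parse_without_id_cache(source):
    """Parse an XML source with lxml's ID collection switched off.

    Hypothetical helper for illustration only.
    """
    # collect_ids=False asks lxml not to build its hash table of XML IDs
    # while parsing. Other internal string caching is not affected by this.
    parser = etree.XMLParser(collect_ids=False)
    return etree.parse(source, parser)


# Made-up sample input standing in for one of the imported files.
sample = BytesIO(b"<rows><row id='a1'>x</row></rows>")
tree = parse_without_id_cache(sample)
root_tag = tree.getroot().tag
row_count = len(tree.getroot())
```

Even with a parser configured like this, the memory is not returned once the
trees have been discarded, which is what prompts the question above.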