
Greetings,

I've been using 2.0 for a while, and today I decided to upgrade to the most recent release, 2.0.7. I ran into a problem and, by binary search (based on the change log) :), found that it first appears in 2.0.5. It involves the local file DTD resolver.

This issue originates in http://article.gmane.org/gmane.comp.python.lxml.devel/3499

I have to load the DTD in some specific cases for parsing. Even if I load it from local disk and cache it, parsing takes up to 10 times longer (40ms instead of 4ms). So I came up with the following (ugly) solution:

    import StringIO
    from lxml import etree

    class LocalDTDResolver(etree.Resolver):
        def __init__(self, conf):
            self.conf = conf
            self.cached = None

        def resolve(self, url, id, context):
            # resolve the local DTD once and reuse the result afterwards
            if not self.cached:
                self.cached = self.resolve_filename(
                    self.conf + '/vxml.dtd', context)
            return self.cached

    class LxmlUser(...):
        # just the relevant snippets
        def __init__(...):
            self.xmlParser = etree.XMLParser(
                no_network=True, resolve_entities=False, load_dtd=False)
            self.resolvingParser = etree.XMLParser(
                no_network=False, resolve_entities=False, load_dtd=True)
            self.resolvingParser.resolvers.add(LocalDTDResolver(local_path))

        def call_parser(self, replies):
            for data in replies:
                if need_resolve:
                    parser = self.resolvingParser
                else:
                    parser = self.xmlParser
                xmlres = etree.parse(StringIO.StringIO(data), parser)

Systems are FreeBSD 6.2/7.0, with:

    lxml.etree:      (2, 0, 5, 0)
    libxml used:     (2, 6, 30)
    libxml compiled: (2, 6, 30)
    libxslt used:     (1, 1, 22)
    libxslt compiled: (1, 1, 22)

This code runs within mod_python3/apache2.2.8.

Before 2.0.5 I had no problem when the resolvingParser was called. Since 2.0.5 I get this:

    # no call of resolving parser
    [root@machine ~/trunk/fb-ports/py-lxml]$ sysctl kern.openfiles
    kern.openfiles: 377

    # after a single (!) call of resolving parser
    [root@machine ~/trunk/fb-ports/py-lxml]$ sysctl kern.openfiles
    kern.openfiles: 11439

My local DTD file is opened about 11000 times (according to fstat and find -inode).
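
The kern.openfiles figure above is system-wide. To check whether the descriptors pile up inside the parsing process itself, a per-process count could be taken around the parse call. This is only a rough sketch: `count_open_fds` is a hypothetical helper (not part of lxml), and it assumes a per-process descriptor directory is available, such as /proc/self/fd on Linux or /dev/fd with fdescfs mounted on FreeBSD.

```python
import os

def count_open_fds():
    """Return the number of file descriptors open in this process.

    Assumption: a per-process descriptor directory exists
    (/proc/self/fd on Linux, /dev/fd with fdescfs on FreeBSD).
    """
    for fd_dir in ("/proc/self/fd", "/dev/fd"):
        if os.path.isdir(fd_dir):
            # Listing the directory may briefly add a descriptor of its own;
            # that offset is the same on every call, so differences between
            # two counts remain meaningful.
            return len(os.listdir(fd_dir))
    raise OSError("no per-process descriptor directory found")
```

Calling it right before and right after `etree.parse(...)` with the resolving parser and comparing the two values would show whether the extra descriptors belong to this process.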

Am I doing something wrong by coding it this way, or is it a bug?

Cheers,
Dmitri