Re: [lxml-dev] etree.parse hangs with a lot of parallel requests
Hi! Speaking again about the issue with DTD loading, parsing etc.
It seems you only want to parse DTDs locally from disc, so setting "no_network=True" (which is the default in lxml 2.0) should prevent any accidental remote access.
Eventually it turned out that I'm working fine without DTD. So, setting no_network = True and load_dtd = False really solved the problem.
Hmm, do you really need to turn off DTD loading, or is disabling network access enough? I wouldn't expect loading the DTD from the disk cache to take that much time (although, if you can live without it and time is really critical, then it's obviously better to save that bit of time as well).
I was wrong - I do need the DTD to resolve entities correctly. Sometimes the input contains HTML entities and the like. My DTD includes all the required entities, but it is referenced by URL. And the only way to deal with such an entity is to load the DTD, isn't it? Which options do I have besides switching the URL to a local path in the SYSTEM definition? Setting up a DTD catalog on every machine that runs the application? The ideal option would be to tell the parser "load the given DTD from a given location (i.e. disk) and use it from now on for parsing all incoming data", but is that possible? Cheers, Dmitri
Hi, Dmitri Fedoruk wrote:
Which options do I have besides switching the URL to a local path in the SYSTEM definition? Setting up a DTD catalog on every machine that runs the application? The ideal option would be to tell the parser "load the given DTD from a given location (i.e. disk) and use it from now on for parsing all incoming data", but is that possible?
You can use a custom resolver and cache the DTD (by its URL) once it's loaded. http://codespeak.net/lxml/resolvers.html Stefan
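For illustration, a custom resolver along those lines might look roughly like the sketch below. It only assumes lxml's documented `etree.Resolver` interface (`resolve()`, `resolve_string()`, `parser.resolvers.add()`); the class name `CachingDTDResolver`, the URL in `DTD_URL`, and the stand-in DTD written to a temporary file are all hypothetical and not taken from the original application.

```python
import os
import tempfile

from lxml import etree

# Hypothetical URL standing in for the one in the documents' DOCTYPE.
DTD_URL = "http://example.com/dtd/entities.dtd"

# Stand-in DTD on local disk (assumption: the real DTD is much larger).
dtd_path = os.path.join(tempfile.mkdtemp(), "entities.dtd")
with open(dtd_path, "w", encoding="utf-8") as f:
    f.write('<!ENTITY nbsp "&#160;">')


class CachingDTDResolver(etree.Resolver):
    """Read each mapped DTD from disk once, then serve it from memory."""

    def __init__(self, local_paths):
        super().__init__()
        self._local_paths = local_paths  # system URL -> local file path
        self._cache = {}                 # system URL -> DTD text

    def resolve(self, system_url, public_id, context):
        if system_url not in self._local_paths:
            return None  # not ours: let lxml fall back to its defaults
        if system_url not in self._cache:
            # First request for this URL: load the DTD from disk.
            with open(self._local_paths[system_url], encoding="utf-8") as f:
                self._cache[system_url] = f.read()
        return self.resolve_string(self._cache[system_url], context)


resolver = CachingDTDResolver({DTD_URL: dtd_path})
parser = etree.XMLParser(load_dtd=True, no_network=True)
parser.resolvers.add(resolver)

doc = '<!DOCTYPE root SYSTEM "%s"><root>a&nbsp;b</root>' % DTD_URL
root = etree.fromstring(doc, parser)       # DTD is read from disk here
root_again = etree.fromstring(doc, parser)  # DTD comes from the cache
```

The parser object can be reused for all incoming documents, so the DTD file is read once per process rather than once per request. Note that libxml2 still parses the DTD text for every document; only the loading step is cached in this sketch.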
participants (2)
- Dmitri Fedoruk
- Stefan Behnel