[lxml-dev] Threading branch merged into trunk

Hello everyone,

I managed to fix a couple of bugs that remained in the threading branch and merged it into the trunk. As it looks now, thread support will become an official feature in 1.1.

Parsers now use locks to prevent concurrent access, which means that access to a single parser is completely serialised. If you want concurrency, you can either use the default parser (which is now cloned for each thread on first request) or simply copy() the parser of your choice to use it in a thread. Note that calls to XML() and HTML() each always use the same parser, so these are non-threaded.

According to the libxml2 docs, validation is supposed to be thread-safe, so you should be able to share RelaxNG and XMLSchema objects between threads. I did not test this, though. Note also that the error_log property is not currently thread-aware, so error reporting may not work as expected.

I will also take a look at XSLT, which is said to be thread-safe in the docs. However, the current implementation does not allow this, as it holds further state within the XSLT objects. I'll have to fix that somehow.

There are currently some things to remember: sharing generated trees between threads should be ok, but you should avoid copying elements between trees that were generated in different threads. This is error-prone, as part of the data (the parser dictionary) remains in the originating thread, so if the elements are removed from the document and the originating thread is terminated, this data may be freed without warning!

It should be safe to do the parse-generate-serialise cycle inside a thread, and it should be safe to set up trees, validators, etc. in the main program and use them in threads. However, things like creating a new thread for each parse process, terminating it when parsing is done and then starting to use the generated tree in other threads are generally A BAD IDEA.
We could fix that by not using parser dictionaries in threads, but that would reduce the parser performance. I'll just write up a doc/threading.txt (and also a FAQ entry) to clearly mark this kind of stuff as a no-no.

If things work out well, there will be an alpha release, uhm, in the not so far future. :)

Stefan
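
For illustration, here is a minimal sketch of the pattern described above as safe: set up a parser in the main program, give each thread its own copy(), and keep the whole parse-generate-serialise cycle inside that thread. The names base_parser, parse_and_serialise and run, the remove_blank_text option and the "worker" attribute are made up for this example, and it assumes an lxml version where parser objects provide copy(). Only serialised bytes leave the worker threads, so no tree outlives the thread that parsed it.

```python
import threading
from lxml import etree

# One configured parser, set up once in the main program.
# (remove_blank_text is just an example option, not required.)
base_parser = etree.XMLParser(remove_blank_text=True)


def parse_and_serialise(xml_text, results, index):
    """Run the full parse-generate-serialise cycle inside one thread."""
    # Each thread uses its own copy, so threads do not queue up
    # on the lock of a single shared parser.
    parser = base_parser.copy()
    root = etree.fromstring(xml_text, parser)
    root.set("worker", str(index))  # "generate": modify the tree
    # Hand back bytes only; the tree itself stays in this thread.
    results[index] = etree.tostring(root)


def run(documents):
    """Process each document in its own thread and collect the output."""
    results = [None] * len(documents)
    threads = [
        threading.Thread(target=parse_and_serialise, args=(doc, results, i))
        for i, doc in enumerate(documents)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Calling run(["<a><b/></a>", "<c/>"]) returns one serialised document per input. Passing the parsed trees themselves out of short-lived threads would be the pattern warned against above.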
participants (1)
-
Stefan Behnel