
I am trying to port an application to Windows. It works with no problems on Linux. On Windows, lxml parses a large file with no problems on the main thread. However, I get a segfault when it parses the same file from a spawned thread. Has anyone seen the same thing?

Python 2.4.2
lxml 1.0 beta, from the binary built on May 18

Let me know if there is anything else I can provide.

Thanks,
Scott Haeger

Hi Scott,

Scott Haeger wrote:
> I am trying to port an application to Windows. It works with no problems on Linux. On Windows, lxml parses a large file with no problems on the main thread. However, I am getting a segfault when it parses the same file from a spawned thread. Has anyone seen the same thing?
Threading has never been officially tested. However, we would love to see it work, so your feedback is much appreciated.

Note that a few functions and classes in lxml explicitly document what you must not do within threads (that much we do know). One example is that you must not use the default parser from different threads, which may already be the problem in your case: you must create an independent parser for each thread. This restriction exists mainly for performance reasons. Note that you can .copy() a parser to keep its initial configuration.

It's interesting that you do not have the same problem under Linux, though...
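To illustrate the "one independent parser per thread" advice, here is a minimal sketch. It is an assumption about how the application might be structured, not code from the original report: the names XML, parse_in_thread and run are hypothetical, and it is written for a current Python rather than 2.4. The import falls back to the standard library's ElementTree only so that the sketch runs where lxml is not installed.

```python
import threading

try:
    from lxml import etree  # the library discussed in this thread
except ImportError:
    # Fallback so the sketch still runs without lxml installed.
    import xml.etree.ElementTree as etree

# Hypothetical short document standing in for the large file.
XML = "<root><item>1</item><item>2</item></root>"


def parse_in_thread(results, index):
    # Each thread creates its own parser instead of relying on a shared one.
    parser = etree.XMLParser()
    root = etree.fromstring(XML, parser)
    # Record the number of child elements found by this thread.
    results[index] = len(root)


def run(num_threads=4):
    # One result slot per thread; each thread writes only to its own slot.
    results = [None] * num_threads
    threads = [
        threading.Thread(target=parse_in_thread, args=(results, i))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With lxml, a parser that has already been configured could instead be duplicated with its .copy() method, as mentioned above, so that every thread starts from the same settings.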
> Python 2.4.2
> lxml 1.0 beta from binary built on May 18
>
> Let me know if there is anything else I can provide.
It would be great if you could come up with a short code snippet that shows the problem. Most likely, the problem is not the size of the file but the threading itself, so you can try replacing the file parsing with a short XML string.

We do not currently have test cases for threading, and it would be very helpful to have a number of tests that make sure threading works out of the box.

Stefan
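A reproduction of the kind requested might look like the sketch below. It is only an assumption about what such a test could contain: the class name ThreadedParseTest and the helper run_tests are hypothetical, and the code targets a current Python rather than 2.4. It parses a short XML string with the default parser from a spawned thread. A real segfault would terminate the whole process instead of being reported as a test failure, which is itself a usable signal. The import falls back to the standard library only so the sketch runs without lxml.

```python
import threading
import unittest

try:
    from lxml import etree  # the library discussed in this thread
except ImportError:
    # Fallback so the sketch still runs without lxml installed.
    import xml.etree.ElementTree as etree


class ThreadedParseTest(unittest.TestCase):
    def test_parse_short_string_in_spawned_thread(self):
        outcome = {}

        def worker():
            # Parse with the default parser, as the failing application did.
            try:
                outcome["tag"] = etree.fromstring("<root><a/></root>").tag
            except Exception as exc:
                # Carry Python-level errors back to the main thread.
                outcome["error"] = exc

        thread = threading.Thread(target=worker)
        thread.start()
        thread.join()

        self.assertNotIn("error", outcome)
        self.assertEqual(outcome.get("tag"), "root")


def run_tests():
    # Run the test case programmatically and return the result object.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ThreadedParseTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_tests() on the affected Windows build would show whether a short string is enough to trigger the crash, which would confirm that file size is not the cause.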
participants (2): Scott Haeger, Stefan Behnel