I have developed a web server based on twisted-web that acts as a gateway between the SRU search protocol and the Z39.50 search protocol. (Details at http://www.loc.gov/z3950/agency/)

Each SRU search request opens a thread which performs a Z39.50 search on a remote target and then returns the results, which are converted into the SRU response.

Sometimes the connection with the client issuing the SRU request is lost. In a few cases this produces the message "Connection to the other side was lost in a non-clean fashion: Connection lost." When several of these non-clean connection losses occur, the web server becomes unresponsive and does not respond to any new requests.

Can anyone help with how I should respond to these non-clean lost connections? I can't find any documentation which helps me understand the impact of this type of error. As the connection is lost there is no point in returning any error response, but are there system resources left allocated which might be causing the server to become unresponsive?

Bill
--
Bill Oldroyd
Technical consultant for The European Library
On 11:37 am, billoldroyd@gmail.com wrote:
I have developed a web server based on twisted-web that acts as a gateway between the SRU search protocol and the Z39.50 search protocol. ( Details at http://www.loc.gov/z3950/agency/ )
Cool. I dabbled with Z39.50 very briefly a while ago. Any chance you'll develop a Twisted-based implementation of this protocol? :)
Each SRU search request opens a thread which performs a Z39.50 search on a remote target and then returns the results which are converted into the SRU response.
Sometimes the connection with the client issuing the SRU request is lost. In a few cases this produces the message "Connection to the other side was lost in a non-clean fashion: Connection lost." When several of these non-clean connection losses occur the web server will become unresponsive and will not respond to any new requests.
I think it's unusual that the server would become unresponsive after several lost client connections. Are you using deferToThread to do the Z39.50 searches? If so, then the task should complete (or not) regardless of what happens to the web connection. deferToThread uses a thread pool, though. Once all the threads in the pool are in use, further jobs won't even be started until one of the threads becomes available. Perhaps some searches take a long, long time? This would tie up thread pool resources, preventing new jobs from being processed, and perhaps correlate somewhat with lost client connections (which give up after waiting for such a long time for their results).
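The pool starvation described above can be reproduced in isolation. The following is a minimal sketch that uses the standard library's `concurrent.futures.ThreadPoolExecutor` as a stand-in for the Twisted reactor's thread pool, so it shows the general behaviour of a bounded pool rather than Twisted's own implementation. The names `slow_search` and `quick_search` are hypothetical placeholders for Z39.50 searches.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def demo_pool_starvation(pool_size=2):
    """Fill a bounded pool with stuck jobs and see whether a new job starts."""
    release = threading.Event()
    slow_started = [threading.Event() for _ in range(pool_size)]
    quick_started = threading.Event()

    def slow_search(i):
        # Hypothetical search that hangs until the remote target answers.
        slow_started[i].set()
        release.wait()
        return "slow-%d" % i

    def quick_search():
        # Hypothetical search that would finish immediately if it could start.
        quick_started.set()
        return "quick"

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        for i in range(pool_size):
            pool.submit(slow_search, i)
        for started in slow_started:
            started.wait()  # every worker thread is now occupied

        quick = pool.submit(quick_search)
        # The new job sits in the queue because no worker is free.
        started_while_blocked = quick_started.wait(timeout=0.2)

        release.set()  # the slow searches finally complete
        result = quick.result(timeout=5)

    return started_while_blocked, result


if __name__ == "__main__":
    print(demo_pool_starvation())
```

If this is what is happening in the gateway, possible mitigations are a timeout on the Z39.50 search itself, or a larger pool via `reactor.suggestThreadPoolSize()`. A larger pool only postpones the problem if searches never return.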
Can anyone help with how I should respond to these non-clean lost connections? I can't find any documentation which helps me understand the impact of this type of error. As the connection is lost there is no point in returning any error response, but are there system resources left allocated which might be causing the server to become unresponsive?
I wrote about how to do this: http://jcalderone.livejournal.com/50890.html

So, regardless of the thread pool stuff I talked about above, that should help you to at least handle the connection lost errors and clean up other resources.

Jean-Paul
participants (2)
- Bill Oldroyd
- exarkun@twistedmatrix.com