Threading problem when many sockets open

Philip Zigoris philip at corp.spock.com
Sun Aug 12 00:32:30 CEST 2007


Hi all,

I have written a socket-based service in Python, and under fairly heavy
traffic it performs really well.  But I have encountered the following
problem: when the system runs out of file descriptors, it seems to
stop switching control between threads.

Here is some more detail:

The system has n+2 threads, where n is usually around 10.  This was
implemented using the 'threading' and 'socket' modules in the Python
2.5 standard library.

-- The "master" thread accepts new socket connections and then
enqueues the connection on a Queue (from the standard library).

-- There are n "handler threads" that pop a connection off of the
queue, read some number of bytes (~10), do some processing and then
send ~100 bytes back over the connection and close it.

-- The last thread is just a logging thread that has a queue of
messages that it writes to either stdout or a file.
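
A minimal sketch of the thread layout described above, to make the
moving parts concrete.  It is written for Python 3 (the 'queue' module
was named 'Queue' in 2.5), and every name in it (start, master,
handler, logger, N_HANDLERS) is hypothetical, as is the placeholder
"processing" step.  It is not the original service's code.

```python
import queue
import socket
import threading

N_HANDLERS = 10  # "n" in the description; the exact value is an assumption

conn_queue = queue.Queue()  # accepted connections waiting for a handler
log_queue = queue.Queue()   # messages waiting for the logging thread


def logger():
    # Logging thread: drain log_queue and write each message to stdout.
    while True:
        print(log_queue.get())


def handler():
    # Handler thread: read a short request, reply, close the connection.
    while True:
        conn = conn_queue.get()
        try:
            request = conn.recv(10)                    # ~10 bytes in
            reply = request.upper().ljust(100, b".")   # placeholder processing
            conn.sendall(reply)                        # ~100 bytes out
        except OSError as exc:
            log_queue.put("broken connection: %s" % exc)
        finally:
            conn.close()


def master(listener):
    # Master thread: accept connections and hand them to the handlers.
    while True:
        conn, _addr = listener.accept()
        conn_queue.put(conn)


def start(host="127.0.0.1", port=0):
    # Start the master, the n handlers and the logger; return (host, port).
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, port))
    listener.listen(128)
    threading.Thread(target=logger, daemon=True).start()
    for _ in range(N_HANDLERS):
        threading.Thread(target=handler, daemon=True).start()
    threading.Thread(target=master, args=(listener,), daemon=True).start()
    return listener.getsockname()
```

Note that every connection sitting in conn_queue holds one open file
descriptor until a handler closes it, so an unbounded queue can consume
the whole descriptor budget of the process.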

Under pretty substantial load, the processing is quick enough that
connections do not pile up very quickly.  But in some cases they do.
And what I found was that as soon as the size of the Queue of
connections reached a high enough number (and it was always the same),
all of the processing seemed to stay in the "master" thread.  I
created a siege client that opened up 1000+ connections, and the server
would go into a state where the master thread repeatedly polled the
socket and printed an error.  The Queue size stayed fixed at 997 even
if I shut down all of the connections on the client side.  Under
normal operating conditions, those connections would be detected as
broken in the handler thread and a message would be logged.  So it
seems that the "handler" threads aren't doing anything.  (And they are
still alive; I check that in the master thread.)
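
The error the master thread printed is not shown here.  If it was
EMFILE ("Too many open files"), then accept() fails immediately on
every call and the master loop spins without ever blocking.  The sketch
below is one hedged way to handle that case, written for Python 3; the
helper name accept_or_backoff and the 0.05 second pause are assumptions,
and it does not by itself explain why the handler threads stalled.

```python
import errno
import time


def accept_or_backoff(listener, backoff=0.05):
    """Accept one connection and return it.

    If the process (EMFILE) or the system (ENFILE) is out of file
    descriptors, sleep briefly and return None instead of retrying
    in a tight loop.  Any other socket error is re-raised.
    """
    try:
        conn, _addr = listener.accept()
        return conn
    except OSError as exc:
        if exc.errno in (errno.EMFILE, errno.ENFILE):
            time.sleep(backoff)  # give handler threads time to close sockets
            return None
        raise
```

A queue stuck at 997 would be consistent with a per-process limit of
1024 descriptors, a common default soft limit on Linux, though the
actual limit is not stated.  On Unix it can be read with
resource.getrlimit(resource.RLIMIT_NOFILE).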

OK.  I hope all of this is clear.  Currently, I've solved the problem
by putting a cap on the queue size, and I haven't seen the problem
recur.  But it would be nice to understand exactly what went wrong.
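
A minimal sketch of what a capped queue could look like, in Python 3.
What the server does when the cap is reached is not described above;
this version assumes the new connection is simply closed.  MAX_PENDING,
its value of 500, and the helper name enqueue_or_reject are all
hypothetical.

```python
import queue

MAX_PENDING = 500  # hypothetical cap, chosen to stay below the descriptor limit

conn_queue = queue.Queue(maxsize=MAX_PENDING)


def enqueue_or_reject(conn, q=conn_queue):
    """Queue conn for a handler; if the queue is full, close it instead.

    Returns True if the connection was queued, False if it was rejected.
    """
    try:
        q.put_nowait(conn)
        return True
    except queue.Full:
        conn.close()  # shed load rather than hold the descriptor open
        return False
```

The alternative is a blocking q.put(conn), which stalls the master
thread until a handler frees a slot and leaves further clients waiting
in the kernel's listen backlog.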

Thanks in advance.

-- 
--
Philip Zigoris I SPOCK I 650.366.1165
Spock is Hiring!
www.spock.com/jobs



More information about the Python-list mailing list