Do non-error log messages continue to appear in the Twisted log? i.e., is it clear that the logging system is still working, or could it have failed in some way, obscuring any exception reports?
Yes, I print the results of garbage collection every 10 min like this:

    print "number of objects tracked by the garbage collector is:", len(gc.get_objects())

and they appear in the main twisted log:

2008/09/10 13:21 -0700 [-] number of objects tracked by the garbage collector is: 860021
2008/09/10 13:31 -0700 [-] number of objects tracked by the garbage collector is: 864316
Any new unhandled errno values should definitely result in an exception being logged (notice that the `raise` which follows the checks for various errno values is inside a try/except which logs any exception).
I noticed that raise too... Could it then be EWOULDBLOCK, EAGAIN or EPERM?

    except socket.error, e:
        if e.args[0] in (EWOULDBLOCK, EAGAIN):
            self.numberAccepts = i
            break
        elif e.args[0] == EPERM:
            # Netfilter on Linux may have rejected the
            # connection, but we get told to try to accept()
            # anyway.
            continue

I am not sure how to debug this problem. I have another twisted server of a different type on that machine, and while the problematic server stops accepting connections, the second one works just fine, so this is not a machine-wide issue. What could it be?
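To make the question concrete, here is a rough sketch of how the errno values named in this thread fall into groups. It is not Twisted's code: the function name and the action labels are made up for illustration, it is written in present-day Python rather than 2.4, and the "log-and-stop" behaviour for the resource errors is an assumption based on the description that tcp.py "is looking for" them. The point is that only the first two groups would leave no trace in the log.

```python
import errno

# Errno groups as named in this thread; the grouping labels are hypothetical.
STOP_QUIETLY = {errno.EWOULDBLOCK, errno.EAGAIN}   # nothing left to accept right now
RETRY_QUIETLY = {errno.EPERM}                      # e.g. Netfilter rejected the connection
LOG_AND_STOP = {                                   # assumed: reported in the log
    errno.EMFILE, errno.ENOBUFS, errno.ENFILE,
    errno.ENOMEM, errno.ECONNABORTED,
}


def classify_accept_errno(code):
    """Return a label for what an accept loop would do with this errno."""
    if code in STOP_QUIETLY:
        return "stop"            # silent: no log entry
    if code in RETRY_QUIETLY:
        return "retry"           # silent: no log entry
    if code in LOG_AND_STOP:
        return "log-and-stop"    # visible in the log
    return "raise"               # re-raised, then logged by the outer try/except


if __name__ == "__main__":
    for name in ("EAGAIN", "EPERM", "EMFILE", "ECONNRESET"):
        print(name, "->", classify_accept_errno(getattr(errno, name)))
```

If the silent paths are the suspects, temporarily logging a counter each time they are hit would be one way to confirm or rule them out.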
-----Original Message-----
From: twisted-python-bounces@twistedmatrix.com [mailto:twisted-python-bounces@twistedmatrix.com] On Behalf Of Jean-Paul Calderone
Sent: Wednesday, September 10, 2008 1:44 PM
To: Twisted general discussion
Subject: Re: [Twisted-Python] intermittent problem: not accepting new connections
I have had a twisted epoll server that was heavily used, such that it saturated the CPU (100% shown by "top", about 5000 connections, intense message relaying). I am using twisted 2.5.0, which I patched for the epoll bug. It ran on python 2.4.4 with a 2.6.11 kernel on a single core xeon 3.0 GHz CPU. This server has been on for many months, and it has been rock-stable.
A couple of days ago I migrated that server to a newer machine: same twisted 2.5.0, same python 2.4.4, newer 2.6.24 kernel and a quad core xeon L5420 CPU. CPU usage dropped from 100% to 30%, as expected, with the same rate of client connections.
However the server now has the following intermittent problem: about twice a day, it stops accepting new connections for a short period of 5-10 minutes.
telnet times out; I get this:

root@serv2:/proc/net/netfilter# telnet localhost 5229
Trying 127.0.0.1...
Existing connections are not cut; the server receives messages from them and delivers messages to them just fine. These short periods of not accepting connections do not correlate with increased CPU load or with the overall number of connections to the server.
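Since the outages are short and do not line up with load, one option is to record exactly when they happen with a small probe run from cron or a loop. This is only a sketch in present-day Python, not something from the original setup: the function name and the 5 second timeout are arbitrary choices, and port 5229 is simply the one from the telnet attempt above. Note that a successful connect only shows that the kernel completed the TCP handshake, not that the server process has called accept() yet.

```python
import socket
import time


def probe_accept(host, port, timeout=5.0):
    """Try one TCP connect; return (connected, seconds_elapsed)."""
    start = time.time()
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers timeouts as well as refused connections.
        return False, time.time() - start
    sock.close()
    return True, time.time() - start


if __name__ == "__main__":
    # Print a timestamped line so failures can be matched against other logs.
    ok, elapsed = probe_accept("127.0.0.1", 5229)
    print(time.strftime("%Y/%m/%d %H:%M:%S"), "connected" if ok else "FAILED",
          "%.2fs" % elapsed)
```

Running this once a minute would give timestamps for each outage that can be compared with the kernel log and the netfilter counters.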
On Wed, 10 Sep 2008 13:32:06 -0700, Alec Matusis <matusis@yahoo.com> wrote:
I have had a problem with the same symptoms before, when a server process ran out of its quota of file descriptors. However, there were clear messages in the twisted log at that time, and upping the ulimits solved the problem. This time, there are no errors in ANY logs (twisted log, /var/log/messages, etc.).
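To rule the file descriptor quota in or out without waiting for an error message, the process could report its own descriptor count next to its limit. A minimal sketch in present-day Python, assuming Linux (it reads /proc/self/fd); the function name is made up for illustration:

```python
import os
import resource


def fd_usage():
    """Return (open_fds, soft_limit, hard_limit) for the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        # Each entry in /proc/self/fd is one open descriptor (Linux only).
        used = len(os.listdir("/proc/self/fd"))
    except OSError:
        used = None  # /proc not available on this platform
    return used, soft, hard


if __name__ == "__main__":
    used, soft, hard = fd_usage()
    print("open file descriptors:", used, "soft limit:", soft, "hard limit:", hard)
```

Logging this alongside the periodic garbage collector count would show whether the descriptor count approaches the soft limit around the time new connections stop being accepted.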
Do non-error log messages continue to appear in the Twisted log? i.e., is it clear that the logging system is still working, or could it have failed in some way, obscuring any exception reports?
I am out of ideas on what this could be, because my setup is exactly the same as I have been using in the last year, except for a faster CPU and a newer kernel.
I suspect that there are some new uncaught accept() exceptions in internet/tcp.py in the part where it's looking for EMFILE, ENOBUFS, ENFILE, ENOMEM, ECONNABORTED errors.
Any new unhandled errno values should definitely result in an exception being logged (notice that the `raise` which follows the checks for various errno values is inside a try/except which logs any exception).
Jean-Paul
_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python