[Twisted-Python] Seeking advice: Coping with heavy log file loads
There may not be a clear answer to this problem, and the answer may or may not be Twisted-specific. I'm just trying to fish for ideas here.

The server I'm working on suffers from extremely high CPU time spent in the logging functions. I'm spending between 30 and 40% of all CPU time just in twisted.python.log.msg.

Describing the environment:

We use DailyLogFiles for file rotation along with the PythonLoggingObserver to allow for logLevel support. (Of course, this gives us significant overhead from Python's complex logging architecture; maybe we should drop this...)

On a 1000 request test of this server, we generate 58,274 calls to t.p.log.msg. There's room for cleanup, but there's lots of pressure to keep many of the log calls we have for forensic purposes, so we can only cut this back so far.

Anyway, does anyone have any ideas for speedups? Any stories from experience especially? Shots in the dark are welcome. ;-)

- Paul Goins
On Fri, 2010-04-09 at 10:35 +0200, Paul Goins wrote:
There may not be a clear answer to this problem, and the answer may or may not be Twisted-specific. I'm just trying to fish for ideas here.
The server I'm working on suffers from extremely high CPU time spent in the logging functions. I'm spending between 30 and 40% of all CPU time just in twisted.python.log.msg.
Describing the environment:
We use DailyLogFiles for file rotation along with the PythonLoggingObserver to allow for logLevel support. (Of course, this gives us significant overhead from Python's complex logging architecture; maybe we should drop this...)
On a 1000 request test of this server, we generate 58,274 calls to t.p.log.msg. There's room for cleanup, but there's lots of pressure to keep many of the log calls we have for forensic purposes, so we can only cut this back so far.
Anyway, does anyone have any ideas for speedups? Any stories from experience especially? Shots in the dark are welcome. ;-)
A request/log ratio of 1:60 is extreme. I would call that a log generator ;-) I think you could try a custom lightweight log function that just queues the log messages (FIFO) and bursts them, say every second, to the log file (e.g. in a thread, to use a multicore CPU).
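A rough sketch of that kind of buffered observer, assuming Twisted's observer interface (log.addObserver plus textFromEventDict). For simplicity this version flushes from a LoopingCall on the reactor rather than a separate thread, and the file name is a placeholder:

    from twisted.internet import task
    from twisted.python import log

    class BufferedFileObserver(object):
        """Queue formatted log lines in memory and flush them once per interval."""

        def __init__(self, path, interval=1.0):
            self._file = open(path, "a")
            self._buffer = []
            self._flusher = task.LoopingCall(self.flush)
            self._flusher.start(interval, now=False)

        def emit(self, eventDict):
            # Format once and append to the in-memory FIFO; no disk I/O here.
            text = log.textFromEventDict(eventDict)
            if text is not None:
                self._buffer.append(text + "\n")

        def flush(self):
            # One writelines() call per interval instead of one write per message.
            if self._buffer:
                self._file.writelines(self._buffer)
                self._file.flush()
                self._buffer = []

    # Hypothetical wiring:
    # observer = BufferedFileObserver("server.log")
    # log.addObserver(observer.emit)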
On Fri, 2010-04-09 at 11:49 +0200, Kees Bos wrote:
I think you could try to use a custom light weight log function that just queues the log messages (FIFO) and bursts them, say every second, to the log file (e.g. in a thread to use a multicore cpu).
Or:

0. Profile logging system - what exactly is it doing that's using all that CPU? If you're unlucky, it's just "lots of Python function calls", but maybe it's fixable. (A quick profiling sketch follows below.)

1. A log function that sends messages (over e.g. a Unix socket) to another process that does the writing. Python GIL means you don't get that much benefit from multiple threads.

2. Rewrite log function in C... and then maybe run it in another thread.
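For item 0, a generic way to see where the time goes is to wrap the load test in cProfile and restrict the printed stats to the logging machinery. run_load_test() below is a hypothetical placeholder for however the 1000-request test is driven:

    import cProfile
    import pstats

    profiler = cProfile.Profile()
    profiler.enable()

    run_load_test()  # hypothetical driver for the 1000-request test

    profiler.disable()
    stats = pstats.Stats(profiler).sort_stats("cumulative")
    # Restrict output to frames from Twisted's log module and stdlib logging.
    stats.print_stats("twisted/python/log|logging")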
On Fri, 2010-04-09 at 15:23 +0200, Itamar Turner-Trauring wrote:
On Fri, 2010-04-09 at 11:49 +0200, Kees Bos wrote:
I think you could try to use a custom light weight log function that just queues the log messages (FIFO) and bursts them, say every second, to the log file (e.g. in a thread to use a multicore cpu).
Or:
0. Profile logging system - what exactly is it doing that's using all that CPU? If you're unlucky, it's just "lots of Python function calls", but maybe it's fixable.
1. A log function that sends messages (over e.g. a Unix socket) to another process that does the writing. Python GIL means you don't get that much benefit from multiple threads.

Duh. Of course.
BTW, syslog from the syslog module on Linux just opens a Unix socket to /dev/log and could be used to do just this (e.g. with the LOG_LOCAL0 facility, and configuring your syslog to write facility local0 to a specific file).
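A minimal sketch of that approach using only the stdlib syslog module; the ident string, priority, and the example syslog configuration line are assumptions, not anything prescribed in the thread:

    import syslog

    # One Unix datagram socket to /dev/log, tagged with the local0 facility;
    # the syslog daemon does the buffering and file I/O in its own process.
    syslog.openlog("myserver", syslog.LOG_PID, syslog.LOG_LOCAL0)

    def fast_log(message):
        syslog.syslog(syslog.LOG_INFO, message)

    fast_log("request handled")

    # Corresponding syslog configuration to send facility local0 to its own
    # file, e.g.:
    #     local0.*    /var/log/myserver.log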
On Fri, Apr 09, 2010 at 05:35:51PM +0900, Paul Goins wrote:
We use DailyLogFiles for file rotation along with the PythonLoggingObserver to allow for logLevel support. (Of course, this gives us significant overhead from Python's complex logging architecture; maybe we should drop this...)
I don't know if this will help you, but why do you need to route through Python's logging framework for log-level support? For example, Twisted's SyslogObserver allows the use of syslog log levels without involving Python's logging system: http://twistedmatrix.com/documents/current/api/twisted.python.syslog.SyslogO...

If Python's logging framework doesn't show up in your profile output as an appreciable percentage of the time spent in Twisted's logging framework, I guess you might as well leave it alone - but if it is contributing, it shouldn't be too hard to work around it.
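A minimal sketch of swapping in SyslogObserver in place of PythonLoggingObserver; the prefix is arbitrary and constructor arguments may vary between Twisted versions, so check the API documentation linked above:

    from twisted.python import log, syslog

    # Send Twisted log events straight to syslog, bypassing the stdlib
    # logging framework entirely.
    observer = syslog.SyslogObserver(prefix="myserver")
    log.startLoggingWithObserver(observer.emit, setStdout=False)

    log.msg("this goes to syslog")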
Hi Paul,

Is it sporadic (the CPU burst)? If so, you may be seeing the log rotation code kicking in... and it's probably blocking your app, too, until it completes.

I've got a couple of patches to fix the blocking issue: http://twistedmatrix.com/trac/ticket/4372

The patch which eliminates old log rotation will probably get rid of your CPU problem too.

-J

On Fri, Apr 9, 2010 at 2:35 AM, Paul Goins <general@vultaire.net> wrote:
There may not be a clear answer to this problem, and the answer may or may not be Twisted-specific. I'm just trying to fish for ideas here.
The server I'm working on suffers from extremely high CPU time spent in the logging functions. I'm spending between 30 and 40% of all CPU time just in twisted.python.log.msg.
Describing the environment:
We use DailyLogFiles for file rotation along with the PythonLoggingObserver to allow for logLevel support. (Of course, this gives us significant overhead from Python's complex logging architecture; maybe we should drop this...)
On a 1000 request test of this server, we generate 58,274 calls to t.p.log.msg. There's room for cleanup, but there's lots of pressure to keep many of the log calls we have for forensic purposes, so we can only cut this back so far.
Anyway, does anyone have any ideas for speedups? Any stories from experience especially? Shots in the dark are welcome. ;-)
- Paul Goins
Definitely use syslog. To see how much logging is costing you without changing code, you can simply route to /dev/null via syslog. By the way, with local logging you may not see the cost in CPU usage - it's more likely to show up in I/O wait.

In my experience, logging to a remote syslog-ng server over UDP has very little impact on the app, almost like logging to /dev/null (which in turn is like not logging at all).

Are you by chance on AWS/EC2? If so, please contact me to discuss a possible design partnership.

m

On Fri, Apr 9, 2010 at 11:57 AM, Jason J. W. Williams <jasonjwwilliams@gmail.com> wrote:
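If the stdlib logging stack stays in place, a hedged sketch of the remote-syslog idea using logging.handlers.SysLogHandler, which sends each record as a single UDP datagram; the host name "loghost" and the facility are assumptions:

    import logging
    from logging.handlers import SysLogHandler

    # SysLogHandler defaults to UDP, so each record is a fire-and-forget
    # sendto() to the remote syslog-ng collector.
    handler = SysLogHandler(address=("loghost", 514),
                            facility=SysLogHandler.LOG_LOCAL0)
    logging.getLogger().addHandler(handler)

    logging.getLogger().info("request handled")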
Hi Paul,
Is it sporadic (the CPU burst)? If so, you may be seeing the log rotation code kicking in... and it's probably blocking your app, too, until it completes.
I've got a couple patches to fix the blocking issue:
http://twistedmatrix.com/trac/ticket/4372
The patch which eliminates old log rotation will probably get rid of your CPU problem too.
-J
On Fri, Apr 9, 2010 at 2:35 AM, Paul Goins <general@vultaire.net> wrote:
There may not be a clear answer to this problem, and the answer may or may not be Twisted-specific. I'm just trying to fish for ideas here.
The server I'm working on suffers from extremely high CPU time spent in the logging functions. I'm spending between 30 and 40% of all CPU time just in twisted.python.log.msg.
Describing the environment:
We use DailyLogFiles for file rotation along with the PythonLoggingObserver to allow for logLevel support. (Of course, this gives us significant overhead from Python's complex logging architecture; maybe we should drop this...)
On a 1000 request test of this server, we generate 58,274 calls to t.p.log.msg. There's room for cleanup, but there's lots of pressure to keep many of the log calls we have for forensic purposes, so we can only cut this back so far.
Anyway, does anyone have any ideas for speedups? Any stories from experience especially? Shots in the dark are welcome. ;-)
- Paul Goins
Just wanted to respond to the input everyone's given. Thanks a lot; I'm glad I asked.

I'll probably look into the syslog route. I had thought about it before, but I wasn't sure how much flexibility it would give. Given the suggestions here, I've since looked into it a little more and it seems to be the smartest route, especially given the size of the load.

Thanks everyone!

- Paul
You might find the logging infrastructure in Foolscap interesting: http://foolscap.lothar.com/docs/logging.html

The basic idea is to record log events only in memory, in separate circular buffers for each priority level. When something "weird" happens, all the events are bundled into an "Incident File", which is then written out (either to disk, or to a gatherer process over a network). Other tools are then used to reassemble the pieces of the incident into chronological order and point out where the gaps are.

I haven't done any specific performance measurements, but I've taken a busy server, stubbed out the logging calls, and didn't see a huge change in throughput. The Foolscap approach gets you useful (and detailed) information when bad things happen but throws out most of the other stuff, so it doesn't touch the disk or make system calls very often.

cheers,
-Brian (author of Foolscap)
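A toy sketch of the ring-buffer idea (not Foolscap's actual API): keep the most recent events per level in bounded in-memory deques and only write them out when something goes wrong.

    from collections import deque

    class IncidentBuffer(object):
        """Keep the last N events per level in memory; dump only on trouble."""

        def __init__(self, sizes=None):
            sizes = sizes or {"debug": 1000, "info": 500, "error": 100}
            self._buffers = dict((level, deque(maxlen=size))
                                 for level, size in sizes.items())

        def record(self, level, message):
            self._buffers[level].append(message)
            if level == "error":
                self.dump_incident()

        def dump_incident(self):
            # A real system would write an incident file or ship the events to
            # a gatherer process; here we just print whatever is still buffered.
            for level, buf in self._buffers.items():
                for message in buf:
                    print("[%s] %s" % (level, message))

    buf = IncidentBuffer()
    buf.record("debug", "connection opened")
    buf.record("error", "unexpected disconnect")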
participants (7)
- Brian Warner
- Itamar Turner-Trauring
- Jason J. W. Williams
- Kees Bos
- Marc Byrd
- Paul Goins
- Tim Allen