CPU usage while reading a named pipe

Sat Sep 12 11:08:56 EDT 2009

Hey everyone,

I've been working on parsing (tailing) a named pipe which is the
syslog output of the traffic for a rather busy haproxy instance. It's
a fair bit of traffic (upto 3k hits/s per server), but I am finding
that simply tailing the file  in python, without any processing, is
taking up 15% of a CPU core. In contrast HAProxy takes 25% and syslogd
takes 5% with the same load. `cat < /named.pipe` takes 0-2%

Am I just doing things horribly wrong or is this normal?

Here is my code:

from collections import deque
import io, sys

WATCHED_PIPE = '/var/log/haproxy.pipe'

if __name__ == '__main__':
    try:
        log_pool = deque([],10000)
        fd = io.open(WATCHED_PIPE)
        for line in fd:
            log_pool.append(line)
    except KeyboardInterrupt:
        sys.exit()

Deque appends are O(1) so that's not it. And I am using 2.6's io
module because it's supposed to handle named pipes better. I have
commented the deque appending line and it still takes about the same
CPU.

The system is running Ubuntu 9.04 with kernel 2.6.28 and ext4 (not
sure the FS is relevant).

Any help bringing down the CPU usage would be really appreciated, and
if it can't be done I guess that's ok too, server has 6 cores not
doing much.