CPU usage while reading a named pipe

Sat Sep 12 12:39:51 EDT 2009

Miguel P wrote:
> Hey everyone,
> 
> I've been working on parsing (tailing) a named pipe which is the
> syslog output of the traffic for a rather busy haproxy instance. It's
> a fair bit of traffic (up to 3k hits/s per server), but I am finding
> that simply tailing the file in Python, without any processing, is
> taking up 15% of a CPU core. In contrast, HAProxy takes 25% and syslogd
> takes 5% with the same load. `cat < /named.pipe` takes 0-2%.
> 
> Am I just doing things horribly wrong or is this normal?
> 
> Here is my code:
> 
> from collections import deque
> import io, sys
> 
> WATCHED_PIPE = '/var/log/haproxy.pipe'
> 
> if __name__ == '__main__':
>     try:
>         # Bounded buffer: keeps only the most recent 10000 lines.
>         log_pool = deque([], 10000)
>         fd = io.open(WATCHED_PIPE)
>         for line in fd:
>             log_pool.append(line)
>     except KeyboardInterrupt:
>         sys.exit()
> 
> Deque appends are O(1), so that's not it. And I am using 2.6's io
> module because it's supposed to handle named pipes better. I have
> commented out the deque append line and it still uses about the same
> amount of CPU.
> 
> The system is running Ubuntu 9.04 with kernel 2.6.28 and ext4 (not
> sure the FS is relevant).
> 
> Any help bringing down the CPU usage would be really appreciated, and
> if it can't be done I guess that's OK too; the server has 6 cores not
> doing much.

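Is this any faster? It replaces the explicit for loop, so the
iteration over the pipe happens inside deque.extend() instead of in
Python bytecode:

     log_pool.extend(fd)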
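
If that does not help much, the cost may be in the line-by-line
reading itself rather than in the loop. As far as I know, the io
module in 2.6 is written in Python rather than C, so each line goes
through a fair amount of interpreter work. Below is a rough sketch of
an alternative that reads large raw chunks with os.read() and splits
them into lines in one go. The function name tail_chunks and the 64 KiB
chunk size are my own assumptions, not anything measured against your
setup, and the lines end up as byte strings without the trailing
newline.

```python
from collections import deque
import os


def tail_chunks(fd, pool, chunk_size=65536):
    """Read raw chunks from file descriptor `fd` and add whole lines to `pool`.

    Hypothetical helper for illustration: `fd` is an integer descriptor
    (e.g. from os.open), `pool` is anything with extend() and append().
    """
    pending = b""  # tail of the previous chunk that did not end in a newline
    while True:
        chunk = os.read(fd, chunk_size)
        if not chunk:
            # An empty read means EOF: every writer has closed the pipe.
            break
        lines = (pending + chunk).split(b"\n")
        # The last piece is either an incomplete line or empty; hold it back.
        pending = lines.pop()
        pool.extend(lines)
    if pending:
        # Flush a final line that had no trailing newline.
        pool.append(pending)
```

With the names from your script, it could be called along the lines of
tail_chunks(os.open(WATCHED_PIPE, os.O_RDONLY), log_pool). Whether this
actually beats log_pool.extend(fd) is something you would have to
measure under your real load.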