[Tutor] Simple Stats on Apache Logs
jlintz at gmail.com
Fri Feb 12 06:15:33 CET 2010
On Thu, Feb 11, 2010 at 4:56 AM, Lao Mao <laomao1975 at googlemail.com> wrote:
> I have 3 servers which generate about 2G of webserver logfiles in a day.
> These are available on my machine over NFS.
> I would like to draw up some stats which shows, for a given keyword, how
> many times it appears in the logs, per hour, over the previous week.
> So the behavior might be:
> $ ./webstats --keyword downloader
> Which would read from the logs (which it has access to) and produce
> something like:
> 0000: 12
> 0100: 17
> I'm not sure how best to get started. My initial idea would be to filter
> the logs first, pulling out the lines with matching keywords, then check the
> timestamp - maybe incrementing a dictionary if the logfile was within a
> certain time?
> I'm not looking for people to write it for me, but I'd appreciate some
> guidance as to the approach and algorithm. Also what the simplest
> presentation model would be. Or even if it would make sense to stick it in
> a database! I'll post back my progress.
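The approach you describe (filter on the keyword first, then bucket the
matching lines in a dictionary keyed by hour) is a reasonable place to start.
Here is a minimal sketch of it, not a finished tool. It assumes your logs use
the standard Apache common/combined timestamp, e.g.
[11/Feb/2010:04:56:12 +0100], and it groups by hour of day only. The names
count_by_hour and print_report are made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: timestamps look like [11/Feb/2010:04:56:12 +0100]
# (Apache common/combined log format). Group 1 captures the hour.
TIMESTAMP_RE = re.compile(r"\[\d{2}/\w{3}/\d{4}:(\d{2}):\d{2}:\d{2} [+-]\d{4}\]")


def count_by_hour(lines, keyword):
    """Count the lines containing keyword, grouped by hour of day."""
    counts = defaultdict(int)
    for line in lines:
        # Cheap substring test first, so the regex only runs on matches.
        if keyword not in line:
            continue
        match = TIMESTAMP_RE.search(line)
        if match is None:
            # No recognisable timestamp on this line; skip it.
            continue
        counts[match.group(1) + "00"] += 1
    return counts


def print_report(counts):
    """Print one 'HH00: count' line per hour that had any matches."""
    for hour in sorted(counts):
        print("%s: %d" % (hour, counts[hour]))
```

Since a file object yields its lines one at a time, you can pass an open
logfile straight in, e.g. print_report(count_by_hour(open("access.log"),
"downloader")), without loading 2G into memory. Restricting the count to the
previous week would mean capturing the date part of the timestamp as well
and comparing it against a cutoff, which this sketch leaves out.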
You may also find this article on parsing logs efficiently useful:
http://effbot.org/zone/wide-finder.htm