Parse a log file
Tim Chase
python.list at tim.thechases.com
Mon Jan 18 16:56:40 EST 2010
kaklis at gmail.com wrote:
> I want to parse a log file with the following format, for
> example:
> TIMESTAMP Operation FileName Bytes
> 12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
> 12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
> 12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
> 12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
> 12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
> 12/Jan/2010:16:05:05 +0200 DELETE sample3.3gp 37151
>
> How can I count the operations for a month (e.g. a total of 40 operations:
> 30 EXISTS, 10 DELETE)?
It can be done pretty easily with a regexp to parse the relevant
bits:
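  import re

  # Capture the month, year, and operation from each log line
  r = re.compile(r'\d+/([^/]+)/(\d+)\S+\s+\S+\s+(\w+)')
  stats = {}
  with open('log.txt') as f:
      for line in f:
          m = r.match(line)
          if m:
              # Key is the tuple (month, year, operation)
              stats[m.groups()] = stats.get(m.groups(), 0) + 1
  print(stats)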
This prints out
{('Jan', '2010', 'EXISTS'): 5, ('Jan', '2010', 'DELETE'): 1}
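To see which bits the pattern picks out, here is a small sketch that runs the same regexp against one sample line from the question. The variable names month, year and operation are just labels I chose for the three groups.

```python
import re

# Same pattern as above; the three groups are month, year, and operation.
r = re.compile(r'\d+/([^/]+)/(\d+)\S+\s+\S+\s+(\w+)')

# One sample line taken from the log excerpt in the question.
line = '12/Jan/2010:16:05:05 +0200 DELETE sample3.3gp 37151'

m = r.match(line)  # match() returns None if the line does not fit the pattern
month, year, operation = m.groups()
print(month, year, operation)
```

The leading day and the time-of-day are skipped rather than captured, which is why lines from the same month with the same operation end up under the same key.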
You can then manipulate the resulting data structure to compute
coarser-grained aggregates, such as the total operations, or remap
month-name abbreviations to integers so the results can be sorted
for output.
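A rough sketch of both ideas, assuming stats has the shape printed above. The helper names totals_by_month and sorted_rows, and the MONTHS table, are my own for illustration, and the table assumes English month abbreviations as in the sample log.

```python
from collections import Counter

# Assumed input: the dict built above, keyed by (month, year, operation).
stats = {('Jan', '2010', 'EXISTS'): 5, ('Jan', '2010', 'DELETE'): 1}

# Map English month abbreviations to integers so keys sort chronologically.
MONTHS = {name: i for i, name in enumerate(
    'Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec'.split(), 1)}

def totals_by_month(stats):
    """Collapse the per-operation counts into one total per (year, month)."""
    totals = Counter()
    for (month, year, _op), count in stats.items():
        totals[(int(year), MONTHS[month])] += count
    return totals

def sorted_rows(stats):
    """Return (year, month_number, operation, count) rows in date order."""
    return sorted((int(year), MONTHS[month], op, count)
                  for (month, year, op), count in stats.items())

print(totals_by_month(stats))
for row in sorted_rows(stats):
    print(row)
```

Putting the year before the month number in each tuple means a plain sorted() orders the rows by date without a custom key.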
-tkc