[Tutor] RegEx [Was: Parsing iptables log files]
Amaya Rodrigo Sastre
arodrigo@genasys.com
Wed, 4 Sep 2002 16:34:26 +0200
I am now struggling with the regex:
One sample line in my logs looks like this:
Aug 17 20:41:55 martinika kernel: --logtrack-- IN= OUT=lo SRC=192.168.100.10 DST=192.168.100.10 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=43085 DPT=80 SEQ=307515611 ACK=0 WINDOW=32767 RES=0x00 SYN URGP=0
And my regex is:
#!/usr/bin/python
import re
my_re = "(\w+\s\d+\s\d+:\d+:\d+).+SRC=([\d.]+)\s+DST=([\d.]+)\s(.*TTL=([\d]))+ID=([\d.]+)\s+(.*TCP|UDP)\s+SPT=(\d+)\s+DPT=(\d+)\s+SEQ=(\d+)\s+ACK=(\d+)"
# This one works, but doesn't match ID:
# my_re = "(\w+\s\d+\s\d+:\d+:\d+).+SRC=([\d.]+)\s+DST=([\d.]+)\s(.*TTL=([\d]))+ID=([\d.]+)\s+(.*TCP|UDP)\s+SPT=(\d+)\s+DPT=(\d+)\s+SEQ=(\d+)\s+ACK=(\d+)')"
pattern = re.compile(r'(my_re)')
requests = {}
my_line=0
my_file = open('amaya')
for line in my_file.xreadlines():
    print line
    match = pattern.search(line)
    #if not match: continue
    #if not match:
    #    print "No"
    #    continue
    date = match.group(1)
    src_addr = match.group(2)
    dst_addr = match.group(3)
    p_id = match.group(5)
    src_port = int(match.group(7))
    dst_port = int(match.group(8))
    seq = match.group(9)   # seq and ack are too big for int, they
    ack = match.group(10)  # need long, so i left them as strings
    print match.groups()   # for debugging
    my_line = my_line + 1
my_file.close()
The first, uncommented regex finds nothing.
The commented one finds this:
[0, 'Aug 17', '20:41:55', '192.168.100.10', '192.168.100.10', 43085, 80, '307515611', '0']
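One possible reason the uncommented version finds nothing, whatever the pattern itself says: re.compile(r'(my_re)') compiles the literal text "my_re" inside a group, not the contents of the variable my_re. The sketch below illustrates the difference. It uses a shortened stand-in pattern and a shortened log line, both of which are assumptions made for illustration, and it is written so that it runs on a current Python.

```python
import re

# Shortened stand-in for the real pattern; the name my_re mirrors the script.
my_re = r"SRC=([\d.]+)\s+DST=([\d.]+)"
line = "IN= OUT=lo SRC=192.168.100.10 DST=192.168.100.10 LEN=60"

# This compiles the literal text "my_re" wrapped in a group,
# not the pattern stored in the variable.
literal = re.compile(r'(my_re)')

# This compiles the pattern stored in the variable.
from_variable = re.compile(my_re)

literal_match = literal.search(line)
variable_match = from_variable.search(line)

print(literal_match)             # no match: "my_re" does not occur in the line
print(variable_match.groups())   # the two captured addresses
```

With the literal version, search() returns None for every log line, so the later match.group(1) call would fail rather than return a field.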
I now want to match the ID= field, but I don't seem to be able to.
I have gone through http://py-howto.sourceforge.net/regex/ and couldn't find a
reason for my regex not to work...
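
A possible culprit in the pattern above is the (.*TTL=([\d]))+ID= part: [\d] matches exactly one digit, so for TTL=64 the text "4 " still sits between the matched digit and ID=, and the match fails. The following is only a sketch of one way the pattern could be reworked, not a confirmed fix. It assumes every line of interest has the same field order as the sample line, and the group numbering differs from the script above. It is written for a current Python.

```python
import re

# The sample log line from the message above.
line = ("Aug 17 20:41:55 martinika kernel: --logtrack-- IN= OUT=lo "
        "SRC=192.168.100.10 DST=192.168.100.10 LEN=60 TOS=0x00 PREC=0x00 "
        "TTL=64 ID=0 DF PROTO=TCP SPT=43085 DPT=80 SEQ=307515611 ACK=0 "
        "WINDOW=32767 RES=0x00 SYN URGP=0")

# Adjacent raw strings are concatenated into one pattern.
pattern = re.compile(
    r"(\w+\s+\d+\s+\d+:\d+:\d+)"   # 1: date and time
    r".+?SRC=([\d.]+)"             # 2: source address
    r"\s+DST=([\d.]+)"             # 3: destination address
    r".+?TTL=(\d+)"                # 4: TTL, one or more digits
    r"\s+ID=(\d+)"                 # 5: IP ID
    r".+?PROTO=(TCP|UDP)"          # 6: protocol, skipping flags such as DF
    r"\s+SPT=(\d+)"                # 7: source port
    r"\s+DPT=(\d+)"                # 8: destination port
    r"\s+SEQ=(\d+)"                # 9: sequence number
    r"\s+ACK=(\d+)"                # 10: acknowledgement number
)

match = pattern.search(line)
if match:
    fields = match.groups()
    p_id = match.group(5)
    print(fields)
else:
    print("no match")
```

The non-greedy .+? between fields skips whatever lies in between (LEN=, TOS=, PREC=, the DF flag) without having to spell it out, and checking the result of search() before calling group() avoids an error on lines that do not match.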
--
Amaya M. Rodrigo Sastre Genasys II Spain, S.A.U.
MLS Sysadmin Ventura de la Vega, 5.
Phone: +34.91.3649100 28014 Madrid. Spain