[Tutor] RegEx [Was: Parsing iptables log files]

Amaya Rodrigo Sastre arodrigo@genasys.com
Wed, 4 Sep 2002 16:34:26 +0200


I am now struggling with the regex:

One sample line in my logs looks like this:

Aug 17 20:41:55 martinika kernel: --logtrack-- IN= OUT=lo SRC=192.168.100.10 DST=192.168.100.10 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=43085 DPT=80 SEQ=307515611 ACK=0 WINDOW=32767 RES=0x00 SYN URGP=0 

And my regex is:

#!/usr/bin/python
import re
my_re = "(\w+\s\d+\s\d+:\d+:\d+).+SRC=([\d.]+)\s+DST=([\d.]+)\s(.*TTL=([\d]))+ID=([\d.]+)\s+(.*TCP|UDP)\s+SPT=(\d+)\s+DPT=(\d+)\s+SEQ=(\d+)\s+ACK=(\d+)"

# This one works, but doesn't match ID:
# my_re = "(\w+\s\d+\s\d+:\d+:\d+).+SRC=([\d.]+)\s+DST=([\d.]+)\s(.*TTL=([\d]))+ID=([\d.]+)\s+(.*TCP|UDP)\s+SPT=(\d+)\s+DPT=(\d+)\s+SEQ=(\d+)\s+ACK=(\d+)')"

pattern =  re.compile(r'(my_re)')
requests = {}
my_line=0
my_file = open('amaya')
for line in my_file.xreadlines():
        print line
        match = pattern.search(line)
        #if not match: continue
        #if not match:
        #       print "No"
        #       continue
        date     = match.group(1)
        src_addr = match.group(2)
        dst_addr = match.group(3)
        p_id     = match.group(5)
        src_port = int(match.group(7))
        dst_port = int(match.group(8))
        seq      = match.group(9)      # seq and ack are too big for int, so they
        ack      = match.group(10)     # need long; I left them as strings

        print match.groups()           # for debugging
        my_line = my_line + 1
my_file.close()

The first, uncommented regex finds nothing.
The commented one finds this:
[0, 'Aug 17', '20:41:55', '192.168.100.10', '192.168.100.10', 43085, 80, '307515611', '0']
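
A minimal sketch of two checks that may explain why the uncommented version finds nothing. It is written in Python 3 syntax (the script above is Python 2), uses a shortened copy of the sample line, and the names `literal`, `one_digit` and `many_digits` are illustrative only. The first check shows that `re.compile(r'(my_re)')` compiles the literal text "my_re" rather than the contents of the variable; the second shows that `[\d]` matches a single digit, so `TTL=64` followed by a space cannot be followed directly by `ID=`.

```python
import re

# Shortened copy of the sample log line, enough to exercise TTL= and ID=.
line = "SRC=192.168.100.10 DST=192.168.100.10 LEN=60 TTL=64 ID=0 DF PROTO=TCP"

# The quotes make this a pattern for the five characters "my_re",
# not for whatever string the variable my_re holds.
literal = re.compile(r'(my_re)')
print(literal.search(line))               # None

# [\d] is exactly one digit, so after "TTL=6" the pattern wants "ID="
# but the line continues with "4 ID=".
one_digit = re.compile(r'(.*TTL=([\d]))+ID=')
print(one_digit.search(line))             # None

# \d+ takes the whole number, and \s+ absorbs the space before ID=.
many_digits = re.compile(r'TTL=(\d+)\s+ID=(\d+)')
print(many_digits.search(line).groups())  # ('64', '0')
```

Passing the variable itself, as in `re.compile(my_re)`, would be the usual way to compile the pattern held in `my_re`.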

I now want to match the ID= field as well, but I don't seem to be able to.
I have gone through http://py-howto.sourceforge.net/regex/ and couldn't find a
reason for my regex not to work...
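
One possible pattern that also captures the ID= field is sketched below. It is only a sketch under a few assumptions: it is written for Python 3, the names `LOG_RE` and `parse_line` are hypothetical, each field is anchored on its own keyword (`TTL=`, `ID=`, `PROTO=`) instead of relying on `.*`, and SEQ= and ACK= are treated as optional on the guess that non-TCP lines may not carry them. The groups are not nested, so they are numbered 1 to 10 from left to right.

```python
import re

# Hypothetical rewrite of my_re; the comments give each group's number.
LOG_RE = re.compile(
    r"(\w+\s+\d+\s+\d+:\d+:\d+)"        # 1: date, e.g. "Aug 17 20:41:55"
    r".*?SRC=([\d.]+)"                  # 2: source address
    r"\s+DST=([\d.]+)"                  # 3: destination address
    r".*?TTL=(\d+)"                     # 4: TTL, one or more digits
    r"\s+ID=(\d+)"                      # 5: ID
    r".*?PROTO=(TCP|UDP)"               # 6: protocol
    r"\s+SPT=(\d+)"                     # 7: source port
    r"\s+DPT=(\d+)"                     # 8: destination port
    r"(?:\s+SEQ=(\d+)\s+ACK=(\d+))?"    # 9, 10: SEQ and ACK, if present
)


def parse_line(line):
    """Return the captured fields as a dict, or None if the line does not match."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    return {
        "date": match.group(1),
        "src_addr": match.group(2),
        "dst_addr": match.group(3),
        "ttl": match.group(4),
        "p_id": match.group(5),
        "proto": match.group(6),
        "src_port": int(match.group(7)),
        "dst_port": int(match.group(8)),
        "seq": match.group(9),    # kept as a string; None when absent
        "ack": match.group(10),   # kept as a string; None when absent
    }


# The sample line from the log, split only to keep the source lines short.
sample = (
    "Aug 17 20:41:55 martinika kernel: --logtrack-- IN= OUT=lo "
    "SRC=192.168.100.10 DST=192.168.100.10 LEN=60 TOS=0x00 PREC=0x00 "
    "TTL=64 ID=0 DF PROTO=TCP SPT=43085 DPT=80 SEQ=307515611 ACK=0 "
    "WINDOW=32767 RES=0x00 SYN URGP=0 "
)

print(parse_line(sample))
```

Checking for `None` before calling `match.group()` matters here, because a line that does not match would otherwise raise an AttributeError.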

-- 
Amaya M. Rodrigo Sastre       Genasys II Spain, S.A.U. 
MLS Sysadmin                    Ventura de la Vega, 5. 
Phone: +34.91.3649100              28014 Madrid. Spain