Tips to match multiple patterns from a single file
Ganesh Pal
ganesh1pal at gmail.com
Sun Jul 23 13:21:33 EDT 2017
I have hundreds of files in a directory, and from each of them I need to
extract multiple values: the filename with its path (the filenames start
with test*), the object 1,1,25296896:8192 (only from the line containing
the pattern "Corrupting"), the data before corruption (a hex value), the
offset (digits) and the size (digits).
Sample file contents (all my files are small):
07/22/2017 12:34:28 AM INFO: --offset=18 --mirror=1 --path=/ifs/i/inode.txt --size=4
07/22/2017 12:34:28 AM INFO:The mirror selected is 1,1,25296896:8192
07/22/2017 12:34:28 AM INFO:Data before corruption : 1b000100
07/22/2017 12:34:28 AM INFO:Corrupting disk object 6 at 1,1,25296896:8192
07/22/2017 12:34:28 AM INFO:Data after corruption : 00000000
I am expecting output something like this:
# Filename : /var/01010101/test01log object: 1,1,25296896:8192 checksum : 1b000100 offset: 18 size:4
# Filename : /var/01010101/test03log object: 1,2,25296896:8192 checksum : 1b200120 offset: 8 size:8
I am not sure how to group multiple patterns and return the results from a
function. I am trying group() and groupdict(); any tips or better ideas
are welcome. Here is how I have started coding this:
import glob
import re

for filename in sorted(glob.glob('/var/01010101/test*.log')):
    with open(filename, 'r') as f:
        for linenum, line in enumerate(f):
            m = re.search(r'(Corrupting.*)', line)
            if not m:
                # uninteresting line
                continue
            # the object id is the last word on the "Corrupting" line
            x = m.group().split()
            print filename, x[-1]
x123-45# python test.py
/var/01010101/test01_.log 1,1,25296896:8192
I am on Python 2.7 and Linux
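To make the question concrete, here is a rough sketch of the kind of function I have in mind: one compiled regex per value, each with a named group, and the groupdict() results merged into a single dict. The names PATTERNS, SAMPLE, extract_fields, format_record and report are made up for illustration, and the regexes assume every file looks like the sample above. Since the files are small, the sketch reads each one whole instead of going line by line.

```python
import glob
import re

# Hypothetical pattern table: one named group per value to extract.
PATTERNS = [
    re.compile(r'--offset=(?P<offset>\d+)'),
    re.compile(r'--size=(?P<size>\d+)'),
    re.compile(r'Data before corruption\s*:\s*(?P<checksum>[0-9a-fA-F]+)'),
    re.compile(r'Corrupting disk object \d+ at (?P<object>\S+)'),
]

# Copy of the sample file contents shown earlier in this message.
SAMPLE = """\
07/22/2017 12:34:28 AM INFO: --offset=18 --mirror=1 --path=/ifs/i/inode.txt --size=4
07/22/2017 12:34:28 AM INFO:The mirror selected is 1,1,25296896:8192
07/22/2017 12:34:28 AM INFO:Data before corruption : 1b000100
07/22/2017 12:34:28 AM INFO:Corrupting disk object 6 at 1,1,25296896:8192
07/22/2017 12:34:28 AM INFO:Data after corruption : 00000000
"""


def extract_fields(text):
    """Merge the named groups of every pattern; None if any pattern is missing."""
    fields = {}
    for pattern in PATTERNS:
        m = pattern.search(text)
        if m is None:
            return None
        fields.update(m.groupdict())
    return fields


def format_record(filename, fields):
    """Build one output line in the layout of the expected output."""
    return ('# Filename : %s object: %s checksum : %s offset: %s size:%s'
            % (filename, fields['object'], fields['checksum'],
               fields['offset'], fields['size']))


def report(file_glob):
    """Print one record per matching file; files lacking a value are skipped."""
    for filename in sorted(glob.glob(file_glob)):
        with open(filename, 'r') as f:
            fields = extract_fields(f.read())
        if fields is not None:
            # Single-argument print() behaves the same on Python 2.7 and 3.
            print(format_record(filename, fields))
```

Calling report('/var/01010101/test*.log') would then walk the same files as my loop above. All values come back as strings; I am not sure whether one combined regex would be cleaner than a list of small ones.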
Regards,
Ganesh