match.groupdict() into a single dict
MRAB
python at mrabarnett.plus.com
Wed Apr 19 15:49:54 EDT 2017
On 2017-04-19 14:26, Ganesh Pal wrote:
> Hello friends,
>
> I am learning regex and trying to use it in my scripts. I need some
> suggestions on the code below. I need to match all lines of a file that
> have a specific pattern and return them as a dictionary.
>
> Sample line:
>
> 'NODE=ADAM-11: | TIME=2017-04-14T05:27:16-07:00 | COND=Some lovely message
> | MSG=attempt to record { addr=1,0,17080320:8192 action=xxhello-hell
> o owner=1:0070:001a::HEAD }, but history information has a different owner:
> owner: 1:0064:0005::HEAD, actions (new->old): { hello-hello
> * 1, none, none, hello-hello * 1, none, none, hello-hello * 1, none, none,
> hello-hello * 1, none, none, hello-hello * 1, none, hello-h
> ello * 1, none } bh hello_cookie: 8:hello-only bhv | LINSNAP=None | MAP=none
>
>
>
> with open("/tmp/2.repo","r") as f:
>     for line in f:
>         result = re.search(r'MSG=attempt to record(.*)LINSNAP', line)
>         if result:
>             pdb.set_trace()
>             for pattern in [ r'(?P<Block>(\d+,\d+,\d+:\d+))',
>                              r'(?P<p_owner>([0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+::HEAD))',
>                              r'(?P<a_owner>(owner:\s+[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+::HEAD))',
>                            ]:
>                 regex = re.compile(pattern)
>                 match = regex.search(line)
>                 print ' ', match.groupdict()
>
> sample o/p:
>
> {'Block': '1,0,17080320:8192'}
> {'p_owner': '1:0070:001a::HEAD'}
> {'a_owner': 'owner: 1:0064:0005::HEAD'}
>
> Questions
>
> 1. I was expecting a single dictionary with all matches for each line,
> something like below:
>
> {'Block': '1,0,17080320:8192', 'p_owner': '1:0070:001a::HEAD','a_owner':
> 'owner: 1:0064:0005::HEAD'}
>
> (a) I am thinking of adding these elements, {'Block': '1,0,17080320:8192'},
> {'p_owner': '1:0070:001a::HEAD'} ... etc., to a new dictionary
>
> (b) or of using a better regex; maybe the for loop is not needed and a
> compiled pattern would be better.
>
>
> I am a Linux user on Python 2.7. Thanks in advance :)
>
Why would you expect a single dictionary? You're doing 3 separate matches!
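If the three separate searches are kept, option (a) from the question amounts to merging each groupdict() into one dictionary. The following is only a minimal sketch of that idea: the names PATTERNS and merged_groupdict are made up for illustration, the redundant inner parentheses of the original patterns are dropped, and it is written to run on both Python 2.7 and Python 3.

```python
import re

# The three patterns from the question (inner parentheses removed,
# since the named group already captures the text).
PATTERNS = [
    r'(?P<Block>\d+,\d+,\d+:\d+)',
    r'(?P<p_owner>[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+::HEAD)',
    r'(?P<a_owner>owner:\s+[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+::HEAD)',
]


def merged_groupdict(line):
    """Run each pattern separately and merge the results (hypothetical helper)."""
    merged = {}
    for pattern in PATTERNS:
        match = re.search(pattern, line)
        if match:  # re.search() returns None when the pattern is absent
            merged.update(match.groupdict())
    return merged


# Shortened stand-in for the sample line, not the full log line.
sample = ("addr=1,0,17080320:8192 owner=1:0070:001a::HEAD "
          "owner: 1:0064:0005::HEAD")
print(merged_groupdict(sample))
```

With this approach each search reports the first (leftmost) occurrence of its pattern, and a pattern that does not occur simply contributes no key.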
You could just combine the patterns as alternatives:
# The alternatives are matched repeatedly. The final '.' alternative
# will consume a character if none of the previous subpatterns match,
# ready for the next repeat.
subpatterns = [r'(?P<Block>(\d+,\d+,\d+:\d+))',
               r'(?P<p_owner>([0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+::HEAD))',
               r'(?P<a_owner>(owner:\s+[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+::HEAD))',
               '.']
pattern = '(%s)*' % '|'.join(subpatterns)
match = re.search(pattern, line)
print ' ', match.groupdict()
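
For anyone who wants to try the combined pattern outside the original script, here is a self-contained sketch. The sample line is the one from the question, joined back into a single line with the long "actions" list abbreviated; COMBINED and extract_fields are illustrative names, not part of the original code, and the redundant inner parentheses are dropped. It should run on Python 2.7 and Python 3.

```python
import re

# Alternatives are tried in order at each position; the trailing '.'
# consumes one character when none of the named subpatterns match.
SUBPATTERNS = [
    r'(?P<Block>\d+,\d+,\d+:\d+)',
    r'(?P<p_owner>[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+::HEAD)',
    r'(?P<a_owner>owner:\s+[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+::HEAD)',
    r'.',
]
COMBINED = re.compile('(?:%s)*' % '|'.join(SUBPATTERNS))


def extract_fields(line):
    """Return one dict with all named groups (hypothetical helper)."""
    # The repeated group can match the empty string, so search()
    # always succeeds; groups that never matched come back as None.
    return COMBINED.search(line).groupdict()


# Sample line from the question, with the actions list abbreviated.
line = ("NODE=ADAM-11: | TIME=2017-04-14T05:27:16-07:00 | COND=Some lovely "
        "message | MSG=attempt to record { addr=1,0,17080320:8192 "
        "action=xxhello-hello owner=1:0070:001a::HEAD }, but history "
        "information has a different owner: owner: 1:0064:0005::HEAD, "
        "actions (new->old): { hello-hello * 1, none } bh hello_cookie: "
        "8:hello-only bhv | LINSNAP=None | MAP=none\n")

print(extract_fields(line))
```

Two things to keep in mind with this approach: a named group that never matches is reported as None, and if a subpattern matches more than once in a line, groupdict() holds only the last occurrence.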