Don't want to do the regexp test twice

Bengt Richter bokr at oz.net
Thu Jul 24 22:17:24 EDT 2003


On Thu, 24 Jul 2003 23:44:57 +0200, Egbert Bouwman <egbert.list at hccnet.nl> wrote:

>While looping over a long list (with file records)
>I use an (also long) if..elif sequence.
>One of these elif's tests a regular expression, and 
>if the test succeeds, I want to use a part of the match.
>Something like this:
>
>pat = re.compile(r'...')
>for line in mylist:
>    if ... :
>        ....
>    elif ... :
>        ....
>    elif pat.search(line):
>        mat = pat.search(line)
>    elif ... :
>       ...
>    else ...:
>       ...
>Is there a way to do this job with only one pat.search(line) ?

Hack #1 (list comprehension abuse):

 >>> import re
 >>> pat = re.compile(r'...')
 >>> mylist = ' a bc def ghij'.split(' ')
 >>> mylist
 ['', 'a', 'bc', 'def', 'ghij']
 >>> for line in mylist:
 ...     if 1==2: 'naw'
 ...     elif 3==4: 'neither'
 ...     elif [1 for mat in [pat.search(line)] if mat]:
 ...         print repr(mat), mat.groups(), mat.group()
 ...     elif 4==5: 'haw'
 ...     else:
 ...         print 'final else'
 ...
 final else
 final else
 final else
 <_sre.SRE_Match object at 0x007F1260> () def
 <_sre.SRE_Match object at 0x007F5900> () ghi
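
Why this works: the comprehension iterates over a one-element list holding
the search result, so mat gets bound to that result, and the trailing filter
leaves [1] (true) on a match and [] (false) otherwise. The elif body then
relies on mat still being bound after the comprehension has finished, which
is how the interpreter used in this session behaves. Here is a minimal sketch
of the truth-value part only; the helper name probe is hypothetical and not
part of the session above:

```python
import re

pat = re.compile(r'...')

def probe(line):
    # Hypothetical helper, for illustration only.
    # Iterating over a one-element list binds mat to the search result;
    # the trailing "if mat" keeps the 1 only when there was a match.
    return [1 for mat in [pat.search(line)] if mat]

short_result = probe('bc')   # fewer than three characters: no match
long_result = probe('def')   # three characters: match
```

An empty list is false and a one-element list is true, which is all the
elif test needs.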

Hack #2 (attach a temporary value to an instance of something
that can accept attributes, e.g., an instance of almost any user-defined class):

 >>> import re
 >>> pat = re.compile(r'...')
 >>> mylist = ' a bc def ghij'.split(' ')
 >>> mylist
 ['', 'a', 'bc', 'def', 'ghij']
 >>> obj = type('Any',(),{})()
 >>> for line in mylist:
 ...     if 1==2: 'naw'
 ...     elif 3==4: 'neither'
 ...     elif setattr(obj,'mat', pat.search(line)) or obj.mat:
 ...         print repr(obj.mat), obj.mat.groups(), obj.mat.group()
 ...     elif 4==5: 'haw'
 ...     else:
 ...         print 'final else'
 ...
 final else
 final else
 final else
 <_sre.SRE_Match object at 0x007F69C0> () def
 <_sre.SRE_Match object at 0x007F6980> () ghi

The trick above is that setattr(...) returns None, so the expression always continues to the "or" part:
 >>> setattr(obj,'xxx',123)
 >>> repr(setattr(obj,'xxx',123))
 'None'

You can spell

 ...     elif setattr(obj,'mat', pat.search(line)) or obj.mat:
 ...         print repr(obj.mat), obj.mat.groups(), obj.mat.group()

a little slicker if you make a special object to hold a binding to
the pat.search result. In the following, h is a Holder instance: it
remembers the last thing passed to it and immediately returns it, and
when called without an argument it returns that last thing again:
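
The Holder class itself is not shown in this session. A minimal sketch
consistent with the description above might look like the following; the
attribute name value and the use of *args are assumptions for illustration,
not necessarily the original definition:

```python
import re

class Holder(object):
    """Remember the last value passed in and hand it back on request."""

    def __init__(self):
        self.value = None

    def __call__(self, *args):
        # Called with an argument: store it (even if it is None).
        # Called with no argument: leave the stored value alone.
        if args:
            self.value = args[0]
        return self.value

pat = re.compile(r'...')
h = Holder()
matched = []
for line in ['', 'a', 'bc', 'def', 'ghij']:
    if h(pat.search(line)):          # one search per line
        matched.append(h().group())  # reuse the stored match object
```

Storing None when the search fails is what makes the elif test false for
lines that do not match.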

 >>> mylist
 ['', 'a', 'bc', 'def', 'ghij']
 >>> h = Holder()
 >>> for line in mylist:
 ...     if 1==2: 'naw'
 ...     elif 3==4: 'neither'
 ...     elif h(pat.search(line)):
 ...         print repr(h()), h().groups(), h().group()
 ...     elif 4==5: 'haw'
 ...     else:
 ...         print 'final else'
 ...
 final else
 final else
 final else
 <_sre.SRE_Match object at 0x007F6FC0> () def
 <_sre.SRE_Match object at 0x007F6F80> () ghi

Obviously
     ...         print repr(h()), h().groups(), h().group()
could have been
     ...         mat=h(); print repr(mat), mat.groups(), mat.group()
instead.

E.g., using the leftover value in h:

 >>> print repr(h()), h().groups(), h().group()
 <_sre.SRE_Match object at 0x007F6F80> () ghi
 >>> mat=h(); print repr(mat), mat.groups(), mat.group()
 <_sre.SRE_Match object at 0x007F6F80> () ghi


>Of course I can do this:
>
>for line in mylist:
>    mat = pat.search(line)
>    if  ...:
>        ....
>    elif ...:
>        ....
>    elif mat:
>        ...
>but the test is relevant in only a relatively small number of cases.    
>And I would like to know if there exists a general solution
>for this kind of problem.

Take your pick, but not the list comprehension ;-)

Regards,
Bengt Richter



