How to get positions of multiple RE matches?

Alex Martelli aleaxit at yahoo.com
Fri Apr 6 10:48:45 EDT 2001


"Karl Schmid" <schmid at ice.mpg.de> wrote in message
news:9akdum$2ve8$1 at gwdu67.gwdg.de...
> I would like to get all the positions of a pattern match in a string.
> Is there a more elegant solution to this problem than the following one?

Depends on your definition of 'elegance', I guess, but...:

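import re

class Matches:
    """Callable that records the start offset of every match it is handed."""
    def __init__(self):
        self.matches = []
    def __call__(self, mo):
        self.matches.append(mo.start())
        return ''   # replacement text; the substituted string is discarded

a = '''This module provides regular expression matching
operations similar to those found in Perl'''

p = re.compile('(o)')
matches = Matches()
junk = p.sub(matches, a)   # called only for its side effect on matches

for match in matches.matches:
    print(match)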



Some would call it "elegant", some would call it "tricky"
(because the .sub is being used only for its side effects,
and its result is uninteresting and gets discarded).


Anyway, the .sub method is the only documented way I know
to get a *call-back* for every non-overlapping match, "at a
single stroke", without explicitly programming a loop
yourself.  (The "non-overlapping" feature may not be
what you're looking for, of course, in which case this
approach is certainly ungood).
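
To make the "non-overlapping" caveat concrete, here is a minimal sketch of
the same callback trick wrapped in a function.  The name match_starts and
the sample strings are mine, purely for illustration:

```python
import re

def match_starts(pattern, text):
    """Collect start offsets of non-overlapping matches via a sub() callback.

    match_starts is a hypothetical helper name, not part of the re module.
    """
    starts = []

    def record(mo):
        starts.append(mo.start())
        return ''  # replacement text is irrelevant; sub()'s result is dropped

    re.compile(pattern).sub(record, text)
    return starts

# 'aa' occurs at offsets 0, 1 and 2 of 'aaaa', but sub() resumes scanning
# after the end of each match, so the occurrence at offset 1 is skipped.
print(match_starts('aa', 'aaaa'))   # [0, 2]
print(match_starts('o', 'foo boo')) # [1, 2, 5, 6]
```

For a one-character pattern such as 'o' nothing can overlap, so the
distinction only matters for longer patterns.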


If I did decide I had better program a loop (e.g., because I
also want matches that would overlap with others), and I was
not being paranoid about performance (if I needed to be, I
would carefully time each possible alternative), I think
I'd code it the simplest way I can think of...:

    # try an anchored match at every offset, so overlaps are found too
    for i in range(len(a)):
        if p.match(a[i:]):
            print(i)
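
A possible variant of that loop, as a sketch only: a compiled pattern's
match method also accepts a starting position, which avoids building a
new slice on every iteration.  The function name overlapping_starts is
made up for this example, and I have not timed it against the slicing
version:

```python
import re

def overlapping_starts(pattern, text):
    """Return every offset at which pattern matches, overlaps included.

    overlapping_starts is a hypothetical helper name for illustration.
    """
    p = re.compile(pattern)
    # p.match(text, i) tries an anchored match starting at offset i,
    # without copying the tail of the string the way text[i:] does.
    return [i for i in range(len(text)) if p.match(text, i)]

print(overlapping_starts('aa', 'aaaa'))  # [0, 1, 2]
```

One caveat: the two forms are not fully interchangeable.  With a start
position, '^' in the pattern still refers to the real beginning of the
string, whereas with a slice every offset looks like the beginning.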


Alex





