reusing parts of a string in RE matches?
peace.is.our.profession at gmx.de
Thu May 11 12:18:38 CEST 2006
Hi mpeters42 & John
> With a more complex pattern (like 'a.a': match any character between
> two 'a' characters) this will get the length, but not what character is
> between the a's.
Let's take this as a starting point for another example
that comes to mind. You have a string of characters
interspersed with numbers: tx = 'a1a2a3A4a35a6b7b8c9c'
Now you try to find all _numbers_ that have the same
character on both sides (like a<-2->a), even where the
matches overlap and so don't fall into neat 3/3/3... groups.
This can easily be done in P(ython|erl) etc. with a
positive lookahead (the very same pattern works in both):
import re
tx = 'a1a2a3A4a35a6b7b8c9c'
rg = r'(\w)(?=(.\1))'
print(re.findall(rg, tx))
$_ = 'a1a2a3A4a35a6b7b8c9c';
(should find 1, 2, 7, 9 only, each reported together
with its flanking character; the Python regex is
stored in a variable to avoid clunky lines ;-)
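
A minimal sketch of what the lookahead buys you here. The first call is the pattern from above; the other two patterns are variations added for illustration only (not from the original post): one moves the closing parenthesis so that group 2 holds just the middle character, the other drops the lookahead to show that consuming matches miss the overlapping 'a2a'.

```python
import re

tx = 'a1a2a3A4a35a6b7b8c9c'

# Pattern from the post: group 1 is the flanking character,
# group 2 is the looked-ahead pair (middle character + repeated flank).
pairs = re.findall(r'(\w)(?=(.\1))', tx)

# Illustrative variant: capture only the middle character in group 2.
middles = re.findall(r'(\w)(?=(.)\1)', tx)
numbers = [mid for _flank, mid in middles]

# Illustrative variant without lookahead: each match consumes three
# characters, so the overlapping 'a2a' is skipped.
consumed = re.findall(r'(\w)(.)\1', tx)

print(pairs)
print(numbers)
print(consumed)
```

Because the lookahead itself consumes nothing, the scan only advances by one character per match, which is why the overlapping triples are all seen.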
BTW, the Py regex engine seems to
be very close to the Perl one:
naive (!) matching of a pattern
with 14 starred o's (each o* matching
any run of o's) against a string of
16 o's takes almost exactly the same
time here in Py (2.4.3) and Pe (5.8.7):
import re
tl = 'oooooooooooooooo'
rg = r'o*o*o*o*o*o*o*o*o*o*o*o*o*o*[\W]'
print(re.search(rg, tl))
Py: 101 sec
Pe: 109 sec
(which would find no match because there's
no \W-like character at the end of the string)
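
A rough sketch of why the naive pattern is so slow: when the final [\W] cannot match, a backtracking engine retries every way of splitting the o's among the starred terms before giving up. The helper name naive_pattern and the small term counts are assumptions chosen so that this finishes quickly; the timings depend entirely on the machine and the Python version, so none are quoted here.

```python
import re
import time

def naive_pattern(terms):
    # Hypothetical helper: 'o*' repeated `terms` times, then [\W].
    return 'o*' * terms + r'[\W]'

tl = 'o' * 16

# Time a few small term counts; the cost grows steeply with each
# additional 'o*' because every split of the o's gets retried.
for terms in (2, 4, 6):
    rg = naive_pattern(terms)
    start = time.perf_counter()
    result = re.search(rg, tl)
    elapsed = time.perf_counter() - start
    print(terms, result, round(elapsed, 4))

# A single starred term accepts exactly the same strings and
# fails quickly on the all-o input.
simple = re.search(r'o*[\W]', tl)
print(simple)
```

With terms=14 the helper builds exactly the pattern used above, so the loop can be extended if you have the patience.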