# matching and extracting...

Wed Jan 15 01:10:09 CET 2003

```
On Tue, 2003-01-14 at 13:24, Shagshag wrote:
> Here is my problem : say i have items ix, some are "to discover" (must
> be extracted "-x"), some are "needed" (must be present "-p") and some
> are indifferent ("-?"). for example, i have sequence like :
>
> s = "a0 a1 a2 a3 a4 a5 a6 a7 a8"
>
> i would like to check if my sequence is matching sequence like :
>
> m = "a0-p i1-x i2-x i3-? a4-p i5-x i6-? i7-x i8-?"
>
> and get result like :
>
> m is matching s, i1 is a1, i2 is a3, i5 is a5, i7 is a7 (in python a
> "true" and a dict)

Is "i2 is a3" a typo, or am I missing something?

If it is a typo, then perhaps this will work for you:

import re

class matchseq:
    def __init__(self, sequence):
        self.items = {}
        exp = ""
        for item in sequence.split():
            if exp:
                exp += " "
            k, v = item.split('-')
            if v == 'p':
                # "needed" item: must appear literally at this position
                exp += "(?P<%s>%s)" % (k, k)
                # self.items[k] = v
            else:
                # "-x" or "-?": letters followed by digits (raw string for \D, \d)
                exp += r"(?P<%s>\D*\d*)" % k
                if v == 'x':
                    # only "to discover" items are reported
                    self.items[k] = v

        self.re = re.compile(exp)

    def match(self, sequence):
        match = self.re.match(sequence)
        if not match:
            # no match: return None instead of the placeholder dict
            return None
        for k in self.items:
            self.items[k] = match.group(k)
        return self.items

m = matchseq("a0-p i1-x i2-x i3-? a4-p i5-x i6-? i7-x i8-?")

result = m.match("a0 a1 a2 a3 a4 a5 a6 a7 a8")

if result:
    print(result)

--
Cliff Wells, Software Engineer
Logiplex Corporation (www.logiplex.net)
(503) 978-6726 x308  (800) 735-0555 x308

```
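For comparison, the same check can be sketched without a regular expression by walking the pattern and the sequence token by token. This is only a sketch under a few assumptions: items are whitespace-separated, both strings must contain the same number of items, and "i2 is a3" in the question was indeed a typo. The function name `match_items` is hypothetical and does not come from the post.

```python
def match_items(pattern, sequence):
    """Return a dict of extracted items, or None if the sequence does not match."""
    specs = pattern.split()
    tokens = sequence.split()
    if len(specs) != len(tokens):
        return None  # assumption: lengths must agree
    found = {}
    for spec, token in zip(specs, tokens):
        name, flag = spec.split("-")
        if flag == "p":
            # "needed" item: the token must equal the item name
            if token != name:
                return None
        elif flag == "x":
            # "to discover" item: record whatever token sits here
            found[name] = token
        # flag "?" is indifferent: nothing to check or record
    return found


result = match_items("a0-p i1-x i2-x i3-? a4-p i5-x i6-? i7-x i8-?",
                     "a0 a1 a2 a3 a4 a5 a6 a7 a8")
print(result)
```

Because a pattern with no "-x" items yields an empty (falsy) dict on success, callers of this sketch would want to test `result is not None` rather than plain truthiness.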