regular expression questions

Andrew M. Kuchling akuchlin at mems-exchange.org
Thu Mar 30 11:33:45 EST 2000


"Darrell" <darrell at dorb.com> writes:
> >>> s="""<a></a><a></a>"""
> >>> re.search("(<(?P<aTag>a)>.*</(?P=aTag)>)",s).groups()
> ('<a></a><a></a>', 'a')
> >>>
> Create a new version of (?P=aTag) like (?Pbalance=aTag) which returns
> ('<a></a>', 'a')

Note that you can already do something like this by using a non-greedy
.*? instead of .*, and then using the findall() method:

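import re

# Non-greedy .*? stops at the first closing tag that completes the match.
p = re.compile(r"(<(?P<aTag>a)>(.*?)</(?P=aTag)>)")

s = """<a></a><a></a><b> foo </b>"""
m = p.match(s)
print(m and m.groups())   # groups of the first match, or None
print(p.findall(s))       # one tuple of groups per non-overlapping match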

findall() returns [('<a></a>', 'a', ''), ('<a></a>', 'a', '')]; the
<b> element is skipped because the pattern hard-codes the tag name a.
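
To make the greedy/non-greedy difference concrete, here is a small
sketch that runs both forms of the pattern on Darrell's original
string. The (?P<tag>\w+) variant at the end is my own illustrative
assumption, not part of the original question; it only lets the
backreference pick up any simple tag name instead of a fixed a.

```python
import re

s = "<a></a><a></a>"

# Greedy: .* runs on to the last </a>, so one match spans both elements.
greedy = re.search(r"(<(?P<aTag>a)>.*</(?P=aTag)>)", s)

# Non-greedy: .*? stops at the first </a> that lets the match succeed.
lazy = re.search(r"(<(?P<aTag>a)>.*?</(?P=aTag)>)", s)

print(greedy.groups())   # ('<a></a><a></a>', 'a')
print(lazy.groups())     # ('<a></a>', 'a')

# Hypothetical variation: capture the tag name with \w+ and reuse it
# through the named backreference, so <b> ... </b> is picked up as well.
any_tag = re.findall(r"<(?P<tag>\w+)>(.*?)</(?P=tag)>",
                     "<a></a><a></a><b> foo </b>")
print(any_tag)           # [('a', ''), ('a', ''), ('b', ' foo ')]
```

The non-greedy search already gives the ('<a></a>', 'a') result asked
for in the quoted message, with no new syntax.
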
But how would you want to receive the data when there's potential
recursion, for example when the input is
<a>data <a>data</a> more data</a>?
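
One possible answer, purely as a sketch: use the regex only to find
the tags and keep a stack to build a nested structure. The function
name parse_nested and the (tag, children) tuple layout are assumptions
of mine for illustration, not an existing re feature, and the tag
pattern handles only bare tags like <a> with no attributes.

```python
import re

# Matches <name> or </name>; attributes are not handled in this sketch.
TAG = re.compile(r"<(/?)(\w+)>")

def parse_nested(text):
    """Return a list of strings and (tag, children) tuples."""
    root = []
    stack = [(None, root)]          # (tag name, children) per open element
    pos = 0
    for m in TAG.finditer(text):
        if m.start() > pos:
            # Plain text between tags belongs to the innermost open element.
            stack[-1][1].append(text[pos:m.start()])
        closing, name = m.group(1), m.group(2)
        if not closing:
            children = []
            stack[-1][1].append((name, children))
            stack.append((name, children))
        else:
            if stack[-1][0] != name:
                raise ValueError("unexpected </%s>" % name)
            stack.pop()
        pos = m.end()
    if pos < len(text):
        stack[-1][1].append(text[pos:])
    if len(stack) != 1:
        raise ValueError("unclosed <%s>" % stack[-1][0])
    return root

tree = parse_nested("<a>data <a>data</a> more data</a>")
print(tree)   # [('a', ['data ', ('a', ['data']), ' more data'])]
```

Whether a nested list like this, or a flat list of matches with depth
information, is the more useful return value is exactly the open
question.
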

--	
A.M. Kuchling			http://starship.python.net/crew/amk/
The pretence that numbers are not the humble creation of man, but are the
exacting language of the Universe and therefore possess the secret of all
things, is comforting, terrifying and mesmeric.
  -- Peter Greenaway, _Fear of Drowning By Numbers_ (1988)




