a regular expression question

Nicola Paolucci durdn at yahoo.it.oops!.invalid
Sat Mar 22 07:11:27 EST 2003


Hi Luke,

Luke wrote:
> <a href="foo1">1</a> abc <a href="foo2">2</a> def <a href="foo3">3</a>
> ghi <a href="foo4">4</a> jkl
> 
> If I use re2, it works, but obviously only gets the odds since there
> is no overlapping.  Is there a way to modify re1 to get the text, or
> is there a way to overlap with python's re engine somehow?
> >>> re1 = re.compile("<a .*?>([0-9]+?)</a>(.*?)")
> >>> matches = re.findall(re1,text)
> >>> matches
> 
> [('1', ''), ('2', ''), ('3', ''), ('4', '')]

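This worked for me. In re1 the trailing (.*?) is lazy and nothing
follows it, so it always matches the empty string; [^<]* instead
grabs everything up to the next tag: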
 >>> re1 = re.compile("<a[^>]+>([0-9]+?)</a>([^<]*)")
 >>> print re.findall(re1,text)
[('1', ' abc '), ('2', ' def '), ('3', ' ghi '), ('4', ' jkl')]
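
For anyone who wants to compare the two patterns side by side, here is a small self-contained sketch. It assumes the sample text from the quoted message sits on a single line (the line break in the quote looks like mail wrapping), and it uses Python 3 print() rather than the print statement above. The names lazy, greedy and lookahead are just illustrative. The third pattern is not from the original exchange; it is one possible way to keep the lazy group by adding a lookahead that peeks at the next tag without consuming it.

```python
import re

# Sample input reconstructed from the quoted message (assumed to be one line).
text = ('<a href="foo1">1</a> abc <a href="foo2">2</a> def '
        '<a href="foo3">3</a> ghi <a href="foo4">4</a> jkl')

# Original pattern: the final lazy group has nothing after it that would
# force it to grow, so it is satisfied by the empty string every time.
lazy = re.compile(r'<a .*?>([0-9]+?)</a>(.*?)')

# Revised pattern: [^<]* greedily takes every character up to the next '<'.
greedy = re.compile(r'<a[^>]+>([0-9]+?)</a>([^<]*)')

# Alternative (an assumption, not from the thread): keep the lazy group but
# anchor it with a lookahead for the next link or the end of the string.
lookahead = re.compile(r'<a .*?>([0-9]+?)</a>(.*?)(?=<a |$)')

lazy_matches = lazy.findall(text)
greedy_matches = greedy.findall(text)
lookahead_matches = lookahead.findall(text)

print(lazy_matches)
print(greedy_matches)
print(lookahead_matches)
```

The [^<]* version stops at any '<', so it would cut the text short if other tags appeared between the links; the lookahead version only stops at the next "<a " or at the end of the string.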

Best regards,
	Nicola Paolucci

-- 
#Remove .oops!.invalid to email or feed to Python:
'Tmljb2xhIFBhb2x1Y2NpIDxuaWNrQG5vdGp1c3RjYy5jb20+'.decode('base64')




