[Python-Dev] Possible bug in re module?

Leif Walsh adlaiff6 at gmail.com
Tue May 20 21:34:47 CEST 2008


On Tue, 20 May 2008, Dmitry Vasiliev wrote:
> I've just found a strange re behavior:
>
> >>> import re
> >>> re.sub("(?:ab|b|a)", "+", "cbacbabcabc")
> 'c++c++c+c'
> >>> re.sub("(?:ab|b|a){2}", "+", "cbacbabcabc")
> 'c+c+c+c'
>
> In the last case the |-separated alternatives don't seem to be tried from
> left to right. Is it a bug, or is it just me?

What were you expecting, 'c+c+cabc'?  The re engine does try the
alternatives from left to right, but it backtracks whenever the rest of
the pattern fails, so it doesn't commit to the first alternative that
happens to match at each repetition.
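
A quick way to see this is to list what each match actually consumed.
The snippet below is only a sketch (the names text, pattern, spans and
result are mine, not from the original report). At offset 8, "ab" is
tried first, but nothing can follow it for the second repetition, so the
engine falls back to "a" followed by "b":

```python
import re

text = "cbacbabcabc"
pattern = re.compile(r"(?:ab|b|a){2}")

# Collect the span and the matched text of every non-overlapping match.
spans = [(m.span(), m.group()) for m in pattern.finditer(text)]
for span, matched in spans:
    print(span, repr(matched))
# (1, 3) 'ba'
# (4, 7) 'bab'
# (8, 10) 'ab'    <- reached by backtracking from "ab" to "a" + "b"

# Same substitution as in the report.
result = pattern.sub("+", text)
print(result)
# c+c+c+c
```

Without that backtracking step the last "ab" would be left alone, which
is where 'c+c+cabc' would come from.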

-- 
Cheers,
Leif

