Anomalous behaviour when compiling regular expressions?
Fredrik Lundh
fredrik at pythonware.com
Mon Mar 13 06:14:39 EST 2006
Harvey.Thomas at informa.com wrote:
> >>> import re
> >>> r = re.compile('(a|b*)+')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "c:\python24\lib\sre.py", line 180, in compile
>     return _compile(pattern, flags)
>   File "c:\python24\lib\sre.py", line 227, in _compile
>     raise error, v # invalid expression
> sre_constants.error: nothing to repeat
>
> but
>
> >>> r = re.compile('(a|b*c*)+')
> >>> r.match('def').group()
> ''
>
> Why is there a difference in behaviour between the two cases? Surely the
> two cases are equivalent to:
>
> >>> r = re.compile('(a|b)*')
> >>> r.match('def').group()
> ''
equivalent?
>>> re.match("(a|b*c*)", "abc").groups()
('a',)
>>> re.match("(a|b)*", "abc").groups()
('b',)
I have no time to sort out why your second example doesn't give the
same error (that might be a bug in the RE compiler), but no, a repeated
group with a min-length of 1 is not, in general, the same thing as a
repeated group with a min-length of zero.
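
To make the difference concrete, here is a minimal sketch that runs the two
patterns from the session above against both "abc" and "def". It is not
part of the original exchange; the names `single` and `starred` are just
illustrative, and it assumes a Python version with f-strings (3.6 or later).

```python
import re

# Patterns from the thread; the variable names are illustrative only.
single = re.compile(r"(a|b*c*)")   # one group whose second branch can match ''
starred = re.compile(r"(a|b)*")    # a non-empty group repeated zero or more times

for text in ("abc", "def"):
    for label, pattern in (("(a|b*c*)", single), ("(a|b)*", starred)):
        m = pattern.match(text)
        # group(0) is the whole match; groups() shows what the capture group kept
        print(f"{label!r} on {text!r}: match={m.group(0)!r} groups={m.groups()}")
```

On "def" both patterns report an empty overall match, but groups() still
differs: in the first pattern the group took part and matched the empty
string, while in the second the group never took part at all and comes
back as None.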
</F>