Must be a bug in the re module [was: Why this result with the re module]
lanyjie at yahoo.com
Thu Nov 4 10:14:25 CET 2010
--- On Wed, 11/3/10, MRAB <python at mrabarnett.plus.com> wrote:
> The outer group is repeated, so it can match again, but the inner group
> can't match again because it captured all it could the previous time.
> Therefore the outer group matches and captures an empty string and the
> inner group remembers its last capture.
Thanks, I got it. Basically, in the last repetition of the outer
group, '(.a.)*' matched an empty string, but '(.a.)' did not match again.
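To make that concrete, here is a minimal sketch using only the standard re module. The variable names are mine, and the values in the final comment are what the explanation above predicts for the first match at position 0.

```python
import re

text = 'Mary has a lamb'

# First match of the nested pattern, anchored at position 0.
m = re.match(r'((.a.)*)*', text)

whole = m.group(0)   # text consumed by the whole pattern
outer = m.group(1)   # last capture of the outer group ((.a.)*)
inner = m.group(2)   # last capture of the inner group (.a.)

# The outer group's final repetition is empty, while the inner
# group still holds the capture from the earlier repetition.
print(repr(whole), repr(outer), repr(inner))   # 'Mar' '' 'Mar'
```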
Now what remains hard for me to figure out is the number
of matches: why are there 6 matches when '((.a.)*)*' is
matched against 'Mary has a lamb'?
I think this is probably caused by a limitation of the
match object: it does not say whether an empty string
was appended to the matched pattern or not. Hence some
of the empty strings are repeated/overlapped by re.findall().
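One way to check where the six matches come from is to look at the span of each match with re.finditer(), which yields the same matches that re.findall() collects. This is only a sketch for inspection; the names `pattern` and `spans` are mine.

```python
import re

text = 'Mary has a lamb'
pattern = re.compile(r'((.a.)*)*')

# Record (start, end) of every match that findall() would report.
spans = [m.span() for m in pattern.finditer(text)]

for start, end in spans:
    # An empty match has start == end.
    print(start, end, repr(text[start:end]))

print(len(spans))   # 6
```

On the CPython versions I know of, this prints two non-empty spans, (0, 3) and (5, 14), and four empty ones at positions 3, 4, 14 and 15, the last being the end of the string.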