Must be a bug in the re module [was: Why this result with the re module]

Yingjie Lan lanyjie at yahoo.com
Thu Nov 4 05:14:25 EDT 2010


--- On Wed, 11/3/10, MRAB <python at mrabarnett.plus.com> wrote:
> [snip]
> The outer group is repeated, so it can match again, but the inner
> group can't match again because it captured all it could the
> previous time.
> 
> Therefore the outer group matches and captures an empty string and
> the inner group remembers its last capture.

Thanks, I got it. Basically, in the last iteration of the outer
group, '(.a.)*' matched an empty string, but '(.a.)' did not
match again.

Now what remains hard for me to figure out is the number
of matches: why are there 6 matches when '((.a.)*)*' is
matched against 'Mary has a lamb'?
I think this is probably caused by a limitation of the
match object: it does not say anything about whether
an empty string was appended to the matched pattern
or not. Hence some of the empty strings are
repeated/overlapped by re.findall().
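
One way to look into the count is to list the span of each match
rather than only the captured groups. The following is a minimal
sketch, assuming the standard CPython re module; the variable names
(text, spans, nonempty, empty) are only illustrative. It uses
re.finditer() in place of re.findall() so that an empty match shows
up as a span whose start and end are equal.

```python
import re

text = 'Mary has a lamb'
pattern = re.compile(r'((.a.)*)*')

# finditer() yields match objects, so the position of every match is visible.
spans = [m.span() for m in pattern.finditer(text)]

# A span such as (3, 3) has equal start and end: the match there is empty.
nonempty = [s for s in spans if s[0] != s[1]]
empty = [s for s in spans if s[0] == s[1]]

for start, end in spans:
    print((start, end), repr(text[start:end]))

print(len(spans), 'matches:', len(nonempty), 'non-empty,', len(empty), 'empty')
```

If this runs as expected, the six matches should split into two
non-empty ones ('Mar' and 'has a lam') and four empty ones, each
empty match sitting at a position where '.a.' cannot match.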

Regards,

Yingjie


      


