Must be a bug in the re module [was: Why this result with the re module]
John Bond
lists at asd-group.com
Tue Nov 2 23:55:44 EDT 2010
> Could you please reconsider how would you
> work with this new one and see if my steps
> are correct? If you agree with my 7-step
> execution for the new regex, then:
>
> We finally found a real bug for re.findall:
>
>>>> re.findall('((.a.)*)*', 'Mary has a lamb')
> [('', 'Mar'), ('', ''), ('', ''), ('', 'lam'), ('', ''), ('', '')]
>
>
> Cheers,
>
> Yingjie
>
>
>
Nope, I'm afraid it is a lack of understanding again.
The outer capturing group that you've added wraps everything matched by
the inner one (which gives the six matches that you now accept). Because
a repeated group only returns the last of its repetitions, it returns
one thing: an empty string (that being the last thing that the inner
(.a.)* matched). Findall is simply returning that empty string in each
of the six tuples it needs to return because of the inner one.
You just need to accept that findall (like all of re) works fine, and if
it doesn't seem to do what you expect, it's because the expectation is
wrong.
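To make the "last repetition only" rule concrete, here is a minimal sketch using only the standard re module. The variable names are just for illustration, and the simpler \w patterns are stand-ins of mine, not from the thread. The last part runs the regex from the quoted message through re.match and prints the groups rather than asserting them, since findall's handling of empty matches may differ between Python versions.

```python
import re

# A repeated capturing group keeps only its final repetition.
m = re.match(r'(\w)*', 'abc')
last_char = m.group(1)
print(last_char)  # 'c', not 'a' or 'abc'

# Wrapping the repeat in an outer group captures the whole run,
# while the inner group still reports only its last repetition.
pairs = re.findall(r'((\w)+)', 'abc def')
print(pairs)  # [('abc', 'c'), ('def', 'f')]

# The regex from the quoted message. The outer group is itself
# repeated, so group 1 holds only the outer group's last repetition.
m2 = re.match('((.a.)*)*', 'Mary has a lamb')
print(repr(m2.group(0)))  # 'Mar' (the overall match)
print(m2.groups())        # compare with the first tuple quoted above
```

If the aim is to capture the whole run, the star on the outer group is the thing to drop: ((.a.)*) keeps everything the inner repeat consumed, whereas ((.a.)*)* lets a final empty pass overwrite group 1.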
Cheers, JB