[Python-Dev] Finding overlapping matches with re assertions: bug or feature?

Tim Peters tim.peters at gmail.com
Fri Nov 15 22:44:56 CET 2013


[Tim]
>> Is that a feature?  Or an accident?  It's very surprising to find a
>> non-empty match inside an empty match (the outermost lookahead
>> assertion).

[Paul Moore]
> Personally, I would read "(?=(R))" as finding an empty match at a point
> where R starts. There's no implication that R is in any sense "inside"
> the match.
>
> (?=\<\w\w\w\w\w\w)\w\w\w finds the first 3 characters of words that
> are 6 or more characters long. Once again, the lookahead extends
> beyond the extent of the main match.
>
> It's obscure and a little bizarre, but I'd say it's intended and a
> logical consequence of the definitions.

After sleeping on it, I woke up a lot less surprised.  You'd think
that after decades of regexps, I'd be used to that by now ;-)

Thanks for the response!  Your points sound valid to me, and I agree.
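
For anyone following along, here is a small sketch of the two behaviours
discussed above, using Python's re module. The sample strings and variable
names are made up for illustration. Python's re has no \< word-start
anchor, so the sketch assumes \b is a close enough stand-in for this input.

```python
import re

# Illustrative input only: "aba" occurs twice, overlapping at index 2.
text = "ababa"

# A plain pattern consumes what it matches, so the overlap is skipped.
plain = re.findall(r"aba", text)

# Wrapping the pattern in a lookahead makes the overall match empty,
# while group 1 still captures the text found ahead of that position.
overlapping = re.findall(r"(?=(aba))", text)

# The overall match is zero-width; the captured group extends beyond it.
m = re.search(r"(?=(aba))", text)
whole_span = m.span()    # extent of the (empty) overall match
group_span = m.span(1)   # extent of the capture inside the lookahead

# Paul's example, with \b assumed as a substitute for \< and \w{6}
# written in place of six repeated \w: match the first 3 characters
# of words that are at least 6 characters long.
words = "python regexps are fun"
prefixes = re.findall(r"(?=\b\w{6})\w{3}", words)

print(plain, overlapping, whole_span, group_span, prefixes)
```

In both patterns the lookahead inspects text past the end of the overall
match, which is what makes the capture look like it sits "inside" an empty
match.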

