[Python-Dev] Finding overlapping matches with re assertions: bug or feature?
Tim Peters
tim.peters at gmail.com
Fri Nov 15 22:44:56 CET 2013
[Tim]
>> Is that a feature? Or an accident? It's very surprising to find a
>> non-empty match inside an empty match (the outermost lookahead
>> assertion).
[Paul Moore]
> Personally, I would read (?=(R)) as finding an empty match at a point
> where R starts. There's no implication that R is in any sense "inside"
> the match.
>
> (?=\<\w\w\w\w\w\w)\w\w\w finds the first 3 characters of words that
> are 6 or more characters long. Once again, the lookahead extends
> beyond the extent of the main match.
>
> It's obscure and a little bizarre, but I'd say it's intended and a
> logical consequence of the definitions.
After sleeping on it, I woke up a lot less surprised. You'd think
that after decades of regexps, I'd be used to that by now ;-)
Thanks for the response! Your points sound valid to me, and I agree.
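
For anyone reading along, here is a minimal sketch of the behavior under discussion, using Python's re module. The helper name overlapping() is made up for illustration, and \b stands in for \< because Python's re has no \< word-start anchor. The match itself is empty; the captured group holds the text that starts at that position.

```python
import re

def overlapping(pattern, text):
    """Return (start, captured_text) pairs for pattern, overlaps included.

    Hypothetical helper: wraps the pattern as (?=(pattern)), so every match
    is empty and group 1 holds the text beginning at that position.
    """
    wrapped = re.compile(r"(?=(%s))" % pattern)
    return [(m.start(), m.group(1)) for m in wrapped.finditer(text)]

# A plain search consumes what it matches, so the second "aba" is skipped.
plain = re.findall(r"aba", "ababa")        # ['aba']

# The lookahead consumes nothing, so both overlapping occurrences are found.
overlaps = overlapping(r"aba", "ababa")    # [(0, 'aba'), (2, 'aba')]

# The lookahead checks for 6 word characters at a word start, while the
# main match takes only the first 3 of them.
prefixes = re.findall(r"(?=\b\w{6})\w{3}",
                      "short words and longer sentences")  # ['lon', 'sen']

print(plain, overlaps, prefixes)
```

In both cases the assertion looks at text beyond what the main match consumes, which is the point Paul makes above.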