On 29.01.17 12:18, Jakub Wilk wrote:
* Armin Rigo, 2017-01-28, 12:44:
The theoretical kind of regexp is about giving a "yes/no" answer, whereas the concrete "re" or "regexp" modules give a match object, which lets you ask for the subgroups' locations, for example. Strange as it may seem, I am not aware of a way to do that using the linear-time approach of the theory: if it answers "yes", then you have no way of knowing *where* the subgroups matched.
Another issue is that the theoretical engine has no notion of greedy/non-greedy matching.
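To make the two points above concrete, here is a small sketch using only the standard `re` module. The patterns and input strings are made up for illustration. It shows the extra information a match object carries beyond "yes/no", and how greedy and non-greedy quantifiers pick different submatches on the same input.

```python
import re

# A match object answers more than "did it match?": it records where
# each subgroup matched.
m = re.match(r"(a+)(b*)", "aaabbc")
is_match = m is not None               # the theoretical "yes/no" answer
spans = [m.span(i) for i in (1, 2)]    # (start, end) of each subgroup

# Greedy vs. non-greedy: same input, different submatch.
greedy = re.match(r"<(.+)>", "<a><b>").group(1)    # takes as much as possible
lazy = re.match(r"<(.+?)>", "<a><b>").group(1)     # takes as little as possible

print(is_match, spans, greedy, lazy)
```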
RE2 has linear execution time, and it supports both capture groups and greedy/non-greedy matching.
The implementation is explained in this article: https://swtch.com/~rsc/regexp/regexp3.html
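For readers who want the flavour of how submatches and greediness can coexist with linear time, below is a toy simulation in the style of Pike's VM: all NFA threads advance in lockstep, each carrying its own capture slots, and thread order encodes priority. This is only a sketch, not RE2's actual code. The instruction names, the hand-written programs for `(a+)(b*)` and `(a+?)(b*)`, and the function `pike_match` are assumptions made up for this example.

```python
def pike_match(prog, text, nslots):
    """Anchored match (like re.match); returns capture slots or None."""

    def add_thread(threads, seen, pc, pos, caps):
        # Follow jmp/split/save eagerly; each pc is added once per step,
        # which bounds the work per input character by len(prog).
        if pc in seen:
            return
        seen.add(pc)
        op = prog[pc]
        if op[0] == "jmp":
            add_thread(threads, seen, op[1], pos, caps)
        elif op[0] == "split":
            # The first branch has higher priority than the second.
            add_thread(threads, seen, op[1], pos, caps)
            add_thread(threads, seen, op[2], pos, caps)
        elif op[0] == "save":
            caps = caps[:op[1]] + (pos,) + caps[op[1] + 1:]
            add_thread(threads, seen, pc + 1, pos, caps)
        else:  # "char" or "match": wait for the main loop
            threads.append((pc, caps))

    current = []
    add_thread(current, set(), 0, 0, (None,) * nslots)
    matched = None
    for pos in range(len(text) + 1):
        following, seen = [], set()
        for pc, caps in current:
            op = prog[pc]
            if op[0] == "match":
                matched = caps
                break  # lower-priority threads are cut off
            if pos < len(text) and text[pos] == op[1]:
                add_thread(following, seen, pc + 1, pos + 1, caps)
        current = following
        if not current:
            break
    return matched


# Hand-compiled program for (a+)(b*); slots 0-1 are group 1, 2-3 group 2.
GREEDY = [
    ("save", 0), ("char", "a"), ("split", 1, 3), ("save", 1),
    ("save", 2), ("split", 6, 8), ("char", "b"), ("jmp", 5),
    ("save", 3), ("match",),
]
# (a+?)(b*): identical except that the loop prefers to exit.
LAZY = GREEDY[:2] + [("split", 3, 1)] + GREEDY[3:]

print(pike_match(GREEDY, "aaabbc", 4))
print(pike_match(LAZY, "aaabb", 4))
```

The only difference between the greedy and non-greedy programs is the order of the two `split` targets, so greediness is just thread priority. No thread ever backtracks over the input.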
Not all features of Python regular expressions can be implemented with linear complexity. It is possible, though, to compile the subset of regular expressions that avoids those features to an implementation with linear complexity. Patches are welcome.
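As a rough illustration of what such a split could look like, the sketch below first shows a backreference matching a non-regular language (the same run of `a` on both sides of `b`). It then gives a deliberately conservative text check for constructs a linear-time engine such as RE2 does not support: backreferences, lookaround and conditional groups. The helper name `uses_unsupported_feature` and the heuristic itself are assumptions for this example. A real patch would inspect the parsed pattern rather than its source text.

```python
import re

# A backreference matches a^n b a^n, which no finite automaton can recognise.
same_runs = re.fullmatch(r"(a+)b\1", "aaabaaa") is not None
diff_runs = re.fullmatch(r"(a+)b\1", "aaabaa") is not None

# Conservative heuristic: numeric or named backreferences, lookahead and
# lookbehind, conditional groups. It may flag harmless patterns (for
# example an escaped backslash followed by a digit), which only means
# falling back to the existing backtracking engine.
_UNSUPPORTED = re.compile(r"\\[1-9]|\(\?P=|\(\?<?[=!]|\(\?\(")


def uses_unsupported_feature(pattern):
    """Return True if the pattern should stay on the backtracking engine."""
    return _UNSUPPORTED.search(pattern) is not None


for p in (r"(a+)(b*)", r"(?P<x>a+)", r"(a+)b\1", r"foo(?=bar)"):
    print(p, uses_unsupported_feature(p))
```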