adalke at mindspring.com
Sat Nov 15 00:20:29 CET 2003
> The possibility of the same feature occurred to me. However, I'm still
> not sure if this would solve the problem. How would the "pivot point"
> be recognised by such an augmented regex engine? i.e. how would it
> recognise the point at which it should stop capturing, reverse the
> sequence and start matching again?
Back-tracking. Suppose there was such a thing as
where the \~1 matches the reverse of the first group.
Now match that pattern against
The (.*) matches all the way up to "NOON" then tries and
failes to match both the .? and the \~1
It backtracks one so that
(.*) matches "NOO"
and the N remains. The .? matches the N but the \~1 does not
so it backtracks so that the .? matches nothing, but still the \~1
does not match the N.
It backtracks another so that
(.*) matches "NO"
and the "ON" remains. The .? first matches with the "O" leaving
the "N", but \~1 doesn't match that, so it backtracks so that
the .? matchs nothing, leaving "ON" for the \~1. This matches!
In other words, it goes through a lot of work to do the matching,
and would take O(N) backtracks to work.
But that's not quite correct. An implementation is free to
analyze the pattern and notice that it's being asked to search
for a palindrome then insert special case code for that pattern.
> Perhaps one would need to also implement a feature whereby the length
> of the entire string could be made available within expressions, so
> that the size of a capture group could be limited to the first half of
> the string? I.E. Something along the lines of
Yup. That would work as well. There are all sorts of
special things one could capture in a regex-like language.
Perl 'solves' it by allowing any match to call out to embedded
Perl code, which is free to tell the matcher to stop or
> One of these days I'll find the time to dig out my old course notes
> and books :#)
Friedl's regexp book (from ORA) is quite good.
dalke at dalkescientific.com
More information about the Python-list