search (match) backward

Thu Oct 28 00:07:02 EDT 1999

Tim Peters wrote:
> > This raises the interesting question of whether there is an
> > algorithm for reversing an RE (i.e. find an RE which matches
> > the reverse of the strings that the original RE matches).
> > Can it even be done in general? My intuition says yes, but
> > I can't think of a proof off the top of my head.

> The proof
> is easy if you look at machines instead:  build a state machine that accepts
> the language defined by the RE.  Build a derived machine with the same set
> of states, but invert the transitions, make the new machine's accepting
> states the original machine's start states, and make the new machine's start
> states the original machine's accepting states (special case here:  if the
> language is empty, there may not be any accepting states in the original
> machine).

In theory, yes, but I don't believe that REs can necessarily be
converted to state machines, particularly not DFAs. For example:

>(?P=name)
>Matches whatever text was matched by the earlier group named name.

This implies that the value is stored somewhere, and DFAs cannot do
that. \number is similar. I am not sure about these either:

>(?=...)
>Matches if ... matches next, but doesn't consume any of the string. This is called a >lookahead assertion. For example, Isaac (?=Asimov) will match 'Isaac ' only if it's >followed by 'Asimov'.
>(?!...)
>Matches if ... doesn't match next. This is a negative lookahead assertion. For example, >Isaac (?!Asimov) will match 'Isaac ' only if it's not followed by 'Asimov'.

I don't know whether there are other types of state machines that
have the memory to implement the RE specification properly, and are 
still susceptible to Tim's argument. However I believe for the subset of
REs which do not use the constructs mentioned above, Tim's argument is
correct.

John