[Tutor] Python RE uses DFA or NFA for string check?

Tue Jan 10 18:12:18 CET 2006

Thanks Mr.Tim. That was helpful :)

On 1/10/06, Tim Peters <tim.peters at gmail.com> wrote:
>
> [Intercodes]
> > This question is just out of curiosity. I am working with this dragon
> book.
> > From what I have learnt so far, RE uses either NFA or DFA to check
> whether
> > the string is accepted or not. (Correct?)
>
> In the world of "computer science" regular expressions, yes.  But the
> things _called_ "regular expressions" in programming languages are
> generally richer than those.  For example, almost all regexp
> implementations support backreferences, and backreferences allow
> recognizing languages that computer-science regexps cannot.  For
> example,
>
>     ^(a*)b+\1$
>
> recognizes strings that begin and end with the same number of a's,
> separated by one or more b's.  It's "the same number" part that's
> beyond a pure regexp's abilities.
>
> > So what does the Python's RE module use to check the correctness of the
> > string, NFA or DFA?
>
> Neither, but it's much closer to NFA than to DFA.  Most regexp
> implementations in most languages supporting such a thing are
> implemented via backtracking search.  Jeffrey Friedl's "Mastering
> Regular Expressions" is more useful than the dragon book if you want
> insight into how most programming-language regexp implementations
> actually work:
>
>     http://www.oreilly.com/catalog/regex/
>
> To increase confusion ;-), Friedl calls backtracking search "NFA" in that
> book.
>

--
Intercodes
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/tutor/attachments/20060110/195b118c/attachment.html