Mastering Regular Expressions 2nd Ed.

Roy Smith roy at panix.com
Wed Jul 24 08:21:42 EDT 2002


Peter Hansen <peter at engcorp.com> wrote:
>> Regular expressions work much better if you use them for lexical
>> analysis rather than for parsing.
> 
> Would you please expand on that for those of us who are not computer
> scientists and/or who do not understand the implications of your
> statement?

In a nutshell, lexical analysis is figuring out how to break a file up 
into words and symbols (generically called "tokens"), and parsing is 
figuring out what those words mean.  So, if I were to write:

"Quick,defenestrate him!" :-)

lexical analysis would figure out that I've got the following tokens:

1) a quotation mark
2) the word "Quick"
3) a comma
4) the word "defenestrate"
5) the word "him"
6) an exclamation mark
7) a quotation mark
8) a smiley

At this point, I still have no idea what that line means, but at least 
I've broken it up into tokens that I can start to organize into 
higher-level constructs like sentences, and then work out what those 
sentences mean.  That's parsing.
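
To make the lexical-analysis half concrete, here is a minimal sketch of a 
regex-based tokenizer for the example line.  The token names (QUOTE, WORD, 
and so on) and the tokenize() helper are my own inventions for 
illustration, not anything standard, and the patterns cover only what 
shows up in that one line.

```python
import re

# Hypothetical token names, chosen only for this example.
# Order matters: alternatives are tried left to right, so the smiley
# is listed first to be matched as one token.
TOKEN_SPEC = [
    ("SMILEY", r":-\)"),
    ("WORD", r"[A-Za-z]+"),
    ("QUOTE", r'"'),
    ("COMMA", r","),
    ("EXCLAMATION", r"!"),
    ("SPACE", r"\s+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Return a list of (kind, text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(
                f"unexpected character {text[pos]!r} at position {pos}"
            )
        # lastgroup is the name of the alternative that matched.
        if match.lastgroup != "SPACE":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


tokens = tokenize('"Quick,defenestrate him!" :-)')
for kind, value in tokens:
    print(kind, value)
```

Note what the regex does not do: it has no idea that "Quick" is an 
interjection or that "him" is the object of the verb.  Working that out 
from the token list is the parser's job, and that is the part regular 
expressions are poorly suited to.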


