Making regex suck less

Carl Banks imbosol at vt.edu
Sun Sep 1 19:17:19 EDT 2002


Gerson Kurz wrote:
> So basically, couldn't one come up with
> a *human readable* syntax for re, and compile that instead? Python
> prides itself on its clean syntax, and human readability, and bang -
> import re, get perl-ish code instantly! 


You might want to look at the Plex package.  It defines patterns by
constructing data structures.  Something like this:

    symbol = Range("AZaz") + Rep(Range("09AZaz") | Str("_"))

However, three points:

First, building a pattern this way will certainly be slower than
writing a regular expression, since it takes many Python calls to
construct the structure. (Of course, once it has been compiled,
matching can be as fast as with regexps.)
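
To make the build-once, match-many point concrete, here is a rough
sketch in plain Python.  The helpers char_range, either, and
zero_or_more are made up for illustration (they are not Plex's API);
they only assemble a pattern string through ordinary function calls.
The construction cost is paid once, when the module-level pattern is
compiled with re.compile, and every later match reuses it:

```python
import re

# Hypothetical helpers, not part of Plex or the re module.

def char_range(*pairs):
    # Each pair is (first, last), e.g. ("A", "Z"); builds one character class.
    body = "".join(re.escape(a) + "-" + re.escape(b) for a, b in pairs)
    return "[" + body + "]"

def either(*patterns):
    # Alternation wrapped in a non-capturing group.
    return "(?:" + "|".join(patterns) + ")"

def zero_or_more(pattern):
    # Zero or more repetitions of the given pattern.
    return "(?:" + pattern + ")*"

letter = char_range(("A", "Z"), ("a", "z"))
word_char = either(char_range(("0", "9"), ("A", "Z"), ("a", "z")),
                   re.escape("_"))

# All the Python calls above run once; matching uses the compiled object.
SYMBOL = re.compile(letter + zero_or_more(word_char))

def is_symbol(text):
    # True only if the whole string is a letter followed by word characters.
    return SYMBOL.fullmatch(text) is not None
```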

Second, even if you use the re module, the result is still nowhere
near Perl-ish ugliness.  You still have Python's clean syntax outside
of the pattern.

Third, readability is not an unqualified good; conciseness is also
important, and the two are sometimes opposed.  Sacrificing a little
readability to get a lot of conciseness is usually a good thing.  I
think, as long as the regexp is not too obnoxious, it is probably
better to keep it concise.  (Of course, this depends a lot on what
you're doing and how flexible you need to be.)
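
One possible middle ground between the two is the re module's VERBOSE
flag, which lets a pattern carry whitespace and comments without
changing what it matches.  The sketch below writes the same
identifier-like pattern both ways; the pattern and the sample strings
are assumptions chosen for illustration:

```python
import re

# Concise form: a letter, then letters, digits, or underscores.
CONCISE = re.compile(r"[A-Za-z][0-9A-Za-z_]*")

# Same pattern with re.VERBOSE: whitespace outside character classes is
# ignored, and "#" starts a comment that runs to the end of the line.
VERBOSE = re.compile(r"""
    [A-Za-z]        # first character: a letter
    [0-9A-Za-z_]*   # then any run of letters, digits, or underscores
""", re.VERBOSE)

samples = ["foo_1", "x", "1foo", "_bar", ""]

# Pair up the two answers for each sample; they should always agree.
results = [(bool(CONCISE.fullmatch(s)), bool(VERBOSE.fullmatch(s)))
           for s in samples]
```

For a pattern this short the concise form is arguably easier to read;
the verbose form tends to pay off only as the pattern grows.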



-- 
CARL BANKS
http://www.aerojockey.com


