[Baypiggies] More dramatic material?
K. Richard Pixley
rich at noir.com
Wed Feb 24 04:13:24 CET 2010
Regular expressions aren't particularly fast. That's not the reason for
using them. In fact, there was a sort of competition going back in the
80's with various types of grep's. Look up boyer-moore, (bgrep), and
the difference between egrep and fgrep. The bottom line is that for
certain kinds of expressions, different algorithms are more efficient.
But the more efficient algorithms trade robustness and problem domain
for speed. That is, they work faster, but only on a subset of cases.
On the others, they either don't work at all or silently produce
incorrect results.
Similarly, LALR(0) languages are easier and faster to parse than
LALR(1). You don't really need a parser generator like yacc, bison, or
byacc if your language is LALR(0).
There's a related, but somewhat more subdued competition going on right
now about binary diff programs, which have many related properties. So
do compression programs and I think everyone's probably aware of that
long standing competition. And I'll only mention message digesting,
(aka, encryption), which also has similarities.
Anything done in C is faster than anything done in python, (assuming
competent implementations). The exceptions being things where "pure"
python is implemented in C anyway.
The win to regular expressions is not speed, but rather that they can
search for and find things that other types of searching can't, or can't
do easily. For "dramatic" examples, try searching for palindromes, or
repeated phrases like 'moinmoin' or 'wikiwiki', or sequences of double
repeating characters like vaccuum.
--rich
More information about the Baypiggies
mailing list