On Wed, 16 Feb 2022 at 01:54, Stephen J. Turnbull <stephenjturnbull@gmail.com> wrote:
The Zawinski quote is motivated by the perception that people seem to think that simplicity lies in minimizing the number of tools you need to learn. REXX and SNOBOL pattern matching are quite a bit more specialized to particular tools than regexps are. That is, all regexp implementations support the same basic language, which is sufficient for most tasks most programmers want regexps for.
The problem is that that's an illusion. If you restrict yourself to the subset that's supported by every regexp implementation, you'll quickly find tasks that you can't handle. If you restrict yourself to what you THINK is the universal subset, you end up with something that has a subtle difference when you use it somewhere else (I've had this problem with grep and Python, where a metacharacter in one was a plain character in the other - also frequently a problem between grep and sed, with the consequent "what do I need to escape?" problem).

But as the OP has found, regexps are a hammer that, for some nail-like problems, will whack an arbitrary number of times before hitting. So I guess the question isn't "why are regular expressions so popular" but "why are other things not ALSO popular".

I honestly think that scanf parsing, if implemented ad-hoc by different programming languages and extended to their needs, would end up no less different from each other than different regexp engines are - the most-used parts would also be the most-compatible, just like with regexps.

ChrisA
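To make the grep/Python divergence concrete, here's a small sketch of one such metacharacter mismatch: in grep's basic regular expressions (BRE), `+` is a literal character and `\+` is the "one or more" quantifier, while Python's `re` module does the opposite. The Python half is runnable; the grep commands in the comments are the conventional BRE equivalents.

```python
import re

text = "aaa"

# Python: '+' is a metacharacter meaning "one or more 'a'"
print(re.fullmatch(r"a+", text) is not None)   # True

# Python: '\+' is a literal plus sign, so this does NOT match "aaa"
print(re.fullmatch(r"a\+", text) is not None)  # False

# grep's BRE behaves the other way around:
#   echo aaa | grep 'a\+'   # matches  (BRE: \+ is the quantifier)
#   echo aaa | grep 'a+'    # no match (BRE: + is literal)
```

The same pattern string, pasted between the two tools, silently changes meaning - which is exactly the "what do I need to escape?" trap described above.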
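As a thought experiment for the scanf point: a language-specific scanf clone would almost certainly be built by translating format tokens into that language's own regex dialect, which is where the incompatible extensions would creep in. A minimal hypothetical sketch (only `%d` and `%s` supported; the function name and token table are invented for illustration):

```python
import re

# Hypothetical, minimal scanf-style parser built on top of re.
# Real scanf implementations support many more conversions.
SCANF_TOKENS = {
    "%d": r"([-+]?\d+)",  # signed integer
    "%s": r"(\S+)",       # whitespace-delimited string
}

def scanf(fmt, s):
    """Translate a scanf-like format into a regex and apply it."""
    pattern = re.escape(fmt)
    for token, regex in SCANF_TOKENS.items():
        pattern = pattern.replace(re.escape(token), regex)
    match = re.fullmatch(pattern, s)
    return match.groups() if match else None

print(scanf("%s is %d years old", "Alice is 30 years old"))
# ('Alice', '30')
```

Each language would extend the token table to its own needs (`%f` semantics, Unicode classes, named fields), and the dialects would drift apart just as regex engines did.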