[Python-ideas] Re: Regex timeouts

Feb. 15, 2022

Tim Peters writes:
...
> Chris didn't say this, but I will: I'm amazed that things much
> _simpler_ than regexps, like his scanf and REXX PARSE
> examples, haven't spread more.

scanf just isn't powerful enough.  For example, consider parsing user
input dates: scanf("%d/%d/%d", &year, &month, &day).  This is nice and
simple, but handling "2022-02-15" as well requires a bit of thinking
and several extra statements in C.  In Python, I guess it would
probably look something like

    year, sep1, month, sep2, day = scanf("%d%c%d%c%d")
    if not ('/' == sep1 == sep2 or '-' == sep1 == sep2):
        raise DateFormatUnacceptableError
    # range checks for month and day go here

which isn't too bad, though.  But

    # s holds the user's input string (a placeholder name)
    year, sep1, month, sep2, day = re.match(r"(\d+)([-/])(\d+)([-/])(\d+)", s).groups()
    if not sep1 == sep2:
        raise DateFormatUnacceptableError
    # range checks for month and day go here

expresses the intent a lot more clearly, I think.  Sure, it's easy to
write uninterpretable regexps, but up to that point regexps are very
expressive.  And that example can be reduced to one line (plus the
comment) at the expense of a less symmetric, slightly less readable
expression like r"(\d+)([-/])(\d+)\2(\d+)".  Some folks might like
that one better.
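
For anyone who wants to try the backreference variant, here is a minimal
runnable sketch.  The names parse_date and DATE_RE are made up for
illustration, DateFormatUnacceptableError is defined here only because the
snippets above leave it undefined, and the range checks are deliberately
crude (they do not know about month lengths).

```python
import re


class DateFormatUnacceptableError(ValueError):
    """Hypothetical exception; the snippets above use the name without defining it."""


# \2 must match whatever the second group (the first separator) matched,
# so "2022-02/15" is rejected without a separate comparison.
DATE_RE = re.compile(r"(\d+)([-/])(\d+)\2(\d+)")


def parse_date(text):
    """Return (year, month, day) as ints, or raise DateFormatUnacceptableError."""
    m = DATE_RE.fullmatch(text)
    if m is None:
        raise DateFormatUnacceptableError(text)
    year, _sep, month, day = m.groups()
    year, month, day = int(year), int(month), int(day)
    # Crude range checks; a real parser would validate day against month.
    if not (1 <= month <= 12 and 1 <= day <= 31):
        raise DateFormatUnacceptableError(text)
    return year, month, day
```

With this, parse_date("2022-02-15") and parse_date("2022/02/15") both give
(2022, 2, 15), while mixed separators raise the exception.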
...
> Simple solutions to simple problems are very appealing to me.

The Zawinski quote is motivated by the perception that people seem to
think that simplicity lies in minimizing the number of tools you need
to learn.  REXX and SNOBOL pattern matching are quite a bit more
specialized to particular tools than regexps.  That is, all regexp
implementations support the same basic language which is sufficient
for most tasks most programmers want regexps for.

I think you'd need to implement such a facility in a very popular
scripting language such as sh, Perl, or Python for it to have the
success of regexps.
...
> Although, to be fair, I get a kick too out of massive overkill ;l-)

Don't we all, though?

Steve

Stephen J. Turnbull