[Python-ideas] Complicate str methods
Steven D'Aprano
steve at pearwood.info
Sat Feb 3 20:09:19 EST 2018
On Sun, Feb 04, 2018 at 10:54:53AM +1100, Chris Angelico wrote:
> Picking up this one as an example, but this applies to all of them:
> the transformation you're giving here is dangerously flawed. If there
> are any regex special characters in the strings, this will either bomb
> with an exception, or silently do the wrong thing. The correct way to
> do it is (at least, I think it is):
>
> re.match("|".join(map(re.escape, strings)), testme)
>
> With that gotcha lurking in the wings, I think this should not be
> cavalierly dismissed with "just 'import re' and be done with it".
Indeed.
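To make the gotcha concrete, here is a minimal sketch. The sample strings are made up for illustration and are not from the original proposal; only the `re.escape` idiom comes from Chris's message.

```python
import re

# Hypothetical search terms that happen to contain regex metacharacters.
strings = ["a.b", "c+d"]

unescaped = "|".join(strings)
escaped = "|".join(map(re.escape, strings))

# Unescaped, "." matches any character, so "axb" is silently accepted.
false_positive = re.match(unescaped, "axb") is not None

# Unescaped, "c+d" means "one or more c, then d", so the literal text is missed.
missed_literal = re.match(unescaped, "c+d") is None

# Escaped, only the literal strings match.
rejects_axb = re.match(escaped, "axb") is None
accepts_literal = re.match(escaped, "c+d") is not None

# Some strings are not valid patterns at all and raise instead.
try:
    re.match("|".join(["spam("]), "spam(")
    raised = False
except re.error:
    raised = True

# The existing str method takes a tuple of prefixes and needs no escaping.
plain_method = "c+d".startswith(tuple(strings))
```

Each of the flags above ends up true, which covers both failure modes Chris describes: the silent wrong answer and the exception.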
This is not Perl and "just use a regex" is not a close fit to the
culture of Python.
Regexes are a completely separate mini-language, and one which is the
opposite of Pythonic. Instead of "executable pseudo-code", regexes are
excessively terse and cryptic once you get past the simple examples.
Doing anything complicated using regexes is painful.
Even Larry Wall has criticised regex syntax for its poor defaults and
uneven information density. (Rarely used symbols get a single character,
while frequently needed symbols are coded as multiple characters, so
Perlish syntax has the worst of both worlds: too terse for casual users,
too verbose for experts, hard to maintain for everyone.)
Any serious programmer should have at least a passing familiarity with
regexes. They are ubiquitous, and useful, especially as a common
mini-language for user-specified searching.
But I consider regexes to be the fall-back for when Python doesn't
support the kind of string matching operation I need, not the primary
solution. I would never write:
    re.match('start', text)
    re.search('spam', text)

when

    text.startswith('start')
    text.find('spam')

will do. I think this proposal to add more power to the string methods
is worth some serious consideration.
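As a rough side-by-side of the pairs above, the following sketch checks that the string methods give the same answers as the regex calls. The sample text is invented for illustration, and the remark about `in` is an aside rather than part of the proposal.

```python
import re

text = "start of the spam report"  # hypothetical sample text

# Prefix test: match() returns a match object or None, startswith() a bool.
prefix_agrees = (re.match("start", text) is not None) == text.startswith("start")

# Substring search: find() returns the index of the first hit, like .start().
index_agrees = re.search("spam", text).start() == text.find("spam")

# find() signals "not found" with -1, where re.search() returns None.
missing_index = text.find("eggs")

# For a plain yes/no answer, the "in" operator avoids the -1 sentinel.
contains_spam = "spam" in text

# startswith() and endswith() already accept a tuple of alternatives.
any_prefix = text.startswith(("begin", "start"))
```

Note that `find()` returns 0 for a hit at the start of the string and -1 for a miss, so its result should be compared against -1 rather than used directly as a truth value.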
--
Steve