PyWart: Python regular expression syntax is not intuitive.

Rick Johnson rantingrickjohnson at gmail.com
Wed Jan 25 16:19:29 EST 2012


On Jan 25, 2:17 pm, Ian Kelly <ian.g.ke... at gmail.com> wrote:
> On Wed, Jan 25, 2012 at 10:16 AM, Rick Johnson

> Did you read the very first sentence of the re module documentation?
> "This module provides regular expression matching operations *similar
> to those found in Perl*" (my emphasis).  The goal here is
> compatibility with existing RE syntaxes, not readability.  Perl uses
> the (?...) syntax, so the re module does too.

@Duncan and Ian:
Did you not read the title of my post? :o) "Python regular expression
syntax is not intuitive." While I understand WHERE the syntax
originates from, that fact does not solve the problem. The syntax is
not intuitive, and Python should ALWAYS be intuitive! We should always
borrow ideas from anyone (even our enemies) when those ideas support
our ideology. Perl style regexes are not Pythonic. They violate our
philosophy in too many places.

> > (?iLmsux) # Passing Flags Internally
> > This is ridiculous. re's are cryptic enough without inviting TIMTOWTDI
> > over to play. Passing flags this way does nothing BUT harm
> > readability. Please people, pass your flags as an argument to the
> > appropriate re.method() and NOT as another cryptic syntax.
>
> 1) Not all regular expressions are hard-coded.  Some applications even
> allow users to supply regular expressions as data.  Permitting flags
> in the regular expression allows the user to specify or override the
> defaults set by the application.
>
> 2) Permitting flags in the regular expression allows different
> combinations of flags to be in effect for different parts of complex
> regular expressions.  You can't do that just by passing in the flags
> as an argument.

This is a valid argument, and I totally agree with you that we should
not remove the ability to pass flags internally. However, my main
point still stands strong (with a slight tweak). """Please people,
pass your flags as an argument to the appropriate re.method() and NOT
as another cryptic syntax, UNLESS YOU HAVE NO OTHER CHOICE!""" Thanks
for pointing this out.
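
To make the distinction concrete, here is a minimal sketch of both
styles using only the standard re module. The sample text and the
helper name user_pattern_matches are made up for illustration; they
are not from anyone's real application.

```python
import re

text = "Hello\nhello\nHELLO"

# Flags passed as an argument: the options are visible at the call site.
by_argument = re.findall(r"^hello$", text, flags=re.IGNORECASE | re.MULTILINE)

# The same flags embedded in the pattern itself via (?im).
by_inline = re.findall(r"(?im)^hello$", text)

# Both spellings produce the same list of matches.
print(by_argument == by_inline)


def user_pattern_matches(pattern, line):
    """Hypothetical helper: the pattern arrives as data (for example from
    a config file), so the caller cannot know which flags the user wants.
    This is the "no other choice" case, where an inline flag is the only
    way for the user to ask for case-insensitive matching."""
    return re.search(pattern, line) is not None


print(user_pattern_matches("(?i)error", "ERROR: disk full"))
print(user_pattern_matches("error", "ERROR: disk full"))
```

When the pattern is hard-coded, the first form keeps the options out
of the pattern string. The helper shows the case Ian describes, where
the only place a user can put a flag is inside the pattern.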

> Regular expression reform is not necessarily a bad thing, but this is
> just forcing everybody to learn Yet Another Regex Syntax for no real
> purpose.

I disagree here.
Whilst some people may be "die-hard" fans of the unintuitive Perl
regex syntax, I believe many, if not exponentially MORE people would
like to have a better alternative. Do I want to remove the current
"well established" re module? No. But I would like to create a new
regex module that is more pythonic. A regex module that we can be
proud of. And just maybe, a regex module that "sets the bar" for all
other regular expressions.

Listen. Backwards compatibility and cross pollination are wonderful
WHEN you can make it work. However, in the case of Perl regex syntax,
this is not a "cross pollination", this is a "cross pollution".

> All that you've changed here is window dressing.  For an
> overview of many of the *real* problems with regular expression
> syntax, see

Window dressing is important, Ian; if it were not, shop owners would not
continue to show displays in their shop windows. What does window
dressing do exactly? It attracts the masses, and without the masses
all merchants will eventually go out of business. Note: my argument
HAS NOTHING to do with the number of folks programming Python (or any
language). The argument is focused on module sustainability in a
community. Modules that are morbidly DIFFICULT to learn do not last.

I know about PyParsing, but I believe we have room for the current re
module, PyParsing, and a more Pythonic take on Perl style regular
expressions. I don't see why we could not keep all three. Let the
people decide what is best for them.

The greatest aspect of regexes is their compactness, and we should
keep them compact. And in that respect regexes will always be cryptic
to the neophyte.  However, regexes do not have to be a scourge to the
initiated. We must balance the compact and the intuitive nature of
regexes. But most importantly, we must understand that these aspects
of regexes are NOT mutually exclusive.



More information about the Python-list mailing list