grimace: a fluent regular expression generator in Python
ben at benlast.com
Wed Jul 17 04:33:17 CEST 2013
On 16 July 2013 20:48, <python-list-request at python.org> wrote:
> From: "Anders J. Munch" <2013 at jmunch.dk>
> Date: Tue, 16 Jul 2013 13:38:35 +0200
> Ben Last wrote:
>> north_american_number_re = (RE().start
> Very cool. It's a bit verbose for my taste, and I'm not sure how well it
> will cope with nested structure.
I guess verbosity is the aim, in that *explicit is better than implicit* :)
And I suppose that's one of the attributes of a fluent system; they tend
to need more typing. It's not Perl...
> The problem with Perl-style regexp notation isn't so much that it's terse
> - it's that the syntax is irregular (sic) and doesn't follow modern
> principles for lexical structure in computer languages. You can get a long
> way just by ignoring whitespace, putting literals in quotes and allowing
> embedded comments.
Good points. I wanted to find a syntax that allows comments as well as
.any_number_of.digits # Recall that any_number_of includes zero
.followed_by.an_optional.dot.then.at_least_one.digit # The dot is optional,
# but we must have one digit as a minimum
... and yes, I also specifically wanted to have literals quoted.
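For comparison, here is a minimal sketch of the same pattern written with the standard library's re.VERBOSE flag, which also ignores whitespace and allows embedded comments. It is not grimace output: the pattern is a hand translation of the fluent chain above, and the names decimal_re and is_decimal are made up for illustration.

```python
import re

# Hand-written equivalent of the fluent chain above; this is an assumed
# translation, not something produced by grimace itself.
decimal_re = re.compile(r"""
    \d*     # any number of digits (any_number_of includes zero)
    \.?     # an optional dot
    \d+     # at least one digit
""", re.VERBOSE)


def is_decimal(text):
    """Return True if the whole string matches the pattern."""
    return decimal_re.fullmatch(text) is not None
```

The comments survive here too, but the operators themselves are still the terse regex ones, which is the trade-off under discussion.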
Nested groups work, but I haven't tackled lookahead and backreferences:
essentially because if you're writing an RE that complex, you should
probably be working directly in RE strings.
Depending on what you mean by "nested", re-use of RE objects is easy
(example from the unit tests):
identifier_start_chars = RE().regex("[a-zA-Z_]")
identifier_chars = RE().regex("[a-zA-Z0-9_]")
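To illustrate the kind of re-use meant here without guessing at the rest of the grimace API, the sketch below composes the same two character classes as plain strings with the standard re module. The names identifier_re and is_identifier are hypothetical, and the composition rule (one start character followed by any number of identifier characters) is an assumption about what the unit test goes on to build.

```python
import re

# The two character classes from the example above, as plain pattern strings.
identifier_start_chars = "[a-zA-Z_]"
identifier_chars = "[a-zA-Z0-9_]"

# Assumed composition: one start character, then zero or more identifier
# characters. This uses string concatenation, not grimace's RE objects.
identifier_re = re.compile(identifier_start_chars + identifier_chars + "*")


def is_identifier(text):
    """Return True if the whole string looks like an identifier."""
    return identifier_re.fullmatch(text) is not None
```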
Thanks for the comments!