grimace: a fluent regular expression generator in Python

Joshua Landau joshua at landau.ws
Tue Jul 16 04:38:26 EDT 2013


On 15 July 2013 23:21, Ben Last <benlast at gmail.com> wrote:
> Hi all
>
> I'd be interested in comments on a fluent regular expression generator I've
> been playing with (inspired by the frustrations of a friend of mine who's
> learning).
>
> The general use case is to be able to construct RE strings such as:
>
> r'^\(\d{3,3}\)-{1,1}\d{3,3}\-{1,1}\d{4,4}$' (intended to match a North
> American phone number)
>
> as:
>
> from grimace import RE
> north_american_number_re = (
>    RE().start
>    .literal('(').followed_by.exactly(3).digits.then.literal(')')
>    .then.one.literal("-").then.exactly(3).digits
>    .then.one.dash.followed_by.exactly(4).digits.then.end
>    .as_string()
> )

This looks really busy. How about something more like:

from grimace import RE, start, digits, dash, end
RE(start, "(", digits[3], ")-", digits[3], dash, digits[4], end).as_string()

?
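
A minimal sketch of how that call style could be wired up. None of this is
grimace's actual API: Token, the module-level start/digits/dash/end names and
the escaping of plain strings are all assumptions made for illustration.

```python
import re


class Token:
    """A fragment of regex source text (hypothetical helper, not grimace's API)."""

    def __init__(self, pattern):
        self.pattern = pattern

    def __getitem__(self, count):
        # digits[3] -> (?:\d){3}; this sketch only handles exact integer counts.
        return Token("(?:%s){%d}" % (self.pattern, count))


class RE(Token):
    """Concatenate tokens; plain strings are escaped and matched literally."""

    def __init__(self, *parts):
        source = "".join(
            part.pattern if isinstance(part, Token) else re.escape(part)
            for part in parts
        )
        super().__init__(source)

    def as_string(self):
        return self.pattern


# Assumed building blocks, named after the ones used in the one-liner above.
start, end = Token("^"), Token("$")
digits, dash = Token(r"\d"), Token("-")

phone = RE(start, "(", digits[3], ")-", digits[3], dash, digits[4], end).as_string()
```

The generated string is not character-for-character the one from the original
post (it uses non-capturing groups and {3} rather than {3,3}), but it should
accept the same (NNN)-NNN-NNNN inputs.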

And then you can do cool stuff like this (regex completely untested; I
hardly use the things):

RE((start | tab), RE(digits[:], inverse(dash)[4])[:2]) →
r"(^|\t)(\d*[^\-]{4,4}){0,2}"
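
A rough sketch of how slices, | and inverse() might map onto quantifiers,
alternation and negated character classes. Again, every name here is
hypothetical rather than grimace's API. The sketch uses a nested RE(...) for
the repeated group, since indexing a bare Python tuple with [:2] would slice
the tuple rather than repeat it.

```python
import re


class Token:
    """A fragment of regex source text (hypothetical helper)."""

    def __init__(self, pattern):
        self.pattern = pattern

    def as_string(self):
        return self.pattern

    def __or__(self, other):
        # start | tab -> (?:^|\t)
        return Token("(?:%s|%s)" % (self.pattern, other.pattern))

    def __getitem__(self, key):
        # tok[4] -> {4}; tok[:] -> *; tok[:2] -> {0,2}; tok[1:] -> {1,}
        if isinstance(key, slice):
            if key.step is not None:
                raise ValueError("a step has no meaning for a repeat count")
            low = key.start or 0
            if key.stop is None:
                quantifier = "*" if low == 0 else "{%d,}" % low
            else:
                quantifier = "{%d,%d}" % (low, key.stop)
        else:
            quantifier = "{%d}" % key
        return Token("(?:%s)%s" % (self.pattern, quantifier))


class RE(Token):
    """Concatenate tokens; plain strings are escaped and matched literally."""

    def __init__(self, *parts):
        super().__init__("".join(
            part.pattern if isinstance(part, Token) else re.escape(part)
            for part in parts
        ))


def inverse(token):
    # Only valid for tokens that are legal inside a character class,
    # such as an escaped dash or \d.
    return Token("[^%s]" % token.pattern)


start, tab = Token("^"), Token(r"\t")
digits, dash = Token(r"\d"), Token(r"\-")

pattern = RE(start | tab, RE(digits[:], inverse(dash)[4])[:2]).as_string()
```

Running the result through re.fullmatch against a few strings is a cheap way
to check an otherwise untested pattern like this one.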


> The intent is to provide clarity: since the strings would normally be
> generated and compiled when a module is first imported, there's minimal
> overhead.
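
For what it's worth, the import-time pattern being described would look
something like this. It is only a sketch: the constant and function names are
made up, and the regex string is the hand-written one from the top of the post
rather than grimace output.

```python
import re

# Compiled once, when the module is first imported; later calls reuse it.
NORTH_AMERICAN_NUMBER = re.compile(
    r'^\(\d{3,3}\)-{1,1}\d{3,3}\-{1,1}\d{4,4}$'
)


def is_north_american_number(text):
    """Return True if text has the form (NNN)-NNN-NNNN."""
    return NORTH_AMERICAN_NUMBER.match(text) is not None
```
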
>
> It's on github at https://github.com/benlast/grimace and the first blog post
> that explains it is here: http://benlast.livejournal.com/30871.html (I've
> added to it since then).
>
> Tests are the best explanation, and they're at the end of __init__.py, so
> they run automatically if that file is executed.
>
> I'm thinking about other features to implement, and more use cases would be
> welcome.
>
> Cheers
> ben
