[Python-ideas] Adding function checks to regex

Calvin Spealman ironfroggy at gmail.com
Sat Mar 19 17:35:34 CET 2011


I am -1 on the whole idea.

However, for the sake of argument, I'll say that if it was done I would not
bind the callbacks at match time.

Instead, they would be part of the compiled regex objects.

r = re.compile(r"foo:(?C<check_bounds>\d+)",
               check_bounds=lambda d: 1 <= int(d) <= 100)

and then r could be used like any other regex, and you don't need to know
about the callbacks when actually using it, just to build it.
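
To make that concrete with what exists today, here is a rough sketch. It
assumes a hypothetical wrapper class, CheckedPattern (my name, not anything in
re), and uses an ordinary named group (?P<check_bounds>...) in place of the
made-up (?C<...>) syntax. The checks are bound when the object is built, and
callers just use it. Note that it only filters finished matches; it does not
hook into the matcher, so a failed check cannot cause backtracking.

```python
import re


class CheckedPattern:
    """Hypothetical wrapper: a compiled regex plus per-group check functions.

    Not part of the re module. It approximates the idea by filtering
    matches after the fact, so a failed check never triggers backtracking.
    """

    def __init__(self, pattern, flags=0, **checks):
        self._regex = re.compile(pattern, flags)
        # Every check must be keyed by a named group that exists in the pattern.
        unknown = set(checks) - set(self._regex.groupindex)
        if unknown:
            raise ValueError(
                "no such named group(s): %s" % ", ".join(sorted(unknown))
            )
        self._checks = checks

    def _passes(self, m):
        # A group that did not participate in the match is not checked.
        return all(
            check(m.group(name))
            for name, check in self._checks.items()
            if m.group(name) is not None
        )

    def finditer(self, text):
        # Yield only the matches whose checked groups all pass.
        return (m for m in self._regex.finditer(text) if self._passes(m))

    def search(self, text):
        # First match that passes its checks, or None.
        return next(self.finditer(text), None)


# The callbacks are supplied once, when the pattern is built.
r = CheckedPattern(r"foo:(?P<check_bounds>\d+)",
                   check_bounds=lambda d: 1 <= int(d) <= 100)

# Code that uses r does not need to know about the callbacks.
values = [m.group("check_bounds") for m in r.finditer("foo:5 foo:500 foo:42")]
```

Here values ends up as ['5', '42']. With a check that ran inside the matcher,
"foo:500" could instead backtrack and match as "foo:50", which is exactly the
kind of difference a real implementation would have to pin down.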

On Sat, Mar 19, 2011 at 12:19 PM, MRAB <python at mrabarnett.plus.com> wrote:

> On 19/03/2011 11:33, Peter Otten wrote:
>
>> MRAB wrote:
>>
>>> Some of those who are relatively new to regexes sometimes ask how to write
>>> a regex which checks that a number is in a range or is a valid date.
>>> Although this may be possible, it certainly isn't easy.
>>>
>>> From what I've read, Perl has a way of including code in a regex, but I
>>> don't think that's a good idea.
>>>
>>> However, it occurs to me that there may be a case for being able to call
>>> a supplied function to perform such checking.
>>>
>>> Borrowing some syntax from Perl, it could look like this:
>>>
>>>      def range_check(m):
>>>          return 1 <= int(m.group()) <= 10
>>>
>>>      numbers = regex.findall(r"\b\d+\b(*CALL)", text, call=range_check)
>>>
>>> The regex module would match as normal until the "(*CALL)", at which
>>> point it would call the function. If the function returns True, the
>>> matching continues (and succeeds); if the function returns False, the
>>> matching backtracks (and fails).
>>>
>>
>> I would approach that with
>>
>> numbers = (int(m.group()) for m in re.finditer(r"\b\d+\b", text))
>> numbers = [n for n in numbers if 1 <= n <= 10]
>>
>> here. This is of similar complexity, but has the advantage that you can use
>> the building blocks throughout your Python scripts. Could you give an
>> example where the benefits of the proposed syntax stand out more?
>>
> There may be a use case in config files where you define rules (for
> example, Apache <FilesMatch>) or web forms where you have validation,
> but a regex is too limited. This would enable you to add 'richer'
> checking. There could be a predefined set of checks, such as whether a
> date is valid.
>
>
>>> The function would be passed a match object.
>>>
>>> An extension, again borrowing the syntax from Perl, could include a tag
>>> like this:
>>>
>>>      numbers = regex.findall(r"\b\d+\b(*CALL:RANGE)", text,
>>>                              call=range_check)
>>>
>>> The tag would be passed to the function so that it could support
>>> multiple checks.
>>>
>>
>> [brainstorm mode]
>> Could the same be achieved without new regex syntax? I'm thinking of reusing
>> named groups:
>>
>> re.findall(r"\b(?P<number>\d+)\b", text,
>>            number=lambda s: 1 <= int(s) <= 10)
>>
> I'm not sure about that.
>



-- 
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://techblog.ironfroggy.com/
Follow me if you're into that sort of thing:
http://www.twitter.com/ironfroggy