[Mailman-Users] ISO specific RegExp to filter/discard bot subscribe requests

Mark Sapiro mark at msapiro.net
Sun Aug 30 18:05:48 CEST 2015


On 08/30/2015 04:43 AM, Steven D'Aprano wrote:
> On Sun, Aug 30, 2015 at 02:06:26AM -0700, Nelson Kelly wrote:
>>
>> All the new spams appear to be of a slightly different format from the
>> one I described in the OP.
>>
>> blahblah+blah-blah-blah-blah-12345678 at gmail.com
>> blah_12_34+blah-blah-blah-blah-12345678 at hotmail.com


I'm now seeing these too.


> Try this regex instead:
> 
> ^.*\+.*?\d{3,}@
> 
> 
> The meaning of it is:
> 
> ^	start of string
> .*	any number of characters
> \+	a literal plus sign
> .*?	any number of characters (non-greedy)
> \d{3,}	at least three digits
> @	a literal at sign
> 
> 
> I'm not sure if the difference between "non-greedy" .*? and "greedy" .* 
> is important in this case.


It doesn't matter here, because all that counts is whether the pattern
matches at all. It would matter if there were capturing groups. E.g.,

^.*\+(.*?)(\d{3,})@

In this case, the (.*?) group would match everything after the '+' up to
and not including the digits and the (\d{3,}) group would match all the
digits.

If the first group were greedy, i.e. (.*) without the ?, it would match
everything up to but not including the last 3 digits, and the (\d{3,})
group would match only those last 3 digits.
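
As a rough illustration of the above, here is a minimal sketch using
Python's re module. The addresses are made-up samples modeled on the ones
quoted earlier (example.com is an assumption, not a real spam source), and
the helper name split_groups is hypothetical.

```python
import re

# Same pattern twice; only the first group's greediness differs.
LAZY = re.compile(r"^.*\+(.*?)(\d{3,})@")
GREEDY = re.compile(r"^.*\+(.*)(\d{3,})@")

# Hypothetical sample addresses, modeled on the ones quoted in this thread.
addresses = [
    "blahblah+blah-blah-blah-blah-12345678@example.com",
    "blah_12_34+blah-blah-blah-blah-12345678@example.com",
    "alice+lists@example.com",  # plus sign but no run of 3+ digits
]


def split_groups(pattern, address):
    """Return the two captured groups, or None if the pattern does not match."""
    m = pattern.match(address)
    return m.groups() if m else None


for addr in addresses:
    # Both variants agree on whether the address matches at all;
    # they differ only in how the text is divided between the groups.
    print(addr)
    print("  lazy:  ", split_groups(LAZY, addr))
    print("  greedy:", split_groups(GREEDY, addr))
```

For the first two addresses the lazy version gives
('blah-blah-blah-blah-', '12345678') and the greedy version gives
('blah-blah-blah-blah-12345', '678'). The third address matches neither,
so whether an address is caught by the filter is the same either way.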

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

