[Mailman-Users] Regexp for blocking addresses

Matthew Saltzman mjs at clemson.edu
Mon Sep 28 21:04:16 CEST 2015


On Thu, 2015-09-24 at 20:57 -0500, Mark Sapiro wrote:
> On 9/24/15 1:47 PM, Matthew Saltzman wrote:
> > 
> > I am trying to block variants of certain gmail addresses but I'm having
> > trouble concocting the right regexp to accomplish the task.
> > 
> > Gmail addresses can contain embedded periods and can be followed by a
> > '+' and an arbitrary suffix. So all the following are the same address:
> >       * joeblow at gmail.com
> >       * joe.blow at gmail.com
> >       * j.o.e.blow at gmail.com
> >       * joe.blow+abcd at gmail.com
> 
> 
> In my prior reply
> <https://mail.python.org/pipermail/mailman-users/2015-September/079856.html>,
> I focused on your literal question and answered accordingly, but it
> occurs to me that you are trying to deal with bot generated
> subscriptions of addresses of the form word.word+digits at gmail.com.
> While this pattern is the most common one I've seen, not all addresses
> are like that. They are in different domains and while all gmail
> addresses may have dots, not all addresses do and a rare few have had
> non-digits after the +, but all I've seen have at least 5 digits
> following a + and immediately preceding the @.
> 
> For the lists @python.org, we are now using
> 
> ^.*\+.*\d{3,}@
> 
> For the history, see
> <https://mail.python.org/pipermail/mailman-users/2015-August/079668.html>,
> <https://mail.python.org/pipermail/mailman-users/2015-September/079829.html>
> and
> <https://mail.python.org/pipermail/mailman-users/2015-September/079844.html>
> and other posts in those threads.

Looking back over this thread, I picked up on this. It is a bit more
aggressive than I was looking for, but it should catch these addresses
with high probability.
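
To see what "more aggressive" means in practice, here is a minimal
sketch that tries the python.org pattern quoted above against a few
made-up addresses. The helper name is_banned and the sample addresses
are my own inventions, and the use of re.search with re.IGNORECASE is
an assumption about how the ban list applies a pattern, not something
confirmed in this thread.

```python
import re

# Pattern quoted above for the lists @python.org.
BOT_PATTERN = r"^.*\+.*\d{3,}@"

def is_banned(address, pattern=BOT_PATTERN):
    """Return True if the address matches the pattern.

    Hypothetical helper: re.search plus re.IGNORECASE is an assumption
    about how a ban-list regexp is applied.
    """
    return re.search(pattern, address, re.IGNORECASE) is not None

# Made-up sample addresses, for illustration only.
samples = [
    "joe.blow+12345@gmail.com",    # '+' followed by digits before the @
    "joe.blow+abcd@gmail.com",     # '+' suffix without digits
    "joe.blow@gmail.com",          # no '+' at all
    "alice+news2015@example.com",  # legitimate-looking, but ends in digits
]

for addr in samples:
    print(addr, is_banned(addr))
```

The last sample is the kind of address that gets caught even though it
does not look bot generated, which is the trade-off I mean.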

When I encountered the original issue, I had enough evidence to find
the exact set of addresses that were causing the problem on my server.
After I banned those, the same addresses started showing up with
embedded periods.

FYI, the ones I found were:

 * ^nkymtky+.*@gmail\.com
 * ^kihuwzl+.*@gmail\.com
 * ^kihuotter+.*@gmail\.com
 * ^hulexchan+.*@gmail\.com
 * ^ewnetwork+.*@gmail\.com
 * ^damofah+.*@gmail\.com
 * ^bustysarahrae+.*@gmail\.com
 * ^vujovich+.*@usc\.edu
 * ^yesboobsofficial+.*@gmail\.com
 * ^yowesephth+.*@gmail\.com
 * ^ewnetwork2+.*@gmail\.com
 * ^nwplayer123+.*@gmail\.com

So I guessed that if I could just block those (with embedded periods),
I'd have the issue covered. Have others seen other addresses?
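
One way to cover the embedded periods is to generate the pattern from
the bare local part. This is only a sketch: the function name
gmail_variant_pattern is made up, and it assumes that at most one
period appears between any two characters. Note also that in the
patterns listed above the '+' is unescaped, so it acts as a quantifier
on the preceding letter rather than as a literal plus sign; the sketch
escapes it.

```python
import re

def gmail_variant_pattern(local, domain="gmail.com"):
    """Build a regexp matching dotted and '+suffix' variants of local.

    Hypothetical helper, not part of Mailman.
    """
    # Allow an optional literal period between consecutive characters.
    dotted = r"\.?".join(re.escape(ch) for ch in local)
    # Optional '+suffix' before the @; the plus is escaped so it is literal.
    return "^" + dotted + r"(\+.*)?@" + re.escape(domain) + "$"

pattern = gmail_variant_pattern("joeblow")
print(pattern)

# Made-up sample addresses, for illustration only.
for addr in [
    "joeblow@gmail.com",
    "joe.blow@gmail.com",
    "j.o.e.blow@gmail.com",
    "joe.blow+abcd@gmail.com",
    "joeblowhard@gmail.com",  # different local part, should not match
]:
    print(addr, bool(re.search(pattern, addr, re.IGNORECASE)))
```

By contrast, a pattern such as ^nkymtky+.*@gmail\.com also matches an
unrelated address like nkymtkyfoo at gmail.com, because "y+" means one
or more "y" and ".*" then matches anything up to the @.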

BTW, the part after the '+' in all cases I've seen has been only
digits. Requiring that might be a better way to go than allowing
arbitrary characters before the digits, if one wanted to be as precise
as possible.
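
A rough sketch of that digits-only idea, compared with the pattern
quoted earlier. The stricter pattern and the names LOOSE, STRICT, and
matches are my own guesses at how one might write it, not something
anyone in this thread is using, and the sample addresses are made up.

```python
import re

# Pattern quoted earlier in this message.
LOOSE = r"^.*\+.*\d{3,}@"
# Hypothetical stricter variant: only digits between the '+' and the @.
STRICT = r"^.*\+\d{3,}@"

def matches(pattern, address):
    """Return True if pattern is found in address (case-insensitive)."""
    return re.search(pattern, address, re.IGNORECASE) is not None

# Made-up sample addresses, for illustration only.
for addr in [
    "joe.blow+12345@gmail.com",
    "alice+news2015@example.com",
    "bot+ab12345@example.com",
]:
    print(addr, matches(LOOSE, addr), matches(STRICT, addr))
```

The stricter form would miss the rare addresses with non-digits after
the '+' mentioned in the quoted message, so it trades coverage for
precision.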

Thanks for your help.

-- 
Matthew Saltzman
Clemson University Math Sciences
mjs AT clemson DOT edu

