If you (Mailman site operators) have a spare moment, please try running this:
------------cut here--------------
#!/bin/sh
cd /var/local/mailman/logs
egrep "pending [a-z]+ <[a-z]+@[a-z]+\.com>" subscribe \
| egrep -v "@gmail\.com" \
| egrep -v "@hotmail\.com" \
| egrep -v "@msn\.com" \
| egrep -v "@aol\.com" \
| egrep -v "@yahoo\.com" \
| sed -e "s/(.*pending//"
------------cut here--------------
This is a first-cut, mildly sloppy script that will try to match some
patterns of interest that I've noticed in my "subscribe" log and that
might be in yours. The "egrep -v" clauses throw away addresses at the
big mail providers, which are not of interest; the sed snips off the
mailing list name and some other irrelevancies.
Here is what the last 10 lines of its output look like on my system:
Jun 06 00:14:32 2014 ehkfioxlkrr <yujwjs@zwdxgc.com> 62.210.226.131
Jun 06 13:23:16 2014 norchmecn <stydst@zdddmk.com> 86.51.26.20
Jun 07 02:06:20 2014 eljult <qbprgi@wabtdh.com> 86.51.26.11
Jun 07 13:21:20 2014 dvlevbpj <drksji@nlcvek.com> 210.14.138.102
Jun 07 15:41:10 2014 sdbdelkv <mtpdky@ghazhc.com> 86.51.26.18
Jun 07 16:17:10 2014 yqrebrgipo <ubnpwl@cgtnki.com> 86.51.26.20
Jun 08 06:37:12 2014 cihjwn <soudms@bprryw.com> 202.143.148.58
Jun 08 06:55:47 2014 ehxvwgrboo <iouwxm@mnaisa.com> 86.51.26.21
Jun 08 23:47:58 2014 qqpluym <jpbcnw@qkvfdi.com> 190.14.219.166
Jun 09 16:44:15 2014 mloepuj <figjdt@jjxlcu.com> 172.245.142.194
This is forged gibberish, of course. The user's real name is always a
lowercase alphabetic string. So is the email address, in both its LHS
and RHS, and the TLD is always .com. (Hence the regexp in the first egrep.)
I'm curious. First, is anybody else seeing these? Second, does
anyone have a theory as to their purpose? And third, is there any
value in combining data to see if patterns emerge? (I have some
privacy concerns about that last one, since real email addresses
might leak through, so I suspect if we decided to do that, it would
be best to remove everything but the timestamp and IP address. I doubt
the gibberish has any real explanatory value anyway.)
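If we did pool data, something like the following might do for the
stripping step. It is only a sketch: the function names "anonymize" and
"count_by_ip" are made up for illustration, and it assumes input lines
shaped like the sample output above (four timestamp fields, a one-word
real name, the address, and the IP address as the last field).

```shell
#!/bin/sh
# Hypothetical helpers, not part of Mailman.
# Assumed input line shape (as in the sample output above):
#   Mon DD HH:MM:SS YYYY realname <address> IP

# Keep only the timestamp (fields 1-4) and the IP address (last field).
# Lines with fewer than 7 fields are dropped rather than passed through,
# so nothing unexpected leaks.
anonymize() {
    awk 'NF >= 7 { print $1, $2, $3, $4, $NF }'
}

# Count how many lines each IP address (last field) accounts for,
# most frequent first.
count_by_ip() {
    awk 'NF > 0 { n[$NF]++ } END { for (ip in n) print n[ip], ip }' | sort -rn
}
```

With those functions defined in the same script, piping the output of
the pipeline above through "anonymize" leaves only timestamps and IP
addresses, and piping that through "count_by_ip" gives a rough tally of
which addresses show up most often.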
---rsk