[Mailman-Developers] mailman email harvester

Bernhard Kuemel bernhard at bksys.at
Wed Feb 9 12:08:12 CET 2005


This mail got blocked at first because I was not subscribed with my 
current email address.

Some of you may remember that I announced the release of the 
harvester script. We discussed the issue, and since the outcome was 
not to change Mailman, I have now released the script to raise 
public awareness of the problem.

Bernhard

-------- Original Message --------
Subject: mailman email harvester
Date: Mon, 07 Feb 2005 23:48:44 +0100
From: Bernhard Kuemel <bernhard at bksys.at>
To: full-disclosure at lists.netsys.com,  bugtraq at securityfocus.com,
mailman-developers at python.org

Hi!

Tons of email addresses from Mailman mailing lists are vulnerable to
being collected by spammers.

They are "protected" by obfuscation (user@example.com -> user at
example.com), and access to the subscriber list can be restricted to
subscribers. The obfuscation is trivially reversed, and harvester
scripts can subscribe to gain access to restricted lists.

I suggested a graphical Turing test that would bar scripts, but the
Mailman developers argued that spammers might hire a couple of temps
to solve the test, as has already happened for the creation of
email accounts. The only solution would be not to make the desired
information available at all. This is already possible by
restricting access to the member list to the list administrator.
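
For list administrators, the restriction mentioned above can be
sketched as a config_list input file. This is only a sketch and
assumes the Mailman 2.1 attribute names private_roster and
obscure_addresses; the file name roster.cfg and the list name mylist
are hypothetical.

```python
# roster.cfg: input file for Mailman's bin/config_list (Python syntax).
# Assumption: Mailman 2.1 attribute names; verify against your version.

# Who can view the subscriber list?
# 0 = anyone, 1 = list members, 2 = list admin only
private_roster = 2

# Keep address obfuscation on in web pages. As noted above, this is
# trivially reversed, so it is not a substitute for private_roster.
obscure_addresses = 1
```

It would be applied with something like
"bin/config_list -i roster.cfg mylist", run from the Mailman
installation directory.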

However, many lists still have the member list either openly
published or available to the list members. To raise awareness of
this issue I wrote a script that collects addresses from openly
accessible lists. It stops after processing 1000 search results
from Google (the maximum allowed) and collected 76772 email
addresses (61124 unique). It is attached as mmxp1.

An improved version is planned that collects addresses restricted
to subscribers, processes more lists and runs with more
parallelism.

Bye, Bernhard
-------------- next part --------------
#!/usr/bin/perl -w

#http://www.google.com/search?q=%22list+is+only+available+to+the+list+members%22+mailman/listinfo&start=600&num=100
#2.1.4 "current archive" "private list which" mailman/listinfo site:org

$n=0;
$u=0;
for ($i=0;1;$i+=10) {
	$#urls=-1;
	$google=`wget -qO - -U 'any browser' 'http://www.google.com/search?q=%22Click+here+for+the+list%22+mailman%2Flistinfo&start=$i'`;
#	print $google;
	@urls=($google=~m*<p class=g><a href=(http://\S+?)>*g);
#	print join("\n",@urls);
	if ($#urls==-1) {last;}
#	print "\naoeu $#urls\n";
	
	foreach $url (@urls) {
		$u++;
		$url=~s*/listinfo/*/roster/*;
		print STDERR "$url...\n";
		$roster=`lynx -connect_timeout=10 -dump $url`;
	#	print $roster;
		@mails=$roster=~/^ +\* \(?\[\d+\](.* at .*?)\)?$/mgo;
		foreach $mail (@mails) {
			$mail=~s/ at /@/;
			print "$mail\n";
			$n++;
		}
		print STDERR "mails=".($#mails+1).", total=$n, url=$u, google=$i\n";
#		exit;
	} #foreach url

} #for google


