spammers harvesting email'ids [was] UI for Mailman 3.0 update
On Fri, Jun 4, 2010 at 22:43, Mark Sapiro <mark@msapiro.net> wrote:
Ian Eiloart wrote:
Well, maybe, but I've had to switch on approval for various lists because of subscribing spammers.
As Barry suggests, setting moderation of new members as the default can also thwart the subscribing spammers.
A smart spammer would hardly post to the mailing list; at least on Linux lists, that's asking to be moderated or kicked out, depending on the admins.
I've checked out spammers who mass-subscribe to the lists at Debian, Ubuntu, Fedora and RH to access the email IDs of all the subscribers (if the membership list is set as available only to list members). If the setting is "membership list is available only to the *-owner", the spammer can still subscribe and lurk on the list to silently harvest the email IDs of everyone who posts to any list they lurk on. The latter can be identified by checking their TLD when they subscribe to the list, but if they use a free email provider such as a Gmail or Yahoo address to subscribe and lurk, it's hard to tell.
-- thanks and regards, vid || http://svaksha.com
स्वक्ष wrote:
I've checked out spammers who mass-subscribe to the lists at Debian, Ubuntu, Fedora and RH to access the email IDs of all the subscribers (if the membership list is set as available only to list members).
This is pointless, as you can harvest contributors' email addresses from web-based archives for most of these lists even without being a subscriber:
http://lists.debian.org/debian-user/2010/06/msg00000.html
-Julian
On Jun 05, 2010, at 04:21 PM, स्वक्ष wrote:
On Fri, Jun 4, 2010 at 22:43, Mark Sapiro <mark@msapiro.net> wrote:
Ian Eiloart wrote:
Well, maybe, but I've had to switch on approval for various lists because of subscribing spammers.
As Barry suggests, setting moderation of new members as the default can also thwart the subscribing spammers.
A smart spammer would hardly post to the mailing list; at least on Linux lists, that's asking to be moderated or kicked out, depending on the admins.
At the very least, we want to make it as hard as possible for spammers to spam people *through* a mailing list.
I've checked out spammers who mass-subscribe to the lists at Debian, Ubuntu, Fedora and RH to access the email IDs of all the subscribers (if the membership list is set as available only to list members). If the setting is "membership list is available only to the *-owner", the spammer can still subscribe and lurk on the list to silently harvest the email IDs of everyone who posts to any list they lurk on. The latter can be identified by checking their TLD when they subscribe to the list, but if they use a free email provider such as a Gmail or Yahoo address to subscribe and lurk, it's hard to tell.
We can try to make it more difficult to harvest email addresses from mailing list archives and posts, but some of that is fairly difficult without disrupting the usability of the mailing list.
-Barry
On Mon, Jun 07, 2010 at 02:28:22PM -0400, Barry Warsaw wrote:
At the very least, we want to make it as hard as possible for spammers to spam people *through* a mailing list.
With that in mind, I've been reminded about posting a mail I've been meaning to write ;)
It's quite common, in my set-ups, at least, for me to allow a
^[^@]+@(.*\.)?example\.org$
wildcard for allowing posting by non-members -- from "our" domain(s).
Recently, I changed the regexp over to
^[^@]+@example\.org$
as I've noticed the horrible trend for spammers to post from various addresses purporting to be from the lists.example.org subdomain.
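The difference between the two patterns can be checked with a short Python sketch. The patterns are the ones quoted above; the sender addresses are made up for illustration, and the helper name `matches` is hypothetical.

```python
import re

# The two patterns quoted above, unchanged.
WITH_SUBDOMAINS = re.compile(r"^[^@]+@(.*\.)?example\.org$")
BARE_DOMAIN_ONLY = re.compile(r"^[^@]+@example\.org$")

# Hypothetical sender addresses, chosen only for illustration.
SENDERS = [
    "bob@example.org",
    "bob@office.example.org",
    "list-x@lists.example.org",
    "bob@notexample.org",
]


def matches(pattern, address):
    """Return True when the whole address matches the pattern."""
    return pattern.match(address) is not None


# Map each address to (matches wide pattern, matches narrow pattern).
results = {
    address: (matches(WITH_SUBDOMAINS, address), matches(BARE_DOMAIN_ONLY, address))
    for address in SENDERS
}

for address, (wide, narrow) in results.items():
    print(f"{address:28} subdomains={wide!s:5} bare={narrow}")
```

The narrower pattern stops accepting anything under a subdomain, which blocks the forged lists.example.org senders but also excludes genuine subdomain senders such as an office.example.org address.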
The current "problem" is the order in which MM2 handles its non-member filters. I guess what I'd welcome is the ability to finely control the order in which given rules are processed; I think that would help immensely.
So, perhaps, something like:
-->>- ex 1 ->>--
Posting Settings for List X on lists.example.org:
.-----------------------------+---------+--------+----------+-------------+---------.
| email-address               | allow   | hold   | reject   | blackhole   | order   |
+-----------------------------+---------+--------+----------+-------------+---------+
| list-x@list.example.org     |         |        |          | X           | 1       |
| foo@list.example.org        |         |        |          | X           | 2       |
| ^[^@]+@(.*\.)?example\.org$ | X       |        |          |             | last    |
'-----------------------------+---------+--------+----------+-------------+---------'
--<<- ex 1 -<<--
where the order setting ('n', 'first', 'last') has effect on how the rules are processed.
(so in this example, the 'global' wildcard for the entire DNS-space example.org is processed as the last rule -- after all others have run -- i.e., postings from <list-x@list.example.org> end up being blackholed, but those posts from <bob@office.example.org> get through to the list.)
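As a rough illustration of the ordering idea (not anything MM2 or MM3 provides), here is a minimal Python sketch of first-match processing with an explicit order. The `Rule` class, `sort_key`, `decide`, and the default action of "hold" are all assumptions made for this example, and the literal addresses from the table are written here as anchored regular expressions.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    pattern: str   # regular expression matched against the sender address
    action: str    # "allow", "hold", "reject" or "blackhole"
    order: object  # an int, or the strings "first" / "last"


def sort_key(rule):
    """Place 'first' before all numbered rules and 'last' after them."""
    if rule.order == "first":
        return (0, 0)
    if rule.order == "last":
        return (2, 0)
    return (1, rule.order)


def decide(rules, sender, default="hold"):
    """Return the action of the first rule, in processing order, that matches."""
    for rule in sorted(rules, key=sort_key):
        if re.match(rule.pattern, sender):
            return rule.action
    return default  # assumption: unmatched senders are held


# The rules from the example table, deliberately listed out of order.
RULES = [
    Rule(r"^[^@]+@(.*\.)?example\.org$", "allow", "last"),
    Rule(r"^list-x@list\.example\.org$", "blackhole", 1),
    Rule(r"^foo@list\.example\.org$", "blackhole", 2),
]

print(decide(RULES, "list-x@list.example.org"))
print(decide(RULES, "bob@office.example.org"))
```

Because the wildcard is sorted last, the two specific blackhole rules are checked before it, regardless of the order in which the rules were entered.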
I suppose the modern way of setting processing order (at least for the person using the web-interface) is not to define "numbers" in the interface, but to allow the user/admin/moderator to move things up and down with arrows (so replace '2' with '↑' and '↓', and something "appropriate" for 'top' and 'bottom' of the list), and perhaps enabling mouse click-and-drag?
Was that in the pipeline?
-- ``Another sport which wastes unlimited time is Comma-hunting.'' (Francis Cornford, Microcosmographia Academica)
On Jun 08, 2010, at 01:55 PM, Adam McGreggor wrote:
The current "problem" is the order in which MM2 handles its non-member filters. I guess what I'd welcome is the ability to finely control the order in which given rules are processed; I think that would help immensely.
Here's how Mailman 3 works.
First, where MM2 had 'handlers' which conflated rule checking with message processing, MM3 separates these. This means that the processing handlers are called during a separate phase of message delivery, and not until the message has been approved for delivery.
Rule checking itself happens by way of configurable 'chains'. There are a number of built-in chains, but you can always add new ones, and you can configure individual mailing lists, or the entire MM3 system, to use your custom-defined chains. Each chain consists of a series of 'links', where each link is essentially a triplet of (rule, action, argument).
Rules are just the name of the rule, so custom rules must have unique names, but they can be more or less arbitrary strings (similarly with chain names). Rules are looked up globally by name. Link actions are one of the following:
- jump - stop processing this chain and start processing from the beginning of the named chain; takes a chain name as argument
- stop - stop processing through this chain
- defer - make no decision (i.e. continue processing through the current chain)
- run - the argument will be a callable, so call it with the standard argument triple of (mailing list, message, message metadata dictionary)
- detour - this is like 'jump' except that processing returns to the next link in the original chain when the detour chain is finished; takes a chain name as argument
(chain processing loop is in mailman/core/chains.py)
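To make the five link actions concrete, here is a minimal Python sketch of a chain-walking loop. It is not the code in mailman/core/chains.py; the `Link` tuple, the `process` signature, and the dictionaries of chains and rules are hypothetical. It also assumes that 'stop' ends all processing, which is one reading of the description above.

```python
from collections import namedtuple

# One link of a chain: (rule name, action, argument), as described above.
Link = namedtuple("Link", ["rule", "action", "argument"])


def process(chains, rules, start, mlist, msg, msgdata):
    """Run msg through the chain named `start`, following link actions."""
    stack = [iter(chains[start])]  # the innermost active chain is last
    while stack:
        link = next(stack[-1], None)
        if link is None:
            stack.pop()  # chain exhausted: resume the chain we detoured from
            continue
        if not rules[link.rule](mlist, msg, msgdata):
            continue  # rule missed, so this link takes no action
        if link.action == "jump":
            stack[-1] = iter(chains[link.argument])  # no return to this chain
        elif link.action == "detour":
            stack.append(iter(chains[link.argument]))  # return here afterwards
        elif link.action == "run":
            link.argument(mlist, msg, msgdata)
        elif link.action == "stop":
            return  # assumption: 'stop' ends all processing
        elif link.action != "defer":
            raise ValueError(f"unknown link action: {link.action!r}")


# Usage: a toy set of rules and chains that records which callables ran.
trace = []
rules = {
    "truth": lambda mlist, msg, msgdata: True,
    "has-subject": lambda mlist, msg, msgdata: bool(msg.get("subject")),
}
chains = {
    "start": [
        Link("truth", "detour", "side"),
        Link("has-subject", "jump", "accept"),
        Link("truth", "run", lambda ml, m, md: trace.append("held")),
    ],
    "side": [Link("truth", "run", lambda ml, m, md: trace.append("side"))],
    "accept": [
        Link("truth", "run", lambda ml, m, md: trace.append("accepted")),
        Link("truth", "stop", None),
    ],
}
process(chains, rules, "start", "list-x", {"subject": "hello"}, {})
print(trace)
```

In this toy run the detour executes the 'side' chain and returns, the jump then moves to 'accept' without coming back, and the last link of 'start' is never reached.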
From here on I'll talk about what happens by default...
The incoming queue runner is now very simple. It asks the mailing list for its 'start chain' and then processes the message through that chain. By default, this is the 'built-in' chain.
(built-in chain is defined in mailman/chains/builtin.py)
The built-in chain starts by running a few immediate actions:
- is the message pre-approved? if so, jump to the 'accept' chain
- is the mailing list in emergency hold? if so, jump to the 'hold' chain
- are we in a mailing list loop? if so, jump to the 'discard' chain
After this, some of the general checking rules get processed, but they all defer action. These are rules like the administrivia check, no-subject check, member moderation rule, and so on. Each of these rules marks the message metadata with a 'hit' or 'miss' tag. After these run, the 'any' rule runs and it just looks to see if there were any rule hits. If so, we jump to the 'hold' chain.
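The hit/miss bookkeeping followed by an 'any' decision might look roughly like the following Python sketch. The check functions are crude stand-ins, and the names `CHECKS`, `run_checks`, `rule_hits`, and `rule_misses` are assumptions for illustration rather than a description of the built-in chain's code.

```python
def no_subject(msg, msgdata):
    """Hit when the message has an empty or missing subject."""
    return not msg.get("subject", "").strip()


def administrivia(msg, msgdata):
    """Deliberately crude stand-in for a real administrivia check."""
    first_word = msg.get("body", "").strip().lower().split(" ", 1)[0]
    return first_word in {"subscribe", "unsubscribe", "help"}


# Each check defers: it records a hit or a miss but decides nothing itself.
CHECKS = [("administrivia", administrivia), ("no-subject", no_subject)]


def run_checks(msg):
    """Run every deferred check, then apply an 'any'-style decision."""
    msgdata = {"rule_hits": [], "rule_misses": []}
    for name, check in CHECKS:
        key = "rule_hits" if check(msg, msgdata) else "rule_misses"
        msgdata[key].append(name)
    # The 'any' step: go to the hold chain if at least one earlier rule hit.
    next_chain = "hold" if msgdata["rule_hits"] else None
    return next_chain, msgdata


print(run_checks({"subject": "Hi", "body": "hello all"}))
print(run_checks({"subject": "", "body": "unsubscribe"}))
```

Running every check before deciding means the moderator can see all the reasons a message was held, not just the first one.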
If a message makes it through this gauntlet, it then detours through a dynamically created 'header-match' chain. This chain is created from the configuration file, so it works globally. This means you can define your global header matches and decide which will be accepted, held, rejected, or discarded, say to handle known spammers. While not currently implemented, a similar technique will be used to do per-list web-configured header matching. It should be fairly straightforward to implement your request using the above raw materials, and we should definitely do that.
Just to finish the story, the final action in the built-in chain is to accept the message unconditionally. I.e. it's made it through all the known checks, so it should be good to go.
As you've seen above, there are other default chains, such as discard, reject, hold, and accept. Most of these are fairly simple, e.g. the discard chain just logs the Message-ID, fires an event, and then does nothing, which basically throws the message away. The accept chain sets up a couple of headers, logs the Message-ID, fires an event, and drops the message in the 'accept' queue - which is where the processing queue runner does all the other message preparation tasks you're familiar with from MM2. The hold chain is of course the most complex one; for more details UTSL.
I've probably given you way too much detail, and this should definitely go into a system architecture document, but hopefully it gives you an idea of the power and flexibility of MM3.
-Barry
On Tue, Jun 8, 2010 at 00:13, Barry Warsaw <barry@list.org> wrote:
We can try to make it more difficult to harvest email addresses from mailing list archives and posts, but some of that is fairly difficult without disrupting the usability of the mailing list.
I had two suggestions, or should I say feature requests(?):
Obfuscate the *-owner address on the listinfo page :: Allow the admin to edit the MM-footer in such a way that a spammer cannot use bots to click and spam the *-owner address. At the moment, this customization is possible only if you have access to the MM installation, and most admins don't. Currently, MM allows me to edit the "General list information page" and remove the MM-footer, but in a FLOSS project, folks need to know who the list admins are in order to get in touch with them (I can put the list admin names on the listinfo page, but how many people will read it?), and then there is the possibility of people assuming a "cabal".
Obfuscate/trim email IDs in the archives :: Currently an email ID is archived as "YourName at gmail.com" by MM. It would be nicer if it were obfuscated to "Your..... at gmail.com", as Google Groups does: http://groups.google.com/groups/profile?hl=en&enc_user=Fvdg4xEAAAD4rsRkC93N5ixKcUO4kQ32kdEasx1kiYTQavV7mdW13Q and http://groups.google.com/group/google-summer-of-code-announce/browse_thread/...
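For illustration only, the trimmed form requested above could be produced by something like this Python sketch. The function name `trim_address` and its `keep` and `dots` parameters are hypothetical; nothing here is an existing Mailman option.

```python
def trim_address(address, keep=4, dots=5):
    """Render 'local@domain' as 'loca..... at domain' for archive display."""
    local, sep, domain = address.rpartition("@")
    if not sep:
        return address  # no '@' found, so leave the text untouched
    return f"{local[:keep]}{'.' * dots} at {domain}"


print(trim_address("YourName@gmail.com"))
```

Note that this only changes how the address is displayed; anyone subscribed to the list still receives the full address in each post.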
Thanks,
thanks and regards, vid || http://svaksha.com
On 6/8/2010 10:30 AM, स्वक्ष wrote:
Currently, MM allows me to edit the "General list information page" and remove the MM-footer, but in a FLOSS project, folks need to know who the list admins are in order to get in touch with them (I can put the list admin names on the listinfo page, but how many people will read it?), and then there is the possibility of people assuming a "cabal".
Or you can create your own replacement footer and add its HTML to the page. Granted, this is a maintenance problem if you actually want to use the 'owner' addresses and they change, but you could just direct the mailto: to the -owner address.
-- Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
On Jun 08, 2010, at 11:15 PM, स्वक्ष wrote:
- Obfuscate the *-owner address on the listinfo page
I think we should do something about this in MM3. Instead of displaying the individual owner addresses, we should advertise the -owner address and send the message through the normal Mailman processing. That way, if anti-spam defenses were implemented in the toolchain (either in the MTA or as an add-on to Mailman), you'd at least have a hope of catching them.
- Obfuscate/trim email IDs in the archives :: Currently an email ID is archived as "YourName at gmail.com" by MM. It would be nicer if it were obfuscated to "Your..... at gmail.com", as Google Groups does: http://groups.google.com/groups/profile?hl=en&enc_user=Fvdg4xEAAAD4rsRkC93N5ixKcUO4kQ32kdEasx1kiYTQavV7mdW13Q and http://groups.google.com/group/google-summer-of-code-announce/browse_thread/...
Interesting that you can get the full email address after a captcha dance.
-Barry
On Mon, Jun 07, 2010 at 02:28:22PM -0400, Barry Warsaw wrote:
We can try to make it more difficult to harvest email addresses from mailing list archives and posts, but some of that is fairly difficult without disrupting the usability of the mailing list.
Agreed, and as I pointed out last year, it's useless. Spammers have such an embarrassment of riches when it comes to harvesting addresses that they really don't need to bother with mailing list mechanisms. And since I wrote that lengthy explanation, they've come up with a few more, including one that's really quite clever since it uses social engineering to convince users to give up not just addresses, but information on the relationships between them.
So there is not only zero value in trying to obfuscate addresses, there is *negative* value since the only people who will actually be impaired in the least by this are those actually trying to communicate, e.g., those coming across a message in an archive and trying to write to the author. Spammers are already so far past this that it's disappeared from their rear view mirrors.
---Rsk
On Jun 08, 2010, at 10:22 PM, Rich Kulawiec wrote:
On Mon, Jun 07, 2010 at 02:28:22PM -0400, Barry Warsaw wrote:
We can try to make it more difficult to harvest email addresses from mailing list archives and posts, but some of that is fairly difficult without disrupting the usability of the mailing list.
Agreed, and as I pointed out last year, it's useless. Spammers have such an embarrassment of riches when it comes to harvesting addresses that they really don't need to bother with mailing list mechanisms. And since I wrote that lengthy explanation, they've come up with a few more, including one that's really quite clever since it uses social engineering to convince users to give up not just addresses, but information on the relationships between them.
So there is not only zero value in trying to obfuscate addresses, there is *negative* value since the only people who will actually be impaired in the least by this are those actually trying to communicate, e.g., those coming across a message in an archive and trying to write to the author. Spammers are already so far past this that it's disappeared from their rear view mirrors.
Agreed. The least we can do is make mailing lists and their archives so much more valuable that it's worth the cost of doing business.
-Barry
participants (6)
- Adam McGreggor
- Barry Warsaw
- Julian Mehnle
- Mark Sapiro
- Rich Kulawiec
- स्वक्ष