[Mailman-Developers] Opening up a few can o' worms here...

J C Lawrence claw@kanga.nu
Tue, 16 Jul 2002 21:36:52 -0700


On Tue, 16 Jul 2002 10:58:00 -0700 
Chuq Von Rospach <chuqui@plaidworks.com> wrote:

> First, a minor announcement. I'm no longer in charge of the mailing
> lists at apple, sort of. We've hired a person full-time, and he's been
> taking over the lists server as his full-time responsibility, allowing
> me to go off and work on other projects. I'm still in the loop, just
> not "it". I'm still going to be heavily involved as we move that box
> to Mailman 2.1, and after that, probably fade a bit more into the
> woodwork (I still run my Mailman box at home, however, so I'm not
> going away. JC, quit jeering)

<sob>

> ... to do what we can to make sure people who post to the mailing
> lists don't get harvested.

You realise that subscribed robots on lists are going to be the next
SPAMmer trick?  Of course they're easily defeated with an enforced
moderated first post, but 'net-wide that's going to be a painful
evolution.

> A secondary issue here is the problem of disclosing admins and admin
> addresses. I know we've hashed that through once, but we've come to
> the (somewhat reluctant) decision to whitelist all public,
> non-personal email addresses. We're going to be implementing TMDA to
> do this...

Actually it got very little discussion beyond the commentary that
many/most of the valid mail to -owner and -admin is from members'
non-subscribed addresses (even more likely given plus addressing).  TMDA
seems to work well tho for an -owner and -admin filter.

> ... and will be switching all admin to generic addresses that filter
> through TMDA, as well as things like postmaster@ and the like.

Good idea.  I've not moved over postmaster and webmaster.

> I'm going to look and see if I can interface TMDA to the subscriber
> databases so that subscribers are by definition whitelisted...

It already can plug into Mailman's member roster in config.db.
Generically extending that to plug, say, into an externally abstracted
interface shouldn't be too difficult.  What I've been poking at here is
having TMDA call an external tool, passing it the address to be verified
along with the To: address.  The external tool would then do whatever
(eg SQL query, LDAP lookup, poke into config.db, etc) and exit with an
appropriate return code.

Should work fairly well, especially for those cases where the
authentication data is not local.
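
A minimal sketch of such an external tool, in Python.  Everything in it
is an assumption for illustration: the ROSTER table, the argument order
(sender first, then the To: address), and the exit-code convention
(0 accepts, 1 rejects) are hypothetical, and none of it is an existing
TMDA interface.  A real tool would replace the dictionary lookup with
the SQL query, LDAP lookup, or config.db read.

```python
import sys

# Hypothetical roster: list address -> set of subscribed addresses.
# A real tool would query SQL, LDAP, or Mailman's config.db instead.
ROSTER = {
    "mylist@example.com": {"alice@example.com", "bob@example.org"},
}

ACCEPT, REJECT = 0, 1  # assumed exit-code convention


def normalize(addr):
    """Lower-case an address and drop any plus detail (user+tag@host)."""
    addr = addr.strip().lower()
    local, sep, domain = addr.partition("@")
    if not sep:
        return addr
    return local.split("+", 1)[0] + "@" + domain


def list_for(recipient):
    """Map a -owner or -admin address back to its list address."""
    local, _, domain = normalize(recipient).partition("@")
    for suffix in ("-owner", "-admin"):
        if local.endswith(suffix):
            local = local[: -len(suffix)]
            break
    return local + "@" + domain


def verify(sender, recipient, roster=ROSTER):
    """Return ACCEPT if sender is on the roster of the list addressed."""
    members = roster.get(list_for(recipient), set())
    return ACCEPT if normalize(sender) in members else REJECT


# Act as a command-line tool only when given both arguments:
#   verify.py SENDER RECIPIENT
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(verify(sys.argv[1], sys.argv[2]))
```

Stripping the plus detail before the lookup means a member writing from
user+tag@host still matches the subscribed user@host.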

> ... but we've hit the point where we have to do this. I'm not happy
> about it, but the war is lost, I think.

Or even close to being over.

> And speaking of privacy, harvesting and spamming, a new and disturbing
> thing happened this weekend that I want to bring up -- one for which I
> have lots of questions, but no real answers. 
...
> So what he did was open up his address book and send his message to
> everyone in it. And he's running one of these new e-mail clients that
> happily caches addresses it sees in case you want them again. So all
> of the addresses of people posting to the mailing lists he subscribed
> to were in his address book cache, so when he grabbed his address
> book, he grabbed all of those addresses, too.

Yeah, I've seen this happen a couple times now.  Typically it gets made
into a public decapitation and all onlookers learn the DO NOT DO THAT
lesson (or at least seem to).

> So we have a clear violation of our anti-harvesting rules -- yet he
> didn't overtly harvest. He just grabbed what was in his address book
> at the time.

Note: Many MUAs other than Outlook do this sort of address collection.
Heck, even the exmh I use can collect addresses that way (and does by
default IIRC).

> This creates a major privacy quagmire. How do you set up rules for
> something like that? 

You don't.  You can't stop it technically so you have to rely on social
engineering and a public pillory with frequent enough public
humiliations that it starts entering the public unconscious.

> Where does ownership and protection end?

I draw the line at the edge of my MTA.  Once you send a message to one
of my lists and it gets broadcast all rules and bets are off.  I'll take
reasonable care and due diligence within my server, but outside of that
I don't attempt to control any more than I attempt to control the
persistently rude (whom I normally unsubscribe, but that's another
matter).

> I just don't know how to deal with the issues this address caching
> causes.  

I'm firmly of the mind that we can't.  It's intrinsically an education
issue.  Now we can be effective in helping educate, but until it enters
the social consciousness we'll be fighting uphill.

> Ultimately, we're going to have to rethink our "no harvesting" rules,
> and likely also write disclaimers explaining what our limits are.

<nod>

One possible technical approach is to mask all From: and CC: addresses
on messages broadcast from a list with date-limited plus addresses which
reverse map on the list server back to the original address.  It's no
sort of permanent solution, and it doesn't handle addresses quoted in
.sigs, message bodies, or other odd/arbitrary headers, but it can do a
whole lot to cripple address harvesting.

<ponder>

Actually, I rather like this idea for certain lists.

> We've actually considered switching our lists to obscured addresses,
> turned that down as being worse than the disease (for now). 

You mean like the above?

> But now we're wondering if we have to go to some sort of address
> cloaking ON lists, maybe some kind of address remapping through the
> server for replies, something. And I'm gritting my teeth at the
> developers who created those @#$@$#@$#23 caches (which are nice in
> some ways) for not also creating some way to flag addresses as not
> cacheable. Because, IMHO, that'd solve this problem.

Note that making a fixed and persistent address mapping really doesn't
handle anything.  You've just created another alias which works just as
well for the SPAMmers or address harvesters and you can't (server side)
distinguish between valid or SPAM mail sent to it any more successfully
than you can for normal mail.

  Well, unless you set up TMDA-style filters for every such mapped plus
  address, which, aside from the cache expense, really isn't such a bad
  idea.  Hurm.  I kinda like it actually.

Date limiting the validity of the mapped address has two pleasant
effects: It limits the size of your database of addresses, and it limits
the window of opportunity for abuse of the address.
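
A rough sketch of the date-limited mapping, in Python.  The class name,
the list+token@host address form, the random token, and the 14 day
default lifetime are all assumptions made for illustration, not
anything Mailman or TMDA provides.  The server keeps a table from token
to original address and expiry time; purging expired entries is what
bounds the size of the database.

```python
import secrets
import time


class AddressMasker:
    """Hypothetical date-limited plus-address mapper for one list."""

    def __init__(self, list_local, list_domain, lifetime_days=14):
        self.list_local = list_local
        self.list_domain = list_domain
        self.lifetime = lifetime_days * 86400  # seconds
        self.table = {}  # token -> (original address, expiry timestamp)

    def mask(self, original, now=None):
        """Return a fresh masked address that maps back to original."""
        now = time.time() if now is None else now
        token = secrets.token_hex(8)
        self.table[token] = (original, now + self.lifetime)
        return "%s+%s@%s" % (self.list_local, token, self.list_domain)

    def unmask(self, masked, now=None):
        """Return the original address, or None if unknown or expired."""
        now = time.time() if now is None else now
        local = masked.partition("@")[0]
        _, sep, token = local.partition("+")
        entry = self.table.get(token) if sep else None
        if entry is None or now > entry[1]:
            return None
        return entry[0]

    def purge(self, now=None):
        """Drop expired mappings; return how many were removed."""
        now = time.time() if now is None else now
        expired = [t for t, (_, exp) in self.table.items() if now > exp]
        for t in expired:
            del self.table[t]
        return len(expired)
```

For example, AddressMasker("mylist", "lists.example.com").mask(addr)
yields an address of the form mylist+<token>@lists.example.com, and
unmask() stops resolving it once the lifetime has passed.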

> But they didn't. Grumble.

Note that such address collection has been around a long time.  exmh is
by no means a particularly new or bleeding edge MUA and it's had this
sort of address collection feature for as long as I've used it.

> I think it's an issue we have to come to grips with.

I only see two ways to address it:

  1) A pillory

  2) Dynamic and/or filtered address mapping as above.

Neither is particularly pleasant.  Address mapping breaks as soon as
Outlook (for instance) starts scanning message bodies for addresses to
cache.

> Are we hitting a point where mail list servers have to act as blind
> front ends for all of the subscribers, where replies are processed by
> those servers, and the server then takes on the job of acting as a
> troll-exterminator and spam blocker? 

Yes, not yet, but soon.

> And what does that really mean for things like Mailman?

It means that Mailman will have to have a plug-in layer to do the
appropriate processing.  

-- 
J C Lawrence                
---------(*)                Satan, oscillate my metallic sonatas. 
claw@kanga.nu               He lived as a devil, eh?		  
http://www.kanga.nu/~claw/  Evil is a name of a foeman, as I live.