[Mailman-Developers] Opening up a few can o' worms here...

Chuq Von Rospach chuqui@plaidworks.com
Tue, 16 Jul 2002 10:58:00 -0700

Howdy, y'all. Back from vacation, and opening up a few can o' worms (gotta
big can opener here!) for everyone to chew on.... Apologies if this goes all
over the map, a bunch of things have been simmering and I want to pass them
along to get your perspective, see if they are issues that ought to be
considered in Mailman somehow, and perhaps warn you of stuff I'm seeing
before it hits you...

First, a minor announcement. I'm no longer in charge of the mailing lists at
Apple, sort of. We've hired a person full-time, and he's been taking over
the lists server as his full-time responsibility, allowing me to go off and
work on other projects. I'm still in the loop, just not "it". I'm still
going to be heavily involved as we move that box to Mailman 2.1, and after
that, probably fade a bit more into the woodwork (I still run my Mailman box
at home, however, so I'm not going away. JC, quit jeering).

Anyway, on to issues more substantive.

With the explosion of spam, privacy issues and protection issues have been
of great interest around here, including fascinating (but not family
friendly) discussions with our legal beagles.

The end result of all of that is a decision that subscriber e-mail addresses
are considered a significant asset that needs to be protected to the
greatest extent possible. We're currently evaluating exactly what that means
(and it means the list rules/T&Cs are going to need to be revamped as well,
in ways we're still figuring out) to do what we can to make sure people who
post to the mailing lists don't get harvested.

One thing we're definitely doing is moving to a cloaked archive. Since we
already distribute all archives out of HTTP, not FTP, we're working on a CGI
that'll strip all e-mail information out of messages on the fly (among other
things, like header cleanup and some trivial formatting fixes). The idea is
simple -- we've finally hit the point where you can't put an e-mail address
up on a public site under any circumstance safely, so we're having to move
to a system where we simply don't do that.
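
To make the cloaking idea concrete, here is a minimal sketch in Python of
the address-stripping step only. It is not the CGI described above; the
pattern, the function name and the replacement text are all assumptions
made for illustration, and the pattern is deliberately loose rather than a
full address parser.

```python
import re

# Deliberately loose pattern (an assumption for this sketch): it tries to
# catch anything address-shaped and does not validate full address syntax.
ADDRESS_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def cloak_addresses(text, replacement="[address removed]"):
    """Return text with every address-shaped string replaced."""
    return ADDRESS_RE.sub(replacement, text)


if __name__ == "__main__":
    sample = "From: Jane Doe <jane@example.com>\n\nMail me at jane@example.com."
    print(cloak_addresses(sample))
```

Running the filter when the page is served, rather than rewriting the stored
archive, would leave the original messages untouched if the rules change
later.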

I think the Mailman folks need to think about this, also. It impacts the
archiving setup and other issues, but the harvesters have hit the point
where we simply can't risk disclosing that info. It creates other problems
-- you can't see a posting in the archive and send email to that person with
more questions (or answers), but that seems trivial compared to the problems
the spammers are causing.

A secondary issue here is the problem of disclosing admins and admin
addresses. I know we've hashed that through once, but we've come to the
(somewhat reluctant) decision to put all public, non-personal email
addresses behind a whitelist. We're going to be implementing TMDA to do
this, and will be switching all admins to generic addresses that filter
through TMDA, as well as things like postmaster@ and the like. (For those
that don't know, TMDA is an overt whitelist. If you're not on the
whitelist, you get mail back telling you to take some action, and until you
do, the mail isn't delivered.) I hate making users jump through hoops to
get through to a real person, but the abuse by the spammers on admin
addresses is now so bad I'm declaring defeat and going to the whitelist.

I'm going to look and see if I can interface TMDA to the subscriber
databases so that subscribers are by definition whitelisted, but we've hit
the point where we have to do this. I'm not happy about it, but the war is
lost, I think.
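
The whitelist flow above, with subscribers pre-approved, might look roughly
like the following Python sketch. This is not TMDA's code or API; the class
and method names are hypothetical, and sending the actual challenge mail is
left out. It only shows the decision: known senders are delivered, unknown
senders are held until they confirm.

```python
def normalize(address):
    """Lower-case and trim an address so lookups are consistent."""
    return address.strip().lower()


class WhitelistGate:
    """Hypothetical challenge/response gate, seeded from a subscriber list."""

    def __init__(self, subscribers=()):
        # Subscribers are whitelisted by definition.
        self.whitelist = {normalize(a) for a in subscribers}
        # Held mail, keyed by sender, waiting on a confirmation.
        self.pending = {}

    def receive(self, sender, message):
        """Return "deliver" for known senders, else hold and "challenge"."""
        sender = normalize(sender)
        if sender in self.whitelist:
            return "deliver"
        self.pending.setdefault(sender, []).append(message)
        return "challenge"

    def confirm(self, sender):
        """Whitelist a sender who answered the challenge; release held mail."""
        sender = normalize(sender)
        self.whitelist.add(sender)
        return self.pending.pop(sender, [])


if __name__ == "__main__":
    gate = WhitelistGate(["subscriber@example.com"])
    print(gate.receive("subscriber@example.com", "hello"))  # known sender
    print(gate.receive("stranger@example.net", "question"))  # held
    print(gate.confirm("stranger@example.net"))  # released on confirmation
```

Seeding the gate from the subscriber database means only people who have
never dealt with the list server before see a challenge.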

And speaking of privacy, harvesting and spamming, a new and disturbing thing
happened this weekend that I want to bring up -- one for which I have lots
of questions, but no real answers. A bunch of users on some of our mail
lists were spammed, and it became very clear very quickly that addresses
were harvested off of at least one of our mail lists.

As you might guess, a lynch mob formed, and I lit the first virtual torch
and we all sharpened the pitchforks. Fortunately, the person who did it came
forward to me and admitted guilt, and explained what happened.

And what happened is pretty damn disturbing. See, he had one of those "I
must tell the masses!" moments, where he finally felt it was time to send
out a call to arms on a subject he felt strongly about.

So what he did was open up his address book and send his message to everyone
in it. And he's running one of these new e-mail clients that happily caches
addresses it sees in case you want them again. All of the addresses of
people posting to the mailing lists he subscribed to were in his address
book cache, so when he grabbed his address book, he grabbed all of those
addresses, too.

So we have a clear violation of our anti-harvesting rules -- yet he didn't
overtly harvest. He just grabbed what was in his address book at the time.

This creates a major privacy quagmire. How do you set up rules for something
like that? Where does ownership and protection end? (I'm talking ethically,
not technically. I think we all realize that once you post email to a
list, you've given up control to anyone who doesn't feel obligated to follow
the rules). This wasn't a case of overtly violating the rules, but of a
piece of technology creating a situation where it wasn't understood there
were rules being violated.

I just don't know how to deal with the issues this address caching causes.
Ultimately, we're going to have to rethink our "no harvesting" rules, and
likely also write disclaimers explaining what our limits are. We've actually
considered switching our lists to obscured addresses, but turned that down
as a cure worse than the disease (for now). But now we're wondering if we have
to go to some sort of address cloaking ON lists, maybe some kind of address
remapping through the server for replies, something. And I'm gritting my
teeth at the developers who created those @#$@$#@$#23 caches (which are nice
in some ways) for not also creating some way to flag addresses as not
cacheable. Because, IMHO, that'd solve this problem.

But they didn't. Grumble.

I'm curious what people think about this latest thing. The good news is he
wasn't trying to harvest us. The bad news is, he wasn't trying to harvest
us. And the b-tch of it is, I really don't have a comfortable feeling for
how to deal with this new situation yet... But I think it's an issue we have
to come to grips with.

Are we hitting a point where mail list servers have to act as blind front
ends for all of the subscribers, where replies are processed by those
servers, and the server then takes on the job of acting as a
troll-exterminator and spam blocker? And what does that really mean for
things like Mailman?
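
One way to picture the "blind front end" is a server that hands out an
opaque alias for each poster and keeps the real address to itself. The
Python sketch below is only an illustration of that remapping idea, not
anything Mailman does; the class name, the "member-" alias format and the
secret key are all assumptions.

```python
import hashlib
import hmac


class AddressRemapper:
    """Hypothetical alias table: the list shows aliases, the server keeps
    the real addresses and forwards replies."""

    def __init__(self, secret, list_domain):
        self.secret = secret  # bytes, known only to the list server
        self.list_domain = list_domain.lower()
        self.alias_to_real = {}

    def alias_for(self, real_address):
        """Return a stable alias that does not reveal the real address."""
        real = real_address.strip().lower()
        digest = hmac.new(self.secret, real.encode("utf-8"),
                          hashlib.sha256).hexdigest()[:12]
        alias = "member-%s@%s" % (digest, self.list_domain)
        self.alias_to_real[alias] = real
        return alias

    def resolve(self, alias):
        """Map an alias back to the real address, or None if unknown."""
        return self.alias_to_real.get(alias.strip().lower())


if __name__ == "__main__":
    remap = AddressRemapper(b"server-side secret", "lists.example.com")
    alias = remap.alias_for("jane@example.com")
    print(alias)                 # what other subscribers would see
    print(remap.resolve(alias))  # what the server uses to forward a reply
```

Because every reply has to pass through the server to be resolved, the
server is also the natural place to apply spam and abuse filtering before
forwarding, which is exactly the extra job the question above is asking
about.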

Happy Macworld Expo week, all. If you need me, I'll be in the war room,
beating my head against a wall.

Chuq Von Rospach, Architech
chuqui@plaidworks.com -- http://www.chuqui.com/

He doesn't have ulcers, but he's a carrier.