Mailman 3 RE: [Mailman-Developers] Interesting study -- spam on postedaddresses... - Mailman-Developers

RE: [Mailman-Developers] Interesting study -- spam on postedaddresses...

Dale Newfield

20 Feb 2002

8:15 p.m.

On Wed, 20 Feb 2002, Damien Morton wrote:

...

I still think the email-address-as-jpeg solution is prohibitively expensive to reverse; effectively impossible for machines, entirely easy for people.

But it does have drawbacks.

It only works with graphical browsers.

It can't be enlarged for people that have poor vision.

It can be reverse-engineered -- all they have to do is decode a single font, then they're all simple to snag.

In fact, as someone with lots of computer graphics experience, I'd say it would be almost no harder to write code to decode them than it would be to write code to generate them.

...

Web Forms for contacting the admin cold. If the admin replies, you can continue the conversation via email.

Right, assuming the web form doesn't break.

...

Private and Public views of the archives.

Private archives are restricted to list members and those that can pass a reverse turing test.

People keep using this term, but I'm not sure what they mean, or if I trust that they'd be so reliable...

...

Public archives render all email addresses as jpegs.

If they're automatically generated, it'd be easier to create pngs or gifs, or lots of other formats than jpgs. Think about this, though--how do you actually generate the images and serve them properly without either including the email address in the html code anyway (so the img request specifies what image to generate), or building a whole database mapping arbitrary numbers to email addresses (so they can either be generated on the fly or stored pre-generated). Once you've got that database, why not just have that database front a web form instead of displaying the address?

-Dale

Chuq Von Rospach

20 Feb

8:48 p.m.

New subject: Interesting study -- spam on postedaddresses...

...

...
I still think the email-address-as-jpeg solution is prohibitively expensive to reverse; effectively impossible for machines, entirely easy for people.

But it does have drawbacks.

It only works with graphical browsers.

This is a very good point. I mentioned ADA compliance yesterday. To be ADA compliant, if you rendered the e-mail address as a graphic, you'd also have to put the text into the ALT tag. Which would enable it for lynx and sight-limited solutions -- and make putting into a graphic kinda meaningless. So you can't use this approach unless you want to ignore the ADA and lock out your blind users from those functions.

I'm not willing to make that tradeoff. While I'm not going to live or die on the ADA compliance issue, I think it's important to keep it in mind because it forces us to focus on more than the "easy" case or the "geek" case and worry about solutions that work across the spectrum of users, from the AOL newbie to Jay. We can't solve problems just for Jay, or just for Newbies, we have to find a solution that works as well as possible for as many of those groups as possible. ADA compliance is a useful strawman that keeps us focussed away from "I want it this way, so that's the right way".

-- Chuq Von Rospach, Architech chuqui@plaidworks.com -- http://www.chuqui.com/

He doesn't have ulcers, but he's a carrier.

John Morton

8:55 p.m.

New subject: Interesting study -- spam on postedaddresses...

On Thursday 21 February 2002 17:15, Dale Newfield wrote:

...

On Wed, 20 Feb 2002, Damien Morton wrote:

...

...
Web Forms for contacting the admin cold. If the admin replies, you can continue the conversation via email.

Right, assuming the web form doesn't break.

Monitor the form. Your monitoring tools should be telling you when bits of your site break before users have a need to report the problem.

...

...
Private and Public views of the archives.

Private archives are restricted to list members and those that can pass a reverse turing test.

People keep using this term, but I'm not sure what they mean, or if I trust that they'd be so reliable...

It's a test to find out if the agent that requested the page is human or some bot of some sort. In order to progress past the form you have to enter something into the box as a reply to some text in the form. If the question and answer can be arbitrary on a site by site, or better, hit by hit basis, then it becomes infeasible to build a spambot to enter such sites.
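A minimal sketch of the per-hit question-and-answer form described above, assuming a small admin-supplied list of questions and a signed token so the server can tell which question it asked. The questions, the names `CHALLENGES`, `issue_challenge` and `check_answer`, and the token layout are all hypothetical and not from the thread. The sketch does not address replay, or the objection that a fixed question set can simply be learned.

```python
import hashlib
import hmac
import secrets

# Hypothetical, admin-supplied question/answer pairs (not from the thread).
CHALLENGES = [
    ("What colour is a clear daytime sky?", "blue"),
    ("How many legs does a cat have? (digit)", "4"),
]

# Per-process secret used to sign tokens; an assumption of this sketch.
SERVER_KEY = secrets.token_bytes(32)


def _sign(payload):
    """Return the hex HMAC-SHA256 of the payload under the server key."""
    return hmac.new(SERVER_KEY, payload.encode(), hashlib.sha256).hexdigest()


def issue_challenge():
    """Pick a question for this hit and return (question, token)."""
    index = secrets.randbelow(len(CHALLENGES))
    nonce = secrets.token_hex(8)
    payload = f"{index}:{nonce}"
    return CHALLENGES[index][0], f"{payload}:{_sign(payload)}"


def check_answer(token, answer):
    """Return True only if the token is ours and the answer matches."""
    try:
        index_text, nonce, sig = token.split(":")
        index = int(index_text)
    except ValueError:
        return False  # malformed token
    expected = _sign(f"{index_text}:{nonce}")
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False  # token was not issued by this server
    if not 0 <= index < len(CHALLENGES):
        return False
    return answer.strip().lower() == CHALLENGES[index][1]
```

The form would embed the token in a hidden field and pass it back with the typed answer; the token only proves which question was asked, not that a human answered it.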

...

...
Public archives render all email addresses as jpegs.

If they're automatically generated, it'd be easier to create pngs or gifs, or lots of other formats than jpgs. Think about this, though--how do you actually generate the images and serve them properly without either including the email address in the html code anyway (so the img request specifies what image to generate), or building a whole database mapping arbitrary numbers to email addresses (so they can either be generated on the fly or stored pre-generated).

I'd pregenerate them, give them an arbitrary name and store a dictionary mapping email addresses to the image for page building purposes.

...

Once you've got that database, why not just have that database front a web form instead of displaying the address?

I'm not sure what you mean by this. Can you explain?

(Not that I think image addresses are a good idea for all the reasons you mentioned earlier. I'd prefer a slashdot style per user 'display address' option. It can be obfuscated by default, but it allows the user to restore their actual address, or render it unrecognizable depending on their personal spam tolerance threshold.)

John

Dale Newfield

9:08 p.m.

New subject: Interesting study -- spam on postedaddresses...

On Thu, 21 Feb 2002, John Morton wrote:

...

It's a test to find out if the agent that requested the page is human or some bot of some sort.

Assuming you can build such a test. Good luck.

...

If the question and answer can be arbitrary on a site by site, or better, hit by hit basis, then it becomes infeasible to build a spambot to enter such sites.

If it's arbitrary, it's generated by some algorithm. If it's generated by some algorithm, I just need to figure out the algorithm and I can always get it.

...

I'd pregenerate them, give them an arbitrary name and store a dictionary mapping email addresses to the image for page building purposes.

...
Once you've got that database, why not just have that database front a web form instead of displaying the address?

I'm not sure what you mean by this. Can you explain?

If you've got a database mapping arbitrary number/name/string to an email address, then why not just have a web form that sends mail to that address knowing only the arbitrary value (and never divulge the email address)?
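A rough sketch of that idea, assuming an in-memory mapping and random tokens: the archive page would carry only the token, and the form handler looks up the real address server-side. The function names, the two dictionaries and the example addresses are hypothetical. The sketch only builds the message and leaves delivery to whatever MTA hook the site uses.

```python
import secrets
from email.message import EmailMessage

# Server-side only: token -> real address, and the reverse for reuse.
_address_by_token = {}
_token_by_address = {}


def token_for(address):
    """Return a stable opaque token for an address, creating one if needed."""
    key = address.lower()
    if key not in _token_by_address:
        token = secrets.token_urlsafe(12)
        _token_by_address[key] = token
        _address_by_token[token] = address
    return _token_by_address[key]


def build_relay_message(token, sender, subject, body):
    """Build the message a contact form would send, or None if token unknown."""
    recipient = _address_by_token.get(token)
    if recipient is None:
        return None
    msg = EmailMessage()
    msg["To"] = recipient        # never shown to the person filling the form
    msg["Reply-To"] = sender     # lets the recipient answer by ordinary email
    msg["Subject"] = subject
    msg.set_content(body)
    return msg
```

Because the token is random rather than derived from the address, knowing a token reveals nothing about the address behind it; it does still let anyone who has the token send mail through the form.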

...

I'd prefer a slashdot style per user 'display address' option.

I don't believe any system like slashdot's is worth the time to implement, since it is just as easily broken, and now you've got more useless stuff for every single user to manage.

Dale Newfield <Dale@Newfield.org>

"To announce that there must be no criticism of the President, or that we are to stand by the President, right or wrong, is not only unpatriotic and servile, but is morally treasonable to the American public." -T. Roosevelt

Chuq Von Rospach

9:41 p.m.

New subject: Interesting study -- spam on postedaddresses...

...

...
It's a test to find out if the agent that requested the page is human or some bot of some sort.

Assuming you can build such a test. Good luck.

That some other programmer can't cheat on. Even gooder luck.

...

If it's arbitrary, it's generated by some algorithm. If it's generated by some algorithm, I just need to figure out the algorithm and I can always get it.

There is some validity to the "the club" mentality, of "we don't have to fix it, we only have to make it difficult enough to convince them to annoy someone else". But if we assume we're building the New Defacto Standard Listserver for the Internet here with mailman, that strategy fails, because if we succeed, then it becomes worth their time to find the anti-Club. Security by obscurity only works if you're really obscure, which implies failure of the software to thrive. I'm not interested in that (and even then, you aren't guaranteed success by being obscure).

Another way of looking at it is "I don't have to outrun the lion. I only have to outrun you" -- but that doesn't work if the lion is infinitely hungry and doesn't get tired. Which defines a spambot.

I'm more and more convinced the answer is simply putting admins behind a web form, and telling site admins to publicize an emergency address (like postmaster), and putting up a watcher on the system to set off alarms when it breaks.

...

If you've got a database mapping arbitrary number/name/string to an email address, then why not just have a web form that sends mail to that address knowing only the arbitrary value (and never divulge the email address)?

Basically, what I'm proposing. And I'm more and more convinced it's the right way to do this, for all that web forms are less personal than sending email directly. I think admins have to make themselves accessible. I don't think they have to make themselves accessible on the user's terms... Another of those tradeoffs.

-- Chuq Von Rospach, Architech chuqui@plaidworks.com -- http://www.chuqui.com/

The first rule of holes: If you are in one, stop digging.

John Morton

9:44 p.m.

New subject: Interesting study -- spam on postedaddresses...

On Thursday 21 February 2002 18:08, Dale Newfield wrote:

...

On Thu, 21 Feb 2002, John Morton wrote:

...
It's a test to find out if the agent that requested the page is human or some bot of some sort.

Assuming you can build such a test. Good luck.

Building a good one is tricky. It depends on your model of the attacker, and while I've seen some wild speculation about the capabilities of email address harvesters, I don't have any hard facts about the cost/benefit equations they use.

...

...
If the question and answer can be arbitrary on a site by site, or better, hit by hit basis, then it becomes infeasible to build a spambot to enter such sites.

If it's arbitrary, it's generated by some algorithm. If it's generated by some algorithm, I just need to figure out the algorithm and I can always get it.

Arbitrary as in 'doesn't have to be fixed'. Allowing the site admin the ability to build their own set wouldn't have to involve an algorithm (though I'm splitting hairs, really; I don't think this is a workable idea, either).

...

...
I'd pregenerate them, give them an arbitrary name and store a dictionary mapping email addresses to the image for page building purposes.

...
Once you've got that database, why not just have that database front a web form instead of displaying the address?

I'm not sure what you mean by this. Can you explain?

If you've got a database mapping arbitrary number/name/string to an email address, then why not just have a web form that sends mail to that address knowing only the arbitrary value (and never divulge the email address)?

"What if the form breaks down?" :-)

Actually, the reason not to use it is that it can be used to spam anyone whose id mapping you can grab from the archive!

...

...
I'd prefer a slashdot style per user 'display address' option.

I don't believe any system like slashdot's is worth the time to implement, since it is just as easily broken, and now you've got more useless stuff for every single user to manage.

You've got three statements here, I'll address them one at a time:

'I don't believe any system like slashdot's is worth the time to implement'

How hard is it, really? All we're looking at is adding an extra field to each member record, to the forms for managing user settings, a method to generate a default obfuscation and another one to substitute addresses in the archive.
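The pieces listed above might look roughly like this. It is a sketch under stated assumptions, not Mailman's actual member model: the `Member` class, the `display_address` field name and the "at/dot" default are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Member:
    address: str
    # The extra per-user field: None means "use the list's default scheme".
    display_address: Optional[str] = None


def default_obfuscation(address):
    """One possible default scheme: 'user at example dot org'."""
    local, _, domain = address.partition("@")
    return f"{local} at {domain.replace('.', ' dot ')}"


def shown_address(member):
    """Return what the archive should display for this member."""
    if member.display_address is not None:
        return member.display_address
    return default_obfuscation(member.address)


def substitute_addresses(text, members):
    """Replace each member's real address in archived text."""
    for member in members:
        pattern = re.compile(re.escape(member.address), re.IGNORECASE)
        # A function replacement avoids backslash handling in the new text.
        text = pattern.sub(lambda _match, m=member: shown_address(m), text)
    return text
```

A user who wants the real address shown would set `display_address` to the address itself; one who wants it unrecognizable would set it to any string, such as 'no spam please'.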

'since it is just as easily broken'

I never bothered to obfuscate my address, and while I seldom post, I hardly ever receive spam either (and my address is attached to all sorts of things that are more likely to be harvested). The best we can do is come up with some 'good enough' solutions, and one that offers a user the opportunity to have their address displayed as 'no spam please' is about the best I can think of.

Rather than have the whole list exhale great gouts of hot air about what obfuscation methods are broken or not, why don't we do an experiment? Someone should sign up for a couple of email addresses at a free mail service, subscribe to slashdot and post to several stories with each over the month. One account can use their raw email address in each posting, and the other can use some obfuscation method. Then, as the weeks tick by, we can actually see just how useless, or otherwise, obfuscation really is.

'and now you've got more useless stuff for every single user to manage.'

If 16 million people can operate the Hotmail UI, I think mailman list users can handle another text field. Especially if it's already filled out for them.

John

John Morton

10:10 p.m.

New subject: Interesting study -- spam on postedaddresses...

On Thursday 21 February 2002 18:41, Chuq Von Rospach wrote:

...

There is some validity to the "the club" mentality, of "we don't have to fix it, we only have to make it difficult enough to convince them to annoy someone else". But if we assume we're building the New Defacto Standard Listserver for the Internet here with mailman, that strategy fails, because if we succeed, then it becomes worth their time to find the anti-Club. Security by obscurity only works if you're really obscure, which implies failure of the software to thrive. I'm not interested in that (and even then, you aren't guaranteed success by being obscure).

Another way of looking at it is "I don't have to outrun the lion. I only have to outrun you" -- but that doesn't work if the lion is infinitely hungry and doesn't get tired. Which defines a spambot.

Indeed. Email addresses aren't secrets - even more so than credit card numbers and biometric data - and the attackers have more than enough resources to seek them out.

...

I'm more and more convinced the answer is simply putting admins behind a web form, and telling site admins to publicize an emergency address (like postmaster), and putting up a watcher on the system to set off alarms when it breaks.

For the admin addresses, I'd agree with you. Building up a document of pointers to good spam filtering tools couldn't hurt either.

For archives that aren't behind a login, I think slashdot style obfuscation is the best we can do. The list admin can pick the default obfuscation scheme or none at all. Users who want their email address out there for whatever reason can do so, and rely on their 'War and Peace' spam filters to keep the noise down, while other folks can opt in even further and replace the obfuscated email address with some useless string.

John

John W Baxter

10:14 p.m.

New subject: Interesting study -- spam on postedaddresses...

At 0:08 -0500 2/21/2002, Dale Newfield wrote:

...

...
If the question and answer can be arbitrary on a site by site, or better, hit by hit basis, then it becomes infeasible to build a spambot to enter such sites.

If it's arbitrary, it's generated by some algorithm. If it's generated by some algorithm, I just need to figure out the algorithm and I can always get it.

Not to mention that people surprisingly often mess up answers to questions as easy as "Who is buried in Grant's Tomb?"

--John

-- John Baxter jwblist@olympus.net Port Ludlow, WA, USA Who is buried in Grant's Pass? Many people who lived there, and some who had moved away.

Dale Newfield

10:15 p.m.

New subject: Interesting study -- spam on postedaddresses...

On Wed, 20 Feb 2002, Chuq Von Rospach wrote:

...

...
If you've got a database mapping arbitrary number/name/string to an email address, then why not just have a web form that sends mail to that address knowing only the arbitrary value (and never divulge the email address)?

Basically, what I'm proposing. And I'm more and more convinced it's the right way to do this, for all that web forms are less personal than sending email directly. I think admins have to make themselves accessible. I don't think they have to make themselves accessible on the user's terms... Another of those tradeoffs.

I was meaning the archives. We have a database describing lists, so we can handle that.

-Dale

Dale Newfield

10:25 p.m.

New subject: Interesting study -- spam on postedaddresses...

On Thu, 21 Feb 2002, John Morton wrote:

...

Actually, the reason not to use it is that it can be used to spam anyone whose id mapping you can grab from the archive!

That's a separate issue and can have a separate solution. Make the form smart--for example, make it only accept 10 messages from a single IP address in a single day.
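A minimal sketch of such a limit, assuming an in-memory sliding 24-hour window keyed by client IP. The names and the storage are hypothetical; a real deployment would need persistence and would have to decide how to treat proxies that share one address.

```python
import time
from collections import defaultdict, deque

MAX_PER_DAY = 10                 # the figure suggested in the message above
WINDOW_SECONDS = 24 * 60 * 60

# ip -> timestamps of messages accepted within the current window
_recent = defaultdict(deque)


def allow_message(ip, now=None):
    """Return True and record the hit if this IP is under its daily limit."""
    if now is None:
        now = time.time()
    stamps = _recent[ip]
    # Drop timestamps that have aged out of the 24-hour window.
    while stamps and now - stamps[0] >= WINDOW_SECONDS:
        stamps.popleft()
    if len(stamps) >= MAX_PER_DAY:
        return False
    stamps.append(now)
    return True
```

The form handler would call `allow_message(client_ip)` before relaying anything and show an error page when it returns False.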

If we want/expect Mailman to succeed, then there will be enough incentive for someone to break the obfuscation mechanism. Are you suggesting we restrict access to part of Mailman's source code? Are you suggesting that with the source I can't reverse-engineer every obfuscation (as opposed to information removing) system you try? Why add more points of failure into a system if they don't gain you anything?

Basically it looks to me like there ultimately can be no successful obfuscation technique. Why not instead simply remove the information and ONLY provide web-forms? (Again, I'm talking only about archives--I think at least some mailto: is required in case of systemic failures.)

-Dale

John W Baxter

10:26 p.m.

New subject: Interesting study -- spam on postedaddresses...

At 23:15 -0500 2/20/2002, Dale Newfield wrote:

...

On Wed, 20 Feb 2002, Damien Morton wrote:

...
I still think the email-address-as-jpeg solution is prohibitively expensive to reverse; effectively impossible for machines, entirely easy for people.

...

It can't be enlarged for people that have poor vision.

Opera is a counter-example, but doesn't defeat your point.

(Tidbits the cat enlarged the web page my Windows laptop was idling on to 200% earlier today in a walk across the keyboard: Opera has keystrokes for nearly everything.)

--John

-- John Baxter jwblist@olympus.net Port Ludlow, WA, USA

Damien Morton

21 Feb

5:28 a.m.

New subject: Interesting study -- spam on postedaddresses...

...

From: Dale Newfield [mailto:dale@newfield.org]

On Wed, 20 Feb 2002, Damien Morton wrote:

...
I still think the email-address-as-jpeg solution is prohibitively expensive to reverse; effectively impossible for machines, entirely easy for people.

But it does have drawbacks.

It only works with graphical browsers.

This is true. We are in the 21st century now. Expecting a graphical client isn't such a huge leap of faith, unless we allow ourselves to be guided by recidivist or luddite lynx users and their ilk.

...

It can't be enlarged for people that have poor vision.

This is true, for the public archives.

...

It can be reverse-engineered -- all they have to do is decode a single font, then they're all simple to snag.

Assuming you use a single font. Assuming you don't add some noise to the resulting image. Assuming you don't do some geometric distortion to the resulting image.

To reverse engineer, a harvester would have to examine pretty much every image it finds, OCR it with some fantastic military grade image recognition software, and see if there's an email address buried in there.

As I said, "prohibitively expensive to reverse"

...

In fact, as someone with lots of computer graphics experience, I'd say it would be almost no harder to write code to decode them than it would be to write code to generate them.

As someone with lots of computer graphics experience, you will probably know that OCR is hard. It's even harder with a non-cooperating document, hidden amongst many other documents.

...

...
Web Forms for contacting the admin cold. If the admin replies, you can continue the conversation via email.

Right, assuming the web form doesn't break.

In my experience, the most likely route to a web form breaking is if the email address it sends to breaks.

...

...
Private and Public views of the archives.

Private archives are restricted to list members and those that can pass a reverse turing test.

People keep using this term, but I'm not sure what they mean, or if I trust that they'd be so reliable...

Some examples of reverse turing tests (http://www.captcha.net/):

http://www.captcha.net/cgi-bin/bongo
http://www.captcha.net/cgi-bin/stumpy
http://drive.to/research (this one uses audio)
http://www.captcha.net/gimpy.html

Any of those tests can be implemented in Python using PIL.

Between an audio test and a visual test, you've got the blind and the deaf covered.

...

...
Public archives render all email addresses as jpegs.

If they're automatically generated, it'd be easier to create pngs or gifs, or lots of other formats than jpgs. Think about this, though--how do you actually generate the images and serve them properly without either including the email address in the html code anyway (so the img request specifies what image to generate), or building a whole database mapping arbitrary numbers to email addresses (so they can either be generated on the fly or stored pre-generated). Once you've got that database, why not just have that database front a web form instead of displaying the address?

I suggested JPEGs because they are computationally more expensive to decode than other formats. Also: compression is lossy and adds a certain amount of noise to an image.

Generating and serving the images would be done as follows:

filename = md5.new('list-specific-salt-string' + 'email@server.com').hexdigest() + '.jpg'
if not exists(filename):
    img = render_email('email@server.com')
    img.save(filename)

Then you replace every occurrence of 'email@server.com' with '<img src="%s">' % filename
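For readers on current Python, the naming and substitution steps might be sketched as follows, assuming `hashlib` in place of the old `md5` module and a deliberately loose address regex. `image_name`, `replace_addresses` and `ADDRESS_RE` are hypothetical names, and the rendering step (`render_email` in the message) is left out because the thread does not define it.

```python
import hashlib
import re

LIST_SALT = "list-specific-salt-string"  # placeholder, as in the message above

# A deliberately simple pattern; real address syntax is more permissive.
ADDRESS_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def image_name(address, salt=LIST_SALT):
    """Derive the image filename from the salted address, as proposed."""
    digest = hashlib.md5((salt + address).encode("utf-8")).hexdigest()
    return digest + ".jpg"


def replace_addresses(html):
    """Swap every address-looking string for an <img> tag naming its image."""
    return ADDRESS_RE.sub(
        lambda match: '<img src="%s">' % image_name(match.group(0)), html
    )
```

One consequence of deriving the name from the address: anyone who learns the salt can hash a list of candidate addresses and compare the results against the filenames, so the salt has to stay private for the scheme to hide anything.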

Replacing the email addresses with a link to a webform would be another, perfectly acceptable solution, assuming you can get over your own objections to web forms.

Nigel Metheringham

6:37 a.m.

New subject: Interesting study -- spam on postedaddresses...

On Thu, 2002-02-21 at 13:28, Damien Morton wrote:

...

...
From: Dale Newfield [mailto:dale@newfield.org]

It only works with graphical browsers.

This is true. We are in the 21st century now. Expecting a graphical client isn't such a huge leap of faith, unless we allow ourselves to be guided by recidivist or luddite lynx users and their ilk.

You haven't been following this, have you...

Chuq Von Rospach wrote yesterday in response to Dale's point:-

...

This is a very good point. I mentioned ADA compliance yesterday. To be ADA compliant, if you rendered the e-mail address as a graphic, you'd also have to put the text into the ALT tag. Which would enable it for lynx and sight-limited solutions -- and make putting into a graphic kinda meaningless. So you can't use this approach unless you want to ignore the ADA and lock out your blind users from those functions.

...

I'm not willing to make that tradeoff. While I'm not going to live or die on the ADA compliance issue, I think it's important to keep it in mind because it forces us to focus on more than the "easy" case or the "geek" case and worry about solutions that work across the spectrum of users, from the AOL newbie to Jay. We can't solve problems just for Jay, or just for Newbies, we have to find a solution that works as well as possible for as many of those groups as possible. ADA compliance is a useful strawman that keeps us focussed away from "I want it this way, so that's the right way".

plus enforcing a minimum browser standard (other than minimal text/html) is going to hit deep water with the various PDAs, phones, WAP and other stuff that almost has real browsers on.

And insulting lynx users isn't a way to increase your expected life span. Go do something less controversial like arguing the advantages of vi in the emacs news groups.

Nigel.

-- [ Nigel Metheringham Nigel.Metheringham@InTechnology.co.uk ] [ Phone: +44 1423 850000 Fax +44 1423 858866 ] [ - Comments in this message are my own and not ITO opinion/policy - ]

Damien Morton

7:27 a.m.

New subject: Interesting study -- spam on postedaddresses...

...

From: Nigel Metheringham

...
...
From: Dale Newfield [mailto:dale@newfield.org]

It only works with graphical browsers.

This is true. We are in the 21st century now. Expecting a graphical client isn't such a huge leap of faith, unless we allow ourselves to be guided by recidivist or luddite lynx users and their ilk.

You haven't been following this, have you...

I have actually, although I do admit I probably stepped over the line there.

...

Chuq Von Rospach wrote yesterday in response to Dale's point:-

...
This is a very good point. I mentioned ADA compliance yesterday. To be ADA compliant, if you rendered the e-mail address as a graphic, you'd also have to put the text into the ALT tag. Which would enable it for lynx and sight-limited solutions -- and make putting into a graphic kinda meaningless. So you can't use this approach unless you want to ignore the ADA and lock out your blind users from those functions.

Chuq, you wouldn't have to do this if it rendered the purpose of emails-as-jpegs invalid.

...

...
I'm not willing to make that tradeoff. While I'm not going to live or die on the ADA compliance issue, I think it's important to keep it in mind because it forces us to focus on more than the "easy" case or the "geek" case and worry about solutions that work across the spectrum of users, from the AOL newbie to Jay. We can't solve problems just for Jay, or just for Newbies, we have to find a solution that works as well as possible for as many of those groups as possible. ADA compliance is a useful strawman that keeps us focussed away from "I want it this way, so that's the right way".

plus enforcing a minimum browser standard (other than minimal text/html) is going to hit deep water with the various PDAs, phones, WAP and other stuff that almost has real browsers on.

I have been proposing the use of advanced browser standards for the 'public' archives, i.e. those archives that are accessible to the world, be it users, email harvesters, or the spiders of search engines. Making a private archive available to those who are list members or who are willing to authenticate themselves as human, and making the private archive plain unobfuscated text should mean that everyone is at least able to get what they need, if only after jumping through some hoops.

Those hoops could be a visual test, an audio test, or a list membership test (which depends on having provided a valid email address).

Further, the public archive would differ from the private archive only by the obfuscation of email addresses. That would be the only difference.

I wonder if the ADA would accept the need to obscure email addresses, and I wonder if they would accept the extra authentication step required to get at the unobscured email address? Would they understand that it protects all mailman users, including the disabled?

Would Lynx users and other browser-disadvantaged users accept the extra authentication/authorisation step to get at the unobscured email addresses? Would they understand that it protects _them_ as well?

...

And insulting lynx users isn't a way to increase your expected life span. Go do something less controversial like arguing the advantages of vi in the emacs news groups.

Agreed, apologies to recidivists, luddites and lynx users :)

Dale Newfield

8:20 a.m.

New subject: Interesting study -- spam on postedaddresses...

On Thu, 21 Feb 2002, Damien Morton wrote:

...

OCR is hard

OCR is hard mostly because of the analog components (and the variety of fonts that exist). If you are generating the image digitally (and with a limited set of fonts), most of the OCR problems go away.
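A toy illustration of that claim, assuming a single fixed-width bitmap font and no noise: when the image is assembled from known glyphs, decoding is a table lookup rather than general OCR. The 3x5 font and the names below are invented for the example. Mixed fonts, noise or geometric distortion, as suggested elsewhere in the thread, would defeat this exact-match approach and call for something more tolerant.

```python
# Toy 3x5 bitmap font covering only the characters used in the demo.
FONT = {
    "a": (".#.", "#.#", "###", "#.#", "#.#"),
    "b": ("##.", "#.#", "##.", "#.#", "##."),
    "@": ("###", "#.#", "#.#", "#..", "###"),
    ".": ("...", "...", "...", "...", ".#."),
}
GLYPH_WIDTH = 3
GLYPH_HEIGHT = 5

# Reverse table: glyph bitmap -> character.
LOOKUP = {glyph: char for char, glyph in FONT.items()}


def render(text):
    """Render text as a list of pixel rows, glyphs packed side by side."""
    return ["".join(FONT[ch][row] for ch in text) for row in range(GLYPH_HEIGHT)]


def decode(rows):
    """Recover the text by cutting fixed-width cells and looking each one up."""
    chars = []
    for x in range(0, len(rows[0]), GLYPH_WIDTH):
        cell = tuple(row[x:x + GLYPH_WIDTH] for row in rows)
        chars.append(LOOKUP[cell])
    return "".join(chars)
```

Here the decoder is about the same size as the renderer, which is the symmetry the argument relies on.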

...

Some examples of reverse turing tests. (http://www.captcha.net/)

It appears that each of those introduces non ADA compliant aspects. The first and third can be defeated with a database no larger than that needed to implement it, the third is unlikely to work on many platforms (audio dependencies kept it from working for me), and the fourth I couldn't even figure out as a human--not what we're looking for.

...

Between an audio test and a visual test, you've got the blind and the deaf covered.

And you've introduced lots of browser/platform dependencies that mean you can't use new low-bandwidth platforms, like WAP.

-Dale

Dale Newfield

8:28 a.m.

New subject: Interesting study -- spam on postedaddresses...

On Thu, 21 Feb 2002, Damien Morton wrote:

...

Making a private archive available to those who are list members

I haven't commented on this before, but the reason I find this solution lacking is that most mailman lists (in my experience) don't require list admin permission to join. If this is the hurdle, as a spammer I'd just create a hotmail account that I can automatically subscribe to any mailman mailing list, and then gain access to the honeypot.

-Dale

Chuq Von Rospach

9:01 a.m.

New subject: Interesting study -- spam on postedaddresses...

On 2/21/02 5:28 AM, "Damien Morton" <dm-temp-310102@nyc.rr.com> wrote:

...

...
It only works with graphical browsers.

This is true. We are in the 21st century now. Expecting a graphical client isn't such a huge leap of faith, unless we allow ourselves to be guided by recidivist or luddite lynx users and their ilk.

Or those who are sight-limited and using talker browsers, but the heck with the blind and aged, we don't need them anyway.

Or those on wireless browsers, like cell phones, PDA's, blackberries and the like. And those can be ignored, they're merely the fastest growing segment of the industry.

(grin)

-- Chuq Von Rospach, Architech chuqui@plaidworks.com -- http://www.chuqui.com/

No! No! Dead girl, OFF the table!

Damien Morton

9:05 a.m.

New subject: Interesting study -- spam on postedaddresses...

...

From: Dale Newfield

On Thu, 21 Feb 2002, Damien Morton wrote:

...
OCR is hard

OCR is hard mostly because of the analog components (and the variety of fonts that exist). If you are generating the image digitally (and with a limited set of fonts), most of the OCR problems go away.

You're assuming a simplistic implementation. The use of a single font, and the absence of noise or distortion. At any rate, it's certainly much harder than writing a perl regex, both in terms of brainpower and in terms of computing power required.

...

...
Some examples of reverse turing tests. (http://www.captcha.net/)

It appears that each of those introduces non ADA compliant aspects. The first and third can be defeated with a database no larger than that needed to implement it, the third is unlikely to work on many platforms (audio dependencies kept it from working for me), and the fourth I couldn't even figure out as a human--not what we're looking for.

You're assuming a simplistic implementation; a database of words and images. A sophisticated implementation would generate images from random words with random distortions added, sounds by overlaying random words with random backgrounds.

You're also ignoring the third test, which is list membership. If you're not capable of passing the reverse turing tests offered, you can always join the list for unobscured access.

...

...
Between an audio test and a visual test, you've got the blind and the deaf covered.

And you've introduced lots of browser/platform dependencies that mean you can't use new low-bandwidth platforms, like WAP.

You're ignoring the third test offered, which is list membership. 'enter your email address and password here'.

Between the three kinds of tests, a person who desires at least the same functionality as is offered today, can do so, no matter what platform they are on.

Let me reiterate that what is being proposed here is the obscuration of email addresses in the public archives; that is, the archives available to the world for casual inspection.

Perhaps it might be fruitful to look at omitting the email addresses in the public archives entirely. That would certainly be ADA compliant, and would be usable by anyone with any HTML 1.0 capable browser.
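A minimal sketch of what omitting addresses could look like, assuming the archiver has each message body available as plain text. The pattern is deliberately loose, and the name `scrub_addresses` and the placeholder text are invented for illustration; this is not Mailman's archiver code.

```python
import re

# Loose pattern for things that look like email addresses. It is not a
# full address parser and will miss exotic forms; good enough for a sketch.
ADDRESS_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def scrub_addresses(text, placeholder="[address removed]"):
    """Return text with anything that looks like an email address replaced."""
    return ADDRESS_RE.sub(placeholder, text)


print(scrub_addresses('On 2/21/02, "Jane Doe" <jane@example.org> wrote:'))
```

Because the output is ordinary text, it would read the same in Lynx, a screen reader, or a graphical browser.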

As I see it, the questions are:

Is it desirable to prevent the whole world seeing email addresses in mailman archives?
if yes
    should there be public and private archives, with the public archive obscuring addresses?
    if yes
        how should the access to the private archives be controlled?
            list membership?
            reverse Turing tests?
            other?
        what should go into the public archives?
            obscured email?
                email as images?
                text based obfuscation?
            links to web form email?
            omit email addresses entirely?
            other?
    else if no
        should an obfuscation scheme be used at all?
        if yes
            what obfuscation scheme(s) should be used?
                obscured email?
                    email as images?
                    text based obfuscation?
                links to web form email?
                omit email addresses entirely?
                other?
        else if no
            talking in circles
else if no
    end of conversation

Dale Newfield

9:15 a.m.

New subject: Interesting study -- spam on postedaddresses...

On Thu, 21 Feb 2002, Damien Morton wrote:

...

  should an obfuscation scheme be used at all?
  if yes
  	what obfuscation scheme(s) should be used?
  		obscured email?
  			email as images?
  			text based obfuscation?
  		**** links to web form email?
  		**** omit email addresses entirely?
  		other?

The two I marked with (****) above are not obfuscation schemes. They involve not the obfuscation of information, but rather its removal. (While there are reasons webforms are evil, they still provide a way to contact people whose email addresses have been removed.)

This is what I am advocating.

-Dale

Chuq Von Rospach

9:23 a.m.

New subject: Interesting study -- spam on postedaddresses...

On 2/21/02 8:28 AM, "Dale Newfield" <dale@newfield.org> wrote:

...

On Thu, 21 Feb 2002, Damien Morton wrote:

...
Making a private archive available to those who are list members

I haven't commented on this before, but the reason I find this solution lacking is that most mailman lists (in my experience) don't require list admin permission to join. If this is the hurdle, as a spammer I'd just create a hotmail account that I can automatically subscribe to any mailman mailing list, and then gain access to the honeypot.

This hits another aspect of my design philosophy. Don't sweat making one part of the system more secure than the other parts.

In this case, you hit a nail on the head. If a spammer really, really wants your subscribers, we can't stop him. They can simply subscribe to a list and harvest it as it comes across. Unless you choose to anonymize every bloody message -- a spammer will win if they're motivated enough, and a smart spammer will do so in a way you'll never find. Like setting up a hotmail address for each list, so you can't see that all 30 lists have the same address in common, and simply reading messages as they come by.

And since, inherently, you can't stop THAT, it makes no sense to make archives more secure than that. Any spammer smart enough to be willing to subscribe to a list to do their harvesting, you're going to have a very tough time stopping. Basically, you have to get lucky or hope they make a mistake of some sort.

So since you can't make the subscription process more secure than that -- why try to make the archives more secure than the subscription process? It's extra work for no real gain, because any spammer with a clue will go through the patio door in the backyard instead of the front door with the three deadlocks and the security gate...

-- Chuq Von Rospach, Architech chuqui@plaidworks.com -- http://www.chuqui.com/

Yes, I am an agent of Satan, but my duties are largely ceremonial.

Damien Morton

10:32 a.m.

New subject: Interesting study -- spam onpostedaddresses...

Interestingly enough, the first place I ever saw the reverse Turing test in use was in the signup for a Yahoo account.

"This step helps Yahoo! prevent automated registrations." http://edit.my.yahoo.com/config/eval_register?.partner=&.intl=us&.src=my&.last=

The objective should be to raise the cost of harvesting. As you say, it can't be prevented, but forcing a human into the loop can raise the cost substantially.

...

-----Original Message----- From: mailman-developers-admin@python.org [mailto:mailman-developers-admin@python.org] On Behalf Of Chuq Von Rospach Sent: Thursday, 21 February 2002 12:24 To: Dale Newfield; mailman-developers@python.org Subject: Re: [Mailman-Developers] Interesting study -- spam onpostedaddresses...


Mailman-Developers mailing list Mailman-Developers@python.org http://mail.python.org/mailman/listinfo/mailman-developers

Jay R. Ashworth

1:57 p.m.

New subject: Interesting study -- spam on postedaddresses...

On Thu, Feb 21, 2002 at 08:28:13AM -0500, Damien Morton wrote:

...

...
On Wed, 20 Feb 2002, Damien Morton wrote:

...
I still think the email-address-as-jpeg solution is prohibitively expensive to reverse; effectively impossible for machines, entirely easy for people.

But it does have drawbacks.

It only works with graphical browsers.

This is true. We are in the 21st century now. Expecting a graphical client isn't such a huge leap of faith, unless we allow ourselves to be guided by recidivist or luddite lynx users and their ilk.

And Chuq says *I'm* arrogant. There are lots of people who run their graphical browsers with J/Jscript off for security and images off for the same reason (much faster browsing) that I use Lynx.

And see above about wireless browsers, and below about the blind.

And get the phuque over yourself.

...

...
It can't be enlarged for people that have poor vision.

This is true, for the public archives.

...
It can be reverse-engineered -- all they have to do is decode a single font, then they're all simple to snag.

Assuming you use a single font. Assuming you don't add some noise to the resulting image. Assuming you don't do some geometric distortion to the resulting image.

To reverse engineer, a harvester would have to examine pretty much every image it finds, OCR it with some fantastic military grade image recognition software, and see if there's an email address buried in there.

It doesn't matter, really.

...

As I said, "prohibitively expensive to reverse"

And just imagine -- yet another way to make 15 bytes into 15 kilobytes. Yeah, the network operators oughtta like that. You get a commission?

...

Replacing the email addresses with a link to a webform would be another, perfectly acceptable solution, assuming you can get over your own objections to web forms.

We seem to keep conflating the "admin mailto problem" with the "list member mailto problem"; they have fairly widely diverging solutions.

Could we please be a bit more cautious about that?

Cheers, -- jra

Jay R. Ashworth jra@baylink.com Member of the Technical Staff Baylink RFC 2100 The Suncoast Freenet The Things I Think Tampa Bay, Florida http://baylink.pitas.com +1 727 647 1274

"If you don't have a dream; how're you gonna have a dream come true?" -- Captain Sensible, The Damned (from South Pacific's "Happy Talk")

John Morton

2 p.m.

New subject: Interesting study -- spam on postedaddresses...

On Friday 22 February 2002 05:28, Dale Newfield wrote:

...

On Thu, 21 Feb 2002, Damien Morton wrote:

...
Making a private archive available to those who are list members

I haven't commented on this before, but the reason I find this solution lacking is that most mailman lists (in my experience) don't require list admin permission to join. If this is the hurdle, as a spammer I'd just create a hotmail account that I can automatically subscribe to any mailman mailing list, and then gain access to the honeypot.

I think we're really getting into wild speculation territory here. No one will bother hacking the code to automatically get new free mail accounts (this requires staying up to date with the CGI interfaces that some range of mail services use for their join function), automatically join any mailing list they find (same problem as before, coupled with having an automated way of finding lists to plunder), and then go through the usual email confirmation step (OK, not hard if your mail service lets you POP mail from them).

No one is going to bother implementing and maintaining this attack while they can grep addresses straight out of Usenet, off the web and out of DNS. If at some point in the future, those sources dry up, then it might be time to rearm. If there's compelling evidence that private archives and voluntary address obfuscation methods are failing, then it's time to rearm. But let's just keep in mind that this will always be an arms race, and that at the end of the day, it's only junk mail.

John

Jay R. Ashworth

2:01 p.m.

New subject: Interesting study -- spam on postedaddresses...

On Thu, Feb 21, 2002 at 10:27:08AM -0500, Damien Morton wrote:

...

I wonder if the ADA would accept the need to obscure email addresses, and I wonder if they would accept the extra authentication step required to get at the unobscured email address? Would they understand that it protects all mailman users, including the disabled?

Stunningly unlikely...

...

Would Lynx users and other browser-disadvantaged users accept the extra authentication/authorisation step to get at the unobscured email addresses? Would they understand that it protects _them_ as well?

If each page had a link to the version of that same page that required authentication, so that I wouldn't have to go do a whole-nother damned search, yeah...

...

...
And insulting lynx users isn't a way to increase your expected life span. Go do something less controversial like arguing the advantages of vi in the emacs news groups.

Agreed, apologies to recidivists, luddites and lynx users :)

Nice to know that you understand now that those are three separate groups. :-)

Cheers, -- jra

Jay R. Ashworth jra@baylink.com Member of the Technical Staff Baylink RFC 2100 The Suncoast Freenet The Things I Think Tampa Bay, Florida http://baylink.pitas.com +1 727 647 1274

"If you don't have a dream; how're you gonna have a dream come true?" -- Captain Sensible, The Damned (from South Pacific's "Happy Talk")

Jay R. Ashworth

2:02 p.m.

New subject: Interesting study -- spam on postedaddresses...

On Thu, Feb 21, 2002 at 09:23:51AM -0800, Chuq Von Rospach wrote:

...

This hits another aspect of my design philosophy. Don't sweat making one part of the system more secure than the other parts.

And very well phrased.

...

In this case, you hit a nail on the head. If a spammer really, really wants your subscribers, we can't stop him. They can simply subscribe to a list and harvest it as it comes across. Unless you choose to anonymize every bloody message -- a spammer will win if they're motivated enough, and a smart spammer will do so in a way you'll never find. Like setting up a hotmail address for each list, so you can't see that all 30 lists have the same address in common, and simply reading messages as they come by.

And since, inherently, you can't stop THAT, it makes no sense to make archives more secure than that. Any spammer smart enough to be willing to subscribe to a list to do their harvesting, you're going to have a very tough time stopping. Basically, you have to get lucky or hope they make a mistake of some sort.

My problem is with your characterization of that as "smart". I don't think that requires a whole helluva lot of brains, myself.

...

Yes, I am an agent of Satan, but my duties are largely ceremonial.

Are you the guy who goes in the convenience store to get him cigarettes?

Cheers, -- jra

Jay R. Ashworth jra@baylink.com Member of the Technical Staff Baylink RFC 2100 The Suncoast Freenet The Things I Think Tampa Bay, Florida http://baylink.pitas.com +1 727 647 1274

"If you don't have a dream; how're you gonna have a dream come true?" -- Captain Sensible, The Damned (from South Pacific's "Happy Talk")

Chuq Von Rospach

2:20 p.m.

New subject: Interesting study -- spam on postedaddresses...

On 2/21/02 2:00 PM, "John Morton" <jwm@plain.co.nz> wrote:

...

I think we're really getting into wild speculation territory here. No one will bother hacking the code to automatically get new free mail accounts [...]

Nobody has bothered to do this YET. That we know of. But the spamhacks are evolving rapidly. More rapidly than the anti-spam hacks in many ways. I sure wouldn't depend on them never doing this. I'm not sure what we'd do if they did, either, but some aspects of it have happened to me in small ways, just not from the major spamhacks.

Fact is, if they want your subscribers, they can get them. Or more correctly, your subscribers that post -- but if everyone lurks in fear, why have a mail list? The question is, what can we do to make it as tough as we can for the spammers, without screwing it up for us (as admins) or our list users. If only because the harder we make it for them to hack us, the more likely they'll go somewhere else that's easier to crack...

On the other hand, if Mailman does become the de facto mail list standard, or one of a couple of key list servers, you can bet the spamhacks will focus on it, because if they can crack the code, they can crack a LOT of lists really fast. So we have the potential to become a target of our success, and we should be aware of that.

...

No one is going to bother implementing and maintaining this attack while they can grep addresses straight out of Usenet, off the web and out of DNS.

The "low hanging fruit" theory, or as I used yesterday, it's "the club" mentality. The Club (for those who don't catch my reference) is a big hunk o' steel you lock to your steering wheel. Its ability to slow down a car thief boils down to two things: how badly the thief wants YOUR car (vs. any car), and how many other cars they can steal more easily.

But what happens when other groups get smart too, and clean up the low hanging fruit? Depending on that to protect us is a false security, basically no better than the old security-by-obscurity issue. Given port scanners and the like, there IS no obscurity from the crackers any more.

-- Chuq Von Rospach, Architech chuqui@plaidworks.com -- http://www.chuqui.com/

Stress is when you wake up screaming and you realize you haven't fallen asleep yet.

Damien Morton

2:21 p.m.

New subject: Interesting study -- spam on postedaddresses...

...

From: Jay R. Ashworth

On Thu, Feb 21, 2002 at 10:27:08AM -0500, Damien Morton wrote:

...
I wonder if the ADA would accept the need to obscure email addresses, and I wonder if they would accept the extra authentication step required to get at the unobscured email address? Would they understand that it protects all mailman users, including the disabled?

Stunningly unlikely...

Would they accept it if the email addresses were omitted in the public archives, and an extra authentication step required to get into the private archives?

These might be questions better answered by the ADA themselves.

...

...
Would Lynx users and other browser-disadvantaged users accept the extra authentication/authorisation step to get at the unobscured email addresses? Would they understand that it protects _them_ as well?

If each page had a link to the version of that same page that required authentication, so that I wouldn't have to go do a whole-nother damned search, yeah...

Cookies can be your friend.

Remember, we aren't talking about putting obstacles in the way of getting access to pages, but merely to pages that contain raw email addresses.

John W Baxter

3:56 p.m.

New subject: Interesting study -- spam on postedaddresses...

At 12:15 -0500 2/21/2002, Dale Newfield wrote:

...

The two I marked with (****) above are not obfuscation schemes. They involve not the obfuscation of information, but rather its removal.

Oh, good...another debating point. Is removal the limiting case of obfuscation, or something different in kind? ;-)

--John

-- John Baxter jwblist@olympus.net Port Ludlow, WA, USA

John Morton

5:25 p.m.

New subject: Interesting study -- spam on postedaddresses...

On Friday 22 February 2002 11:20, Chuq Von Rospach wrote:

...

On 2/21/02 2:00 PM, "John Morton" <jwm@plain.co.nz> wrote:

...
I think we're really getting into wild speculation territory here. No one will bother hacking the code to automatically get new free mail accounts [...]

Nobody has bothered to do this YET. That we know of. But the spamhacks are evolving rapidly.

Well, let's find out shall we? Set up a honeypot private list containing a collection of free mail accounts, then cycle through the account every week checking for spam and making some postings to keep the traffic up. Enough with the armchair anthropology, already!
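One way to read the honeypot idea: subscribe a distinct throwaway address to each list, then treat any mail reaching that address that did not come through the list as a sign the address was harvested. A rough Python sketch under that assumption; `leaked_lists`, `seeds`, and the example addresses are invented here, and deciding whether a message is ordinary list traffic is left to the caller.

```python
from collections import Counter


def leaked_lists(seeds, received):
    """Count unsolicited messages per list.

    seeds maps each honeypot address to the list it was subscribed to.
    received is an iterable of (recipient, is_list_traffic) pairs, where
    is_list_traffic is True for ordinary postings delivered by the list.
    """
    hits = Counter()
    for recipient, is_list_traffic in received:
        list_name = seeds.get(recipient.lower())
        if list_name is not None and not is_list_traffic:
            hits[list_name] += 1
    return hits


seeds = {
    "hp-a1@example.com": "list-a",
    "hp-b7@example.com": "list-b",
}
received = [
    ("hp-a1@example.com", True),   # normal posting from list-a
    ("hp-b7@example.com", False),  # unsolicited mail to list-b's seed
    ("hp-b7@example.com", False),
]
print(dict(leaked_lists(seeds, received)))
```

Using one address per list is what makes the result attributable: a hit names the list whose membership or archive leaked.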

...

More rapidly than the anti-spam hacks in many ways. I sure wouldn't depend on them never doing this.

I agree. That's because we're in an arms race here. Email harvesters are probably evolving faster than the countermeasures because of the tendency to deploy one countermeasure and forget about it.

...

I'm not sure what we'd do if they did, either, but some aspects of it have happened to me in small ways, just not from the major spamhacks.

So basically you need to deploy a countermeasure, monitor its effectiveness, and deploy another when it fails. Repeat for as long as you consider it important, or can tolerate not resorting to private archives, and establishing better trust relationships with the subscribers.

...

Fact is, if they want your subscribers, they can get them. Or more correctly, your subscribers that post -- but if everyone lurks in fear, why have a mail list?

I think we all need to take a deep breath and say 'It's only junkmail'. They're not spending up large on your credit card or pouring sugar into your gas tank.

...

The question is, what can we do to make it as tough as we can for the spammers, without screwing it up for us (as admins) or our list users. If only because the harder we make it for them to hack us, the more likely they'll go somewhere else that's easier to crack...

Right. So let's go with contact forms for list admins, and slashdot style, per user configurable address mangling for the archives. Let's do a little research into the ongoing effectiveness of these methods, too, so we know when it's time to implement something more expensive.
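As a rough illustration of per-user configurable mangling, here is a small Python sketch. The scheme names ("at-dot", "nospam", "hide-domain") and the `mangle` function are made up for this example; they are not Slashdot's or Mailman's actual options.

```python
def mangle(address, scheme="at-dot"):
    """Return address rewritten under the named (hypothetical) scheme."""
    local, _, domain = address.partition("@")
    if scheme == "none":
        return address
    if scheme == "at-dot":
        return f"{local} at {domain.replace('.', ' dot ')}"
    if scheme == "nospam":
        return f"{local}@NOSPAM.{domain}"
    if scheme == "hide-domain":
        return f"{local}@..."
    raise ValueError(f"unknown mangling scheme: {scheme!r}")


# Per-subscriber preference, as it might be stored with other list options.
preferences = {"jane@example.org": "at-dot", "bob@example.net": "hide-domain"}

for addr, scheme in preferences.items():
    print(mangle(addr, scheme))
```

Keeping the scheme a per-subscriber setting means someone who starts seeing more junk can switch to a different mangling without waiting on the list admin.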

...

On the other hand, if Mailman does become the de facto mail list standard, or one of a couple of key list servers, you can bet the spamhacks will focus on it, because if they can crack the code, they can crack a LOT of lists really fast. So we have the potential to become a target of our success, and we should be aware of that.

It's probably one of the top three or four already. Do listserv and majordomo admins have a major spam problem?

The two above techniques will work fine. If I, as a list subscriber feel that the signal to noise ratio is dropping, I can change my address mangling scheme. Hiding the otherwise web published list admin address behind a form should just about protect it by that vector for all time as it will just never be worthwhile hitting a collection of forms when you can get vast lists of addresses elsewhere.

(of course you have to publish the mailing list address, so you can deduce the admin address from that...:-)

...

But what happens when other groups get smart too, and clean up the low hanging fruit? Depending on that to protect us is a false security, basically no better than the old security-by-obscurity issue. Given port scanners and the like, there IS no obscurity from the crackers any more.

The problem with obscurity as a security tool is that it's not reliable. You may as well assume that one day your secret will be out, so the decision as to where and how to deploy it needs to be made based on the cost of obscuring, the cost of having the information revealed, the cost of reimplementing the system to replace the obscured part, and the size of the window of opportunity created before you can fix the problem.

Passwords, passphrases, keys and so forth are all examples of security through obscurity. They work because it's very easy to replace them; they work most effectively in systems that are good at detecting that they've been compromised. Stripping identity strings from network daemons is another security through obscurity technique. No one would rely on it to protect them, but it does make the job harder for attackers and easier for the defenders: attackers have to try a lot more things to detect what software is behind the port, or just brute force it with known attacks, greatly increasing the detection and response time available to defenders.

Obscurity is useful. In our case, it's the only prevention tool we have. Email addresses are not secrets, but we still want to protect them from the bad guys while making them available to the good guys. We will never solve this problem. Even if you made all the subscribers sign a contract promising not to reveal the addresses of the list membership to non-subscribers, and had reason to trust that they were not spammers, someone could always go over to the dark side. An Outlook virus could always be used as the spam vector. And so on.

The best we can do here is implement something simple now that gets the job done, and continuously test it to see if it's still good enough. When it's not, we build another countermeasure.

John

Peter C. Norton

5:30 p.m.

New subject: Can someone change the subject?

OK, the original article has long ago ceased to be an interesting one, and the conversation in general has veered towards raw flamage. Please, whoever wants to continue this, please change the subject to read something like "overblown flamewar".

-Peter

-- The 5 year plan: In five years we'll make up another plan. Or just re-use this one.

Chuq Von Rospach

5:36 p.m.

New subject: Interesting study -- spam on postedaddresses...

On 2/21/02 5:25 PM, "John Morton" <jwm@plain.co.nz> wrote:

...

...
Nobody has bothered to do this YET. That we know of. But the spamhacks are evolving rapidly.

Well, let's find out shall we? Set up a honeypot private list containing a collection of free mail accounts, then cycle through the account every week checking for spam and making some postings to keep the traffic up. Enough with the armchair anthropology, already!

Um, John? I've been doing that for months. It's a standard tactic I use to test for archive harvests. No offense, but given I'd already thought of the "subscribe and harvest" attack, wouldn't you think I also would have looked for ways to detect it?

I just don't like to talk about it. One has to think the harvesters are listening. I don't like giving away too many secrets -- but at the same time, it's something we have to share ideas and concepts over...

...

So basically you need to deploy a countermeasure, monitor its effectiveness, and deploy another when it fails. Repeat for as long as you consider it important, or can tolerate not resorting to private archives, and establishing better trust relationships with the subscribers.

Yup. Sounds familiar.

...

...
Fact is, if they want your subscribers, they can get them. Or more correctly, your subscribers that post -- but if everyone lurks in fear, why have a mail list?

I think we all need to take a deep breath and say 'It's only junkmail'. They're not spending up large on your credit card or pouring sugar into your gas tank.

I won't argue. I expect Jay will pop up shortly and do it for me. Which is, I think, the point. Just because you aren't too sensitive to the mail doesn't mean others aren't -- so we have to keep all of the views in mind. And this is a case where I actually side more on your view, but still understand the need to manage this for those that don't have my tolerance level.

...

It's probably one of the top three or four already. Do listserv and majordomo admins have a major spam problem?

Majordomo I did. Majordomo II? I dunno. Ditto listserv. I simply haven't looked.

...

(of course you have to publish the mailing list address, so you can deduce the admin address from that...:-)

Only if you don't change them. Making them standard might not be a good idea, once they're hidden behind contact forms.

...

The problem with obscurity as a security tool is that it's not reliable.

It only works until it fails, and then you can't fix it. And I've found it invariably fails at 10PM on a Friday night, when you're about to leave for the weekend -- unless it's 2PM on a Thursday with a Friday deadline.

...

Obscurity is useful. In our case, it's the only prevention tool we have.

I'm not sure obscurity is the right word. Most of what we're talking about is more of a cloaking effort.

-- Chuq Von Rospach, Architech chuqui@plaidworks.com -- http://www.chuqui.com/

The first rule of holes: If you are in one, stop digging.

John Morton

6:38 p.m.

New subject: Save the world from spam

On Friday 22 February 2002 14:36, Chuq Von Rospach wrote:

...

On 2/21/02 5:25 PM, "John Morton" <jwm@plain.co.nz> wrote:

...
...
Nobody has bothered to do this YET. That we know of. But the spamhacks are evolving rapidly.

Well, let's find out shall we? Set up a honeypot private list containing a collection of free mail accounts, then cycle through the account every week checking for spam and making some postings to keep the traffic up. Enough with the armchair anthropology, already!

Um, John? I've been doing that for months. It's a standard tactic I use to test for archive harvests. No offense, but given I'd already thought of the "subscribe and harvest" attack, wouldn't you think I also would have looked for ways to detect it?

Excellent. Would you mind publishing an analysis so we can start making some informed decisions as to what methods are effective?

...

I just don't like to talk about it. One has to think the harvesters are listening. I don't like giving away too many secrets -- but at the same time, it's something we have to share ideas and concepts over...

Wah! Spammers aren't the NSA/Red Menace/Grey Aliens. I think we can and should be discussing what they're actually up to if we want to find good methods of dealing with it. Don't get me started on full disclosure :-)

...

...
I think we all need to take a deep breath and say 'It's only junkmail'. They're not spending up large on your credit card or pouring sugar into your gas tank.

I won't argue. I expect Jay will pop up shortly and do it for me. Which is, I think, the point. Just because you aren't too sensitive to the mail doesn't mean others aren't -- so we have to keep all of the views in mind. And this is a case where I actually side more on your view, but still understand the need to manage this for those that don't have my tolerance level.

As a list admin, I'd like to inform my subscribers about their level of exposure, and empower them to decide whether their email address will appear in the archives, and how. I'd also like to keep the signal to noise ratio on the admin address in a tolerable state without running too great a risk of throwing the baby out with the bathwater by dropping too many legitimate cries for help along with the processed pig product.

I'd like it if mailman would help me out with these things, but I don't want to _have_ to use ADA/text only browser busting jpeg addresses and reverse turing tests, and I don't want to have to use web form access to addresses in the archive as I won't trust that code until a lot of security geeks have looked it over.

...

Only if you don't change them. Making them standard might not be a good idea, once they're hidden behind contact forms.

Check. Add that to the admin form wishlist.

...

...
The problem with obscurity as a security tool is that it's not reliable.

It only works until it fails, and then you can't fix it. And I've found it invariably fails at 10PM on a Friday night, when you're about to leave for the weekend -- unless it's 2PM on a Thursday with a Friday deadline.

As I said, obscurity works if you can replace one instance of compromised obscurity with another one fairly quickly. Works for passwords, and it can work well enough for this application, too.

...

...
Obscurity is useful. In our case, it's the only prevention tool we have.

I'm not sure obscurity is the right word. Most of what we're talking about is more of a cloaking effort.

That's because email addresses aren't secrets. If you can think of a better method than address mangling or hiding behind web forms, do tell. Personally, I'm willing to consider those good enough for the time being.

John

Chuq Von Rospach

7:36 p.m.

New subject: Save the world from spam

On 2/21/02 6:38 PM, "John Morton" <jwm@plain.co.nz> wrote:

...

...
Um, John? I've been doing that for months. It's a standard tactic I use to test for archive harvests. No offense, but given I'd already thought of the "subscribe and harvest" attack, wouldn't you think I also would have looked for ways to detect it?

Excellent. Would you mind publishing an analysis so we can start making some informed decisions as to what methods are effective?

Oh, that's easy. I haven't found evidence of any harvesting. I've also been able to find evidence of harvesting from OTHER sites' lists on at least three occasions where people complained to me my lists were being harvested.

...

Wah! Spammers aren't the NSA/Red Menace/Grey Aliens.

Whatever. You do what you think is best, I'll do what I think is best.

...

As a list admin, I'd like to inform my subscribers about their level of exposure, and empower them to decide whether their email address will appear in the archives, and how. I'd also like to keep the signal to noise ratio on the admin address in a tolerable state without running too great a risk of throwing the baby out with the bathwater by dropping too many legitimate cries for help along with the processed pig product.

I'd like it if mailman would help me out with these things, but I don't want to _have_ to use ADA/text only browser busting jpeg addresses and reverse turing tests, and I don't want to have to use web form access to addresses in the archive as I won't trust that code until a lot of security geeks have looked it over.

Understood. But -- there are going to have to be some compromises and tradeoffs made. The whole discussion was intended to look for them, because I don't believe you can have all of that successfully. Something will have to give.

...

...
...
Obscurity is useful. In our case, it's the only prevention tool we have.

I'm not sure obscurity is the right word. Most of what we're talking about is more of a cloaking effort.

That's because email addresses aren't secrets. If you can think of a better method than address mangling or hiding behind web forms, do tell. Personally, I'm willing to consider those good enough for the time being.

You know, now that I think of it, there's another approach: you don't get the admin's email address until you authenticate. Then you get it. If you're a list subscriber, you authenticate to the same level as the list is authenticated. If you're not, Mailman sends you an e-mail with the address in it (or FROM the address, so you can merely reply to it). No valid email address, no access to the admin. And if you do that, you can also set up a blackhole for known abusive addresses, shutting out the trolls.
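A minimal sketch of that confirm-before-reveal flow, in Python. Nothing here is Mailman code: `SECRET_KEY`, `BLACKHOLE`, and the function names are hypothetical, and the mail-sending step is left out. The idea is only that the token mailed to the requester proves they can read mail at the address they gave.

```python
import hashlib
import hmac

SECRET_KEY = b"change-me"           # hypothetical per-site secret
BLACKHOLE = {"troll@example.net"}   # hypothetical set of known abusive addresses


def normalize(address: str) -> str:
    """Lower-case and trim an address so comparisons are consistent."""
    return address.strip().lower()


def make_token(requester: str) -> str:
    """Token that would be mailed to the requester as the challenge."""
    msg = normalize(requester).encode("utf-8")
    return hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()


def reveal_admin_address(requester: str, token: str, admin: str):
    """Return the admin address only for a confirmed, non-blackholed requester."""
    if normalize(requester) in BLACKHOLE:
        return None
    if hmac.compare_digest(make_token(requester), token):
        return admin
    return None
```

A requester who never receives the challenge mail (because the address is invalid) never learns the token, so never sees the admin address.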

Thoughts?

-- Chuq Von Rospach, Architech chuqui@plaidworks.com -- http://www.chuqui.com/

Very funny, Scotty. Now beam my clothes down here, will you?

John Morton

9:18 p.m.

New subject: Save the world from spam

On Friday 22 February 2002 16:36, Chuq Von Rospach wrote:

...

...
Excellent. Would you mind publishing an analysis so we can start making some informed decisions as to what methods are effective?

Oh, that's easy. I haven't found evidence of any harvesting. I've also been able to find evidence of harvesting from OTHER sites' lists on at least three occasions where people complained to me that my lists were being harvested.

And those lists had publicly accessible archives with no address mangling?

...

Understood. But -- there are going to have to be some compromises and tradeoffs made. The whole discussion was intended to look for them, because I don't believe you can have all of that successfully. Something will have to give.

Yep. Almost time to go back through the thread so far and summarize the options that have been discussed, I think.

...

...
That's because email addresses aren't secrets. If you can think of a better method than address mangling or hiding behind web forms, do tell. Personally, I'm willing to consider those good enough for the time being.

You know, now that I think of it, there's another approach: you don't get the admin's email address until you authenticate. Then you get it. If you're a list subscriber, you authenticate to the same level as the list is authenticated. If you're not, Mailman sends you an e-mail with the address in it (or FROM the address, so you can merely reply to it). No valid email address, no access to the admin. And if you do that, you can also set up a blackhole for known abusive addresses, shutting out the trolls.

Thoughts?

I think the list admin address is exposed to subscribers in the welcome message and monthly reminders already; I presume you mean that to see a web page with it, you'd have to log in first.

I think the problem with this is that the most likely reason a subscriber would email the admin is that they can't log into the site to change their settings, see the archive and so on, or they're trying to subscribe to the list but the email confirmation process is failing for some reason (this has happened to me on a couple of occasions due to MTA weirdness at the list end). Naturally, failures anywhere before the email confirmation process couldn't be reported, either.

This one doesn't look to be any better than the web form, except that it might work in an email only environment. Perhaps both?

John

Dale Newfield

9:36 p.m.

New subject: Interesting study -- spam on postedaddresses...

On Fri, 22 Feb 2002, John Morton wrote:

...

The best we can do here is implement something simple now that gets the job done, and continuously test it to see if it's still good enough. When it's not, we build another countermeasure.

I completely disagree. You argue for job security. I argue for better software. Temporary solutions are not solutions.

-Dale

John Morton

9:48 p.m.

New subject: Interesting study -- spam on postedaddresses...

On Friday 22 February 2002 18:36, Dale Newfield wrote:

...

On Fri, 22 Feb 2002, John Morton wrote:

...
The best we can do here is implement something simple now that gets the job done, and continuously test it to see if it's still good enough. When it's not, we build another countermeasure.

I completely disagree. You argue for job security. I argue for better software. Temporary solutions are not solutions.

Ok. Show me a solution that will protect list administrator addresses and publicly accessible list archives from email harvesting, while allowing list subscribers and members of the public the ability to contact the list admin in the event of a list related problem and allowing them to contact an individual personally in response to some message in the archive. The solution must not penalize text only web browser users, or the disabled, nor should it open up any other vectors for unsolicited mass emailing.

I'd really like to see one.

John

Dale Newfield

9:58 p.m.

New subject: Interesting study -- spam on postedaddresses...

On Fri, 22 Feb 2002, John Morton wrote:

...

Ok. Show me a solution

The point is that adding layer after layer of temporary solutions doesn't add up to an actual solution any more than not adding those layers. All it does add is more complexity to manage, more code to write and test, more annoyance to anyone trying to use the system, and more potential points of failure. Separate archives (public stripped of anything that looks like an email address, private unmodified), and an equivalent "give me archive access" path to the subscription path (through email) as has been suggested seems to be the best solution yet.
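As a rough illustration of "stripped of anything that looks like an email address", here is a small Python sketch. It is not taken from Mailman or its archiver; the pattern `EMAIL_RE`, the function name, and the placeholder text are all assumptions, and the pattern is deliberately loose rather than RFC-complete.

```python
import re

# Loose pattern for "anything that looks like an email address":
# a local part, an @, and a dotted domain. Assumed for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def strip_addresses(text: str, placeholder: str = "[address removed]") -> str:
    """Return text with every address-like token replaced by a placeholder."""
    return EMAIL_RE.sub(placeholder, text)


if __name__ == "__main__":
    line = 'On 2/21/02 6:38 PM, "John Morton" <jwm@plain.co.nz> wrote:'
    print(strip_addresses(line))
```

The private archive would keep the original text; only the public copy would pass through a filter like this.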

-Dale

John Morton

10:18 p.m.

New subject: Interesting study -- spam on postedaddresses...

On Friday 22 February 2002 18:58, Dale Newfield wrote:

...

On Fri, 22 Feb 2002, John Morton wrote:

...
Ok. Show me a solution

The point is that adding layer after layer of temporary solutions doesn't add up to an actual solution any more than not adding those layers. All it does add is more complexity to manage, more code to write and test, more annoyance to anyone trying to use the system, and more potential points of failure.

This depends on just how temporary your 'solution' turns out to be, and its level of complexity and usability. I don't think anyone has really advocated any really kludgy hacks so far.

...

Separate archives (public stripped of anything that looks like an email address, private unmodified), and an equivalent "give me archive access" path to the subscription path (through email) as has been suggested seems to be the best solution yet.

Not bad; it looks fairly easy to implement. I'd build the archive access to be just like regular list access, except delivery is turned off by default, to keep it simple.

The problem is that if you accept that those nefarious agents of mass email will start auto-joining lists and plunder the private archive and message feed for addresses sometime in the future, then you have to implement another layer of hackery to detect and block that sort of thing. Does that make your suggestion any less of an actual solution? :-)

I'd still go as far as adding per user configurability for address display so people can adjust the option to suit their own level of hysteria.

John

Dale Newfield

10:25 p.m.

New subject: Interesting study -- spam on postedaddresses...

On Fri, 22 Feb 2002, John Morton wrote:

...

The problem is that if you accept that those nefarious agents of mass email will start auto-joining lists and plunder the private archive and message feed for addresses sometime in the future, then you have to implement another layer of hackery to detect and block that sort of thing. Does that make your suggestion any less of an actual solution? :-)

As was already pointed out, if the spambots get smart enough to subscribe, they don't need the archives--they just have to wait for the addresses to appear in their mailboxes. So once they've cleared that hurdle, nothing you do to the archives will help one bit.

...

I'd still go as far as adding per user configurability for address display so people can adjust the option to suit their own level of hysteria.

That adds tremendous management headaches for both users and admins, as well as difficult coding problems (since whenever one subscriber quotes another you need to figure that out and do multiple "how should I obfuscate YOUR email address?" lookups). Why not just ignore this non-solution and save everyone the headaches?

-Dale

Dale Newfield

10:27 p.m.

New subject: Interesting study -- spam on postedaddresses...

On Fri, 22 Feb 2002, John Morton wrote:

...

Not bad; it looks fairly easy to implement. I'd build the archive access to be just like regular list access, except delivery is turned off by default, to keep it simple.

I thought about that, but do you really want to send monthly password reminders to people that just wanted to look at the archives? (Or do we not send those to people with "nomail" set?)

-Dale

Stephen J. Turnbull

22 Feb

12:18 a.m.

New subject: Save the world from spam

...

...
...
...
...
"Chuq" == Chuq Von Rospach <chuqui@plaidworks.com> writes:

Chuq> You know, now that I think of it, there's another approach:
Chuq> you don't get the admin's email address until you
Chuq> authenticate. Then you get it. [...]

Chuq> Thoughts?

You've just reinvented TMDA, basically, except for the initial contact protocol being HTTP instead of SMTP. TMDA would also protect admins of publicly accessible lists (yes, I know that's basaqwards, Chuq, but it would keep them from aliasing the list admin address to /dev/null). I was going to suggest that a couple days ago, but assumed it was common knowledge here.

http://sourceforge.net/projects/tmda

Jason could probably be very helpful in tuning it to this specific application.

-- Institute of Policy and Planning Sciences http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Don't ask how you can "do" free software business; ask what your business can "do for" free software.

Stephen J. Turnbull

12:36 a.m.

New subject: Interesting study -- spam on postedaddresses...

I repeat myself, but only Chuq seems to have noticed the other post.

...

...
...
...
...
"John" == John Morton <jwm@plain.co.nz> writes:

John> This depends on just how temporary your 'solution' turns out
John> to be, and its level of complexity and usability. I don't
John> think anyone has really advocated any really kludgy hacks so
John> far.

AFAICT both the trivial /. style obfuscation and the image style obfuscation are kludges because they ignore the statistical nature of harvesting. This works in two ways.

First, since addresses are typically repeated but obfuscated in different ways, the probability that a given address gets harvested is much higher than the probability that any given obfuscated instance gets cracked. Second, you don't need to get 100% recognition, you don't even need to get 10% recognition, as long as you can process the bytes as fast as they come off the wire _and_ the number of harvested new addresses per megabyte is high enough.
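To put rough numbers on the first point, here is a small Python sketch. It assumes each obfuscated instance of an address is cracked independently with the same probability, which is a simplification of mine and not something claimed in the thread; the function names and the sample figures are likewise illustrative.

```python
def p_address_harvested(p_instance: float, n_instances: int) -> float:
    """Chance that at least one of n obfuscated copies of an address is cracked.

    Assumes independent instances with equal per-instance crack probability.
    """
    return 1.0 - (1.0 - p_instance) ** n_instances


def new_addresses_per_mb(addresses_per_mb: float, recognition_rate: float) -> float:
    """Expected harvest per megabyte at a given recognition rate."""
    return addresses_per_mb * recognition_rate


if __name__ == "__main__":
    # A 5% per-instance crack rate against an address that appears 20 times.
    print(f"{p_address_harvested(0.05, 20):.2f}")
```

Under those assumptions, a 5% per-instance success rate against an address that appears 20 times already exposes that address about 64% of the time, which is why low recognition rates can still pay off.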

There is a third, "equilibrium" problem with obfuscation. Image obfuscation has the serious drawback that it looks "provably secure" if you don't think about it carefully. If this encourages lots more people to post real addresses, the value of the harvest rises proportionately and thus obfuscation decreases achieved security.

I conclude that if obfuscated archives give a reasonable number of addresses per megabyte, and those addresses are drawn from a population that is not represented in other sources, spammers _will_ find cheap and dirty ways to achieve recognition, and then they will compete to improve it.

People have seriously advocated obfuscation, especially images.

Damien Morton

6:16 a.m.

New subject: Interesting study -- spam on postedaddresses...

...

From: Stephen J. Turnbull

First, since addresses are typically repeated but obfuscated in different ways, the probability that a given address gets harvested is much higher than the probability that any given obfuscated instance gets cracked. Second, you don't need to get 100% recognition, you don't even need to get 10% recognition, as long as you can process the bytes as fast as they come off the wire _and_ the number of harvested new addresses per megabyte is high enough.

<snip>

I conclude that if obfuscated archives give a reasonable number of addresses per megabyte, and those addresses are drawn from a population that is not represented in other sources, spammers _will_ find cheap and dirty ways to achieve recognition, and then they will compete to improve it.

People have seriously advocated obfuscation, especially images.

So obfuscation is imperfect, and the more effective it is, the more value there is in cracking it.

Would you say, then, that you're advocating public and private list archives, with email addresses omitted from the public archives, and the private archives available to list members only?

I'm not clear on what your position is.

A while ago, I laid out the decision/position tree, as I saw it. Only one person has clearly located their position in/on that tree, so I repeat it again.

I'm very interested to see where list members might locate their position in this decision tree. Please feel free to alter the tree, should your position not be included.

Is it desirable to prevent the whole world seeing email addresses in mailman archives?
If yes then
    should there be public and private archives, with the public archive protecting addresses?
    if yes
        how should the access to the private archives be controlled?
            list membership? (damien)
            reverse Turing tests? (damien)
            other?
        what should go into the public archives?
            obfuscated email?
                email as images? (damien)
                text based obfuscation?
            links to web form email? (damien)
            omit email addresses entirely?
            other?
    else if no
        should an address protection scheme be used at all?
        if yes
            what protection scheme(s) should be used?
                obscured email?
                    email as images?
                    text based obfuscation?
                links to web form email? (dale)
                omit email addresses entirely? (dale)
                other?
        else if no
            talking in circles
else if no
    end of conversation

Stephen J. Turnbull

7:53 a.m.

New subject: Interesting study -- spam on postedaddresses...

...

...
...
...
...
"Damien" == Damien Morton <dm-temp-310102@nyc.rr.com> writes:

Damien> So obfuscation is imperfect, and the more effective it is,
Damien> the more value there is in cracking it.

That's true, but that's not what I said.

What I said is it is weak enough that a small amount of effort brings some payoff to harvesting, and the more effort, the higher the payoff. Furthermore, even though it is therefore not very effective, it's easy to convince yourself it is, and this _perception_ generates more value for spammers.

Damien> Im not clear on what your position is.

My position is that (1) obfuscation is unlikely to last 6 months after it becomes widespread, and (2) it is an unsatisfactory method for inclusion as a standard in Mailman, because it is costly to develop, and costly to all the legitimate users both in immediate inconvenience and in false sense of security, while probably not slowing down the spammers much.

Beyond that, I don't have a position; I plan to ask my subscribers/ posters how they feel about it, and treat my own lists accordingly.

Jay R. Ashworth

10:32 a.m.

New subject: Interesting study -- spam on postedaddresses...

On Fri, Feb 22, 2002 at 09:16:20AM -0500, Damien Morton wrote:

...

Is it desirable to prevent the whole world seeing email addresses in mailman archives?
If yes then
    should there be public and private archives, with the public archive protecting addresses?
    if yes
        how should the access to the private archives be controlled?
            list membership? (damien)
            reverse Turing tests? (damien)
            other?
        what should go into the public archives?
            obfuscated email?
                email as images? (damien)
                text based obfuscation?
            links to web form email? (damien)
            omit email addresses entirely?
            other?
    else if no
        should an address protection scheme be used at all?
        if yes
            what protection scheme(s) should be used?
                obscured email?
                    email as images?
                    text based obfuscation?
                links to web form email? (dale)
                omit email addresses entirely? (dale)
                other?
        else if no
            talking in circles

Well, this is where I sit -- and please note that I'm discussing *implementation of actual lists*, not *what facilities I think Mailman should provide*; if you feel this invalidates my opinion, so be it -- and I think that characterizing it as "talking in circles" is a bit disingenuous, at best.

Please expand.

...

else if no
    end of conversation

Cheers, -- jra

Jay R. Ashworth jra@baylink.com Member of the Technical Staff Baylink RFC 2100 The Suncoast Freenet The Things I Think Tampa Bay, Florida http://baylink.pitas.com +1 727 647 1274

"If you don't have a dream; how're you gonna have a dream come true?" -- Captain Sensible, The Damned (from South Pacific's "Happy Talk")

barry＠zope.com

3:32 p.m.

New subject: Can someone change the subject?

...

...
...
...
...
"PCN" == Peter C Norton <spacey-mailman@lenin.nu> writes:

PCN> OK, the original article has long ago ceased to be an
PCN> interesting one, and the conversation in general has veered
PCN> towards raw flamage.  Please, whoever wants to continue this
PCN> please change the subject to read something like "overblown
PCN> flamewar".

I love it! I go away on vacation for a week and huge threads flame up and die out while I'm gone. I guess I can ignore the whole thing now right? <wink>.

Oh, and sorry about my broken vacation program. At least it totally refused to send out any vacation messages rather than puke all over all my mailing lists with thousands of useless notices.

I'm going to try to catch up this weekend, but I'll tell you, if we had gotten as much snow as I got email while I was away (even of the non-spam variety), I'd have stayed on the slopes another week.

Watch for a bunch of checkins too. Yes, I was just sick enough to bring my laptop with me. Hey, you gotta do something after the champagne's gone.

-Barry

barry＠zope.com

25 Feb

8:33 a.m.

New subject: Interesting study -- spam on postedaddresses...

...

...
...
...
...
"DN" == Dale Newfield <dale@newfield.org> writes:

DN> I thought about that, but do you really want to send monthly
DN> password reminders to people that just wanted to look at the
DN> archives?  (Or do we not send those to people with "nomail"
DN> set?)

We send password reminders to folks regardless of their delivery status. This actually makes sense in a VERP world because since password reminders are by nature personalized, we can piggyback the more accurate VERP bounce detection onto them.
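For readers unfamiliar with VERP, the idea is that each recipient gets a unique envelope sender with their own address encoded in it, so a bounce identifies the failing subscriber without parsing the bounce body. The Python sketch below uses the common `+mailbox=host` convention; the exact format Mailman uses is not stated in this thread, and the function names and example addresses are assumptions.

```python
def verp_encode(bounce_local: str, list_domain: str, recipient: str) -> str:
    """Build a per-recipient envelope sender from the list's bounce address."""
    mailbox, _, host = recipient.rpartition("@")
    return f"{bounce_local}+{mailbox}={host}@{list_domain}"


def verp_decode(envelope_sender: str) -> str:
    """Recover the subscriber address from a VERP-encoded envelope sender."""
    local, _, _domain = envelope_sender.rpartition("@")
    _prefix, _, encoded = local.partition("+")   # drop the bounce-address prefix
    mailbox, _, host = encoded.rpartition("=")   # last "=" separates mailbox/host
    return f"{mailbox}@{host}"


if __name__ == "__main__":
    sender = verp_encode("mylist-bounces", "lists.example.org", "dale@example.com")
    print(sender)
    print(verp_decode(sender))
```

Because a password reminder is already sent once per subscriber, giving each one its own envelope sender costs nothing extra, which is the piggybacking described above.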

Note that in MM2.1, people can turn off password reminders altogether.

-Barry

46 comments

10 participants

participants (10)

barry＠zope.com
Chuq Von Rospach
Dale Newfield
Damien Morton
Jay R. Ashworth
John Morton
John W Baxter
Nigel Metheringham
Peter C. Norton
Stephen J. Turnbull
