[Mailman-Developers] Interesting study -- spam on posted addresses...

22 Feb 2002 17:36:28 +0900

I repeat myself, but only Chuq seems to have noticed the other post.

>>>>> "John" == John Morton <jwm@plain.co.nz> writes:

    John> This depends on just how temporary your 'solution' turns out
    John> to be, and its level of complexity and usability. I don't
    John> think anyone has really advocated any really kludgy hacks so
    John> far.

AFAICT both the trivial /. style obfuscation and the image style
obfuscation are kludges because they ignore the statistical nature of
harvesting.  That statistical nature undermines obfuscation in two
ways.

First, since addresses are typically repeated but obfuscated in
different ways, the probability that a given address gets harvested is
much higher than the probability that any given obfuscated instance
gets cracked.  Second, a harvester doesn't need 100% recognition, or
even 10% recognition, as long as it can process the bytes as fast as
they come off the wire _and_ the number of newly harvested addresses
per megabyte is high enough.
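
To put rough numbers on both points, here is a minimal sketch.  It
assumes every obfuscated instance of an address is cracked
independently with the same probability; that independence, the
function names, and the figures in the usage note are illustrative
assumptions, not measurements of any real harvester.

```python
def harvest_probability(p_crack: float, instances: int) -> float:
    """Chance that at least one of `instances` obfuscated copies of the
    same address is cracked, assuming independent attempts (assumption)."""
    if not 0.0 <= p_crack <= 1.0:
        raise ValueError("p_crack must be between 0 and 1")
    if instances < 0:
        raise ValueError("instances must be non-negative")
    # Complement of "every single copy survives".
    return 1.0 - (1.0 - p_crack) ** instances


def expected_yield(addresses_per_mb: float, p_crack: float,
                   instances: int) -> float:
    """Expected harvested addresses per megabyte, given the number of
    distinct addresses per megabyte and copies of each (hypothetical)."""
    return addresses_per_mb * harvest_probability(p_crack, instances)
```

With a 10% per-instance recognition rate and 20 copies of an address
scattered through an archive, harvest_probability(0.1, 20) comes to
about 0.88, so a weak recognizer still collects most addresses that
are posted repeatedly.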

There is a third, "equilibrium" problem with obfuscation.  Image
obfuscation has the serious drawback that it looks "provably secure"
if you don't think about it carefully.  If this encourages lots more
people to post real addresses, the value of the harvest rises
proportionately and thus obfuscation decreases achieved security.

I conclude that if obfuscated archives give a reasonable number of
addresses per megabyte, and those addresses are drawn from a
population that is not represented in other sources, spammers _will_
find cheap and dirty ways to achieve recognition, and then they will
compete to improve it.

People have seriously advocated obfuscation, especially images.

-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
              Don't ask how you can "do" free software business;
              ask what your business can "do for" free software.