[Mailman-Developers] Interesting study -- spam on posted addresses...
Stephen J. Turnbull
21 Feb 2002 13:23:48 +0900
>>>>> "Chuq" == Chuq Von Rospach <email@example.com> writes:
Chuq> On 2/20/02 1:37 PM, "Damien Morton"
Chuq> <firstname.lastname@example.org> wrote:
>> As far as I can see they are using url/cgi encoding in the
>> email address. This is trivial to circumvent, as is using html
>> entities, or any other reversible scheme.
Chuq> With a constantly varying algorithm. So they obfuscate, but
Chuq> they never obfuscate in a predictable way. Which means if
Chuq> you're a spambot, you have to look at every byte of every
Chuq> page and attempt to de-obfuscate it in every possible way to
Chuq> see if it's obfuscated. You CAN do it, but you make it
Chuq> computationally massively expensive.
Er, last I heard "massively expensive" ~ "exponential". This is
O(n*m) where _n_ is the number of bytes and _m_ is the number of
obfuscations, and _m_ is bounded by user patience.
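To make the O(n*m) point concrete, here is a minimal sketch of what a harvester might do. It assumes only the two reversible schemes mentioned above (url/cgi encoding and html entities) plus the plain case. The names DECODERS, EMAIL_RE and harvest are made up for illustration and do not come from any real spambot. Each decoder is roughly one pass over the page, so the total work grows with the page size times the number of decoders.

```python
import html
import re
from urllib.parse import unquote

# Deliberately simple address pattern; an assumption for this sketch only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical decoder list: m entries, each roughly one pass over the page.
DECODERS = [
    lambda s: s,      # no obfuscation at all
    unquote,          # url/cgi encoding, e.g. %40 for "@"
    html.unescape,    # html entities, e.g. &#64; for "@"
]

def harvest(page):
    """Try every decoder on the whole page and collect anything address-like."""
    found = set()
    for decode in DECODERS:           # m decoders
        decoded = decode(page)        # about O(n) work each
        found.update(EMAIL_RE.findall(decoded))
    return found

page = "mail joe%40example.com or jane&#64;example.org or bob@example.net"
addresses = harvest(page)
```

Adding another reversible scheme costs the harvester one more entry in the list, not a combinatorial search.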
Nor do the spammers need to deobfuscate all the obfuscations. They
only need enough that they're getting a reasonable harvest rate. But
the people who post to /. etc tend to be repeat offenders, and the
obfuscation is random. So we lose as soon as the amount of address
content obfuscated in this way becomes noticeable.
And maybe before that, as many spammers seem to take address-hiding as
a personal offense, in the same way that crackers view passwords.
Institute of Policy and Planning Sciences http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Don't ask how you can "do" free software business;
ask what your business can "do for" free software.