[Spambayes] Latest spammer trick stymied

Tim Peters tim.one at comcast.net
Tue Apr 1 22:30:00 EST 2003

[Tim Stone]
> ...
> So what's your take on the slurping thing, Tim?

It could be valuable, although it seems more at home in a central (shared)
server kind of scheme, where the expenses (on all sides) of fetching content
can be incurred once for the benefit of many (I'm picturing a shared
dict/database mapping a URL to a token sequence -- there's no "ham or spam?"
judgment there, just a one-time fetching and pre-digesting of the referenced
content).
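
A minimal sketch of the shared URL-to-tokens mapping described above, in
Python. Everything named here (UrlTokenCache, the fetch callable, the toy
tokenizing regex) is an assumption made for illustration, not spambayes code.
It shows only the "fetch once, reuse for many" part and makes no ham/spam
judgment.

```python
import re


class UrlTokenCache:
    """Hypothetical shared cache: URL -> token sequence, with no verdict stored."""

    def __init__(self, fetch):
        # fetch is any callable taking a URL and returning page text;
        # it is injected so the cost of fetching is paid in one place.
        self._fetch = fetch
        self._tokens = {}
        self.fetch_count = 0

    def tokens_for(self, url):
        # Fetch and pre-digest a URL only the first time it is seen.
        if url not in self._tokens:
            self.fetch_count += 1
            text = self._fetch(url)
            # Toy tokenizer (an assumption): lowercase runs of letters/digits.
            self._tokens[url] = tuple(re.findall(r"[a-z0-9]+", text.lower()))
        return self._tokens[url]
```

In a central-server setup the dict would presumably be replaced by a shared
database, but the lookup-before-fetch shape would stay the same.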

One twist I didn't see mentioned is that spam web sites often get shut down
quickly, so failure to resolve a URL would be a useful (& sometimes
expensive (in time) to obtain!) clue too.
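
One way the "failure to resolve" clue might be turned into a token, sketched
in Python. The function name and the token strings are hypothetical; the only
real library calls are socket.gethostbyname and urllib.parse.urlsplit. The
resolver is passed in so the (possibly slow) DNS lookup can be swapped out or
wrapped with a timeout.

```python
import socket
from urllib.parse import urlsplit


def resolution_clue(url, resolve=socket.gethostbyname):
    """Return synthetic tokens describing whether the URL's host resolves."""
    host = urlsplit(url).hostname
    if not host:
        # Nothing to look up; the token name is made up for illustration.
        return ["url:no-host"]
    try:
        resolve(host)
    except OSError:
        # socket.gaierror is a subclass of OSError, so DNS failures land here.
        return ["url:unresolved"]
    # A host that resolves adds no clue in this sketch.
    return []
```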

The spambayes system has always scratched its head over (a) very short msgs,
and (b) long, chatty, "just folks" spam.  Fetching URL content could improve
classification of both.  The OP's scheme of invoking it only when the score
would otherwise be unsure was a neat idea.
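
The "only when unsure" gating can be sketched roughly as follows, in Python.
The function and parameter names are hypothetical, and the two cutoff values
are illustrative assumptions rather than anything stated in this thread;
score_fn stands in for whatever classifier produces a spam probability from
tokens.

```python
def score_with_optional_slurp(score_fn, tokens, url_tokens_fn,
                              ham_cutoff=0.20, spam_cutoff=0.90):
    """Score a message, fetching URL content only if the score is unsure."""
    score = score_fn(tokens)
    # Confident ham or confident spam: skip the expensive fetch entirely.
    if ham_cutoff < score < spam_cutoff:
        extra = url_tokens_fn()
        if extra:
            # Rescore with the message tokens plus the fetched-page tokens.
            score = score_fn(list(tokens) + list(extra))
    return score
```

Passing url_tokens_fn as a callable keeps the fetch lazy: it runs only for
messages that land in the unsure band.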

Integrating blacklist lookups as part of header analysis would be similar
(IMO) in many ways.

OTOH, I don't expect 1-URL spam to survive -- there's no motivation to click
the link.

