[Spambayes] Latest spammer trick stymied
richard at jowsey.com
Tue Apr 1 21:02:54 EST 2003
> From: Tim Stone - Four Stones Expressions
> > That's right. We really should try to solve this problem
> > with tokenization.
> Silly question, but is there actually a problem? The system isn't
> expected to be 100% perfect. Is this happening often enough to justify
> the effort?
That's a very good question, actually. IMHO, it's happening often
enough to matter: when your inbox is normally 99.9% spam-free and a
few of these low-mass particles suddenly start sneaking through, you
notice.
> I get a reasonable number of virus mails from big at boss.com, they
> generally come in as "unsure". After I train on 5 or 6 of them,
> they start coming in as spam. No problem. Won't this work here
> as well?
Apparently not. My proxy catches viruses too, real well! This is a
bit different, in that these subatomics are sent from randomly
generated sub-domains, with randomized senders, etc. The result is a
minimal and rapidly changing set of clues, so there's just no good
way to train on them quickly enough. It's damn annoying, is what...
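For anyone who wants to experiment, here is a rough sketch of one
tokenization idea for the random sub-domain case: emit a token for
every trailing suffix of a URL's host name, so that two messages
pointing at different throwaway sub-domains still share the tokens
for the parent domain. This is not how the SpamBayes tokenizer
actually works; the function name host_suffix_tokens and the
"url-host:" prefix are made up for illustration, and it does nothing
about randomized senders.

```python
from urllib.parse import urlsplit


def host_suffix_tokens(url):
    """Yield one token per trailing suffix of the URL's host name.

    Hypothetical helper, not part of SpamBayes. A random left-most
    label changes only the longest token; the shorter suffixes stay
    the same from one message to the next.
    """
    host = urlsplit(url).hostname or ""
    labels = [part for part in host.lower().split(".") if part]
    # Walk from the shortest suffix ("com") to the full host name.
    for i in range(len(labels) - 1, -1, -1):
        yield "url-host:" + ".".join(labels[i:])


if __name__ == "__main__":
    for token in host_suffix_tokens("http://xk3j9.example.com/buy"):
        print(token)
```

Whether the shared suffix tokens would pick up enough spam weight to
matter is exactly the kind of thing that would need testing against
a real corpus.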
> If the issue is with the person who was surprised that Spambayes
> didn't identify an "obvious" spam, maybe it's just an education
Nope, the tester in question is a very educated consumer. I can see
where you're going, but the general public expects a so-called
"filtering proxy service" to work 100% of the time. And they're
perplexed when it misses something they think is obvious.
But let's not worry about a URL slurper getting into the core
SpamBayes code. It probably shouldn't. But certain individuals might
want to experiment with the notion, and that's the kind of real-world
testing that can only improve an already extraordinarily intelligent
mail filter. Which is a Good Thing, I reckon... :-)