[Spambayes] Introducing myself
Matt Sergeant
msergeant@startechgroup.co.uk
Mon Nov 11 09:49:38 2002
Robert Woodhead said the following on 10/11/02 00:32:
> * My personal bias (as I think Guido mentioned) is for a multifaceted
> approach, using Bayesian, rules-based (attacking things that Bayesian
> isn't good at, like looking for obfuscated URL structures), DNSBL,
> and whitelisting heuristics to generate an overall ranking. So a
> hammy mail from a guy in your address book would bubble up to highest
> priority, whereas something spammy from him would stay neutral.
> There's lots of room for cooperation between the various approaches
> and multiple agents mean it's less likely that a spam will get by.
> In particular, whitelisting heuristics can almost eliminate false
> positives.
That's the approach SpamAssassin now takes, fwiw (including the Bayesian
stuff). All done in 2.50 CVS.
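
To make the combined-ranking idea above concrete, here is a minimal
sketch in Python. It is not SpamAssassin's or Spambayes' actual scoring
code: the Evidence fields, the weights (0.1 per rule hit, 0.3 for a DNSBL
listing) and the priority scale are all hypothetical, chosen only to show
how whitelisting can lift hammy mail from a known sender while leaving
spammy mail from the same sender neutral.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Hypothetical per-message inputs from the separate checks."""
    bayes_spam_prob: float    # 0.0 (hammy) .. 1.0 (spammy), from a Bayesian classifier
    rule_hits: int            # number of rules-based tests that fired
    on_dnsbl: bool            # sending host is listed on a DNS blocklist
    sender_whitelisted: bool  # sender is in the recipient's address book


def spamminess(ev: Evidence) -> float:
    """Combine the spam indicators into one value clamped to [0, 1].

    The weights are illustrative assumptions, not tuned values.
    """
    score = ev.bayes_spam_prob + 0.1 * ev.rule_hits
    if ev.on_dnsbl:
        score += 0.3
    return min(max(score, 0.0), 1.0)


def priority(ev: Evidence) -> float:
    """Return a ranking: higher means show the message sooner.

    Unknown senders range from +1.0 (clearly ham) to -1.0 (clearly spam).
    A whitelisted sender's hammy mail gets the top value 2.0, while
    spammy-looking mail from that sender is held at neutral 0.0 rather
    than being ranked as spam.
    """
    base = 1.0 - 2.0 * spamminess(ev)
    if ev.sender_whitelisted:
        return 2.0 if base > 0 else 0.0
    return base


if __name__ == "__main__":
    known_ham = Evidence(0.05, 0, False, True)
    known_spammy = Evidence(0.95, 2, True, True)
    unknown_spam = Evidence(0.95, 2, True, False)
    for ev in (known_ham, known_spammy, unknown_spam):
        print(ev, priority(ev))
```

Because a whitelisted sender can never score below neutral in this
sketch, a false positive on mail from the address book is ruled out by
construction, at the cost of trusting forged sender addresses.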
> * Finally, if anyone needs more spam, I get over 300 a day (I've been
> around a while!) and have a cleaned corpus of over 130MB of spam and
> foreign email. Also, given all the legit web-marketing email I get
> because of the URL registration work I've done, I've got tons of the
> spammiest ham you could imagine.
I'm always looking for more corpora. Stick the data on an FTP/HTTP
server somewhere (password-protect it if you need to). Or contact me
privately if that's not possible.
Matt.