[Spambayes] bayesian research
Anthony Baxter
anthony at interlink.com.au
Mon Dec 30 18:28:26 EST 2002
>>> "alan" wrote
> Hi,
> I have been given the task of researching Bayesian mail filters for my
> final-year university dissertation.
> I have been hitting brick walls at every turn.
> I know Paul Graham is great, and just about everyone talks about his plan for
> spam, but I need a starting point.
> I have set my system to allow relays and have thousands of spam examples.
> Any ideas where I should start?
Look at the 'background' page on our website, for starters. Note that
you don't just need a collection of spam - you also want some of the
"real email" (we call it 'ham') that went with the spam. You can start
with ham and spam from different sources, but you then have a problem
with false clues (e.g. different 'Received' header lines from the
different mail systems).
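
To make the false-clue problem concrete, here is a minimal Python sketch. It
is not taken from the Spambayes code; the function names, the whitespace
tokenizing and the example.* host names are all made up for illustration. It
lists the 'Received' header tokens that show up in only one of the two
corpora. When ham and spam come from different mail systems, those one-sided
tokens identify the source rather than the spamminess, and a classifier will
happily learn them.

```python
from collections import Counter
from email import message_from_string


def header_tokens(raw_message, header="Received"):
    """Return the set of lowercased, whitespace-separated tokens
    found in every occurrence of `header` in one raw message."""
    msg = message_from_string(raw_message)
    tokens = set()
    for value in msg.get_all(header, []):
        tokens.update(value.lower().split())
    return tokens


def one_sided_tokens(ham, spam, header="Received"):
    """Return (only_ham, only_spam): header tokens seen in one corpus
    but never in the other. These are candidate false clues."""
    ham_counts = Counter()
    spam_counts = Counter()
    for raw in ham:
        ham_counts.update(header_tokens(raw, header))
    for raw in spam:
        spam_counts.update(header_tokens(raw, header))
    only_ham = set(ham_counts) - set(spam_counts)
    only_spam = set(spam_counts) - set(ham_counts)
    return only_ham, only_spam


# Toy corpora (hypothetical hosts): ham from one mail system, spam from another.
ham = [
    "Received: from mail.work.example by inbox.work.example\n"
    "Subject: lunch\n\nSee you at noon.\n"
]
spam = [
    "Received: from relay.home.example by trap.home.example\n"
    "Subject: offer\n\nBuy now.\n"
]

only_ham, only_spam = one_sided_tokens(ham, spam)
print(sorted(only_ham))
print(sorted(only_spam))
```

If the host names of your own mail systems dominate both lists, the two
corpora are separable on their headers alone, which is a sign that the test
results will look better than the filter really is.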
For further info, download the code and read the source - it's heavily
commented, and there's a whooole pile of nice information in there.
It's probably also worth noting that this project has pretty much
tossed out the Graham algorithm.
Anthony
--
Anthony Baxter <anthony at interlink.com.au>
It's never too late to have a happy childhood.