[spambayes-dev] spammy subject lines

Tony Meyer tameyer at ihug.co.nz
Tue Oct 14 18:30:57 EDT 2003


> Some of the original shootout tests were done with a minimum 
> of 2000 each ham/spam messages divided into 10 buckets (for 
> 200 per bucket). Of course, more is better, but since Tim 
> said 200 for a useful lower bound back then, I'll trust him. :-)

:)  I've updated the instructions to say 'about 200-500 messages'.
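
For anyone who wants to see the arithmetic quoted above spelled out, here is a
rough sketch of dividing 2000 ham and 2000 spam into 10 buckets of 200 each.
It is an illustration only, not the spambayes testing scripts: the function
name split_into_buckets and the placeholder message names are made up for the
example, and real messages would of course come from your own corpus.

```python
import random


def split_into_buckets(messages, n_buckets=10, seed=0):
    """Shuffle messages, then deal them round-robin into n_buckets lists.

    Hypothetical helper for illustration; not part of spambayes.
    """
    shuffled = list(messages)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_buckets] for i in range(n_buckets)]


# Placeholder names standing in for real messages.
ham = [f"ham-{i}" for i in range(2000)]
spam = [f"spam-{i}" for i in range(2000)]

ham_buckets = split_into_buckets(ham)
spam_buckets = split_into_buckets(spam)

# Number of messages that ended up in each ham bucket.
print([len(bucket) for bucket in ham_buckets])
```

With fewer messages the same split just gives smaller buckets, which is why
the 200-per-bucket lower bound matters.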

[Tony Meyer]
>Would a better move be to update cmp.py so that it does know about 
>unsures?

[Alex]
> +1

Apparently (it was news to me ;) I asked this very same question a few
months back.  Tim said that it would be quite a substantial amount of work,
since cmp.py wasn't designed to do that.  I think it's pretty obvious ;) that
I'm not the right person to be mucking about updating the testing scripts, so
unless someone else wants to take up the challenge, we'll have to leave it, I
guess.

> Eh, am I the only one around here who never throws away mail? 
> I've got 75K+ messages, over 2/3 of which is spam, collected 
> over the last year...

I don't throw away any mail, unless it's available in an archive somewhere.
I used to keep it all, but then I went through a period of having not a lot
of space available on the system I read mail with, so had to give it up :)

=Tony Meyer



