[Spambayes] CRM114 in November breaks 99.9%. :-)
Matt Sergeant
msergeant@startechgroup.co.uk
Mon Dec 2 15:22:23 2002
Bill Yerazunis said the following on 02/12/02 14:44:
> Final test statistics for CRM114 for November are in:
>
> Standard rules apply (no whitelists, no blacklists, realtime email stream
> only (no "canned spam"), train only on errors, polynomial length 5)
>
> For All of November (starting 9 AM Nov 1, ending 9 AM Dec 1)
>
> Spams  Nonspams  False    False    Total   N+1 Accuracy  NHC's
>                  Accepts  Rejects  Emails
> 1993   3914      4        0        5911    99.915        2
>
> Spam features in hash tables: 398K
> Nonspam features in hash tables: 299K
CRM114's learn and classify stuff looks really interesting, but its
syntax is really freaky to someone who is used to regular procedural or OO
languages like Perl, Python, C, etc. Is there *any* chance the learning
and classifying library in CRM114 can be extracted into a plain
.so? That would be tremendous, and I'd be willing to build a Perl XS
library for it in a heartbeat.
If not, we'll just have to try and copy the sparse binary polynomial
hash idea ;-)
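
For anyone curious what copying the idea might involve, here is a rough
Python sketch of sparse binary polynomial hash feature generation. It is
an illustration under assumptions, not CRM114's actual code. It assumes
that "polynomial length 5" means a sliding window of 5 tokens, and that
each window yields every sub-phrase that keeps the newest token and
either keeps or skips each older one. The names sbph_features and
hash_feature, the "<skip>" placeholder, the CRC32 hash and the bucket
count are all made up for this example.

```python
from itertools import product
import zlib

WINDOW = 5  # assumed meaning of "polynomial length 5" in the quoted rules


def sbph_features(tokens, window=WINDOW, skip="<skip>"):
    """Return sparse sub-phrases for every sliding-window position.

    For each token, look back at up to window - 1 older tokens and emit
    one feature per keep/skip combination of those older tokens. The
    newest token is always kept, so a full window gives 2 ** (window - 1)
    features.
    """
    features = []
    for end in range(len(tokens)):
        start = max(0, end - window + 1)
        older = tokens[start:end]
        newest = tokens[end]
        for mask in product((False, True), repeat=len(older)):
            phrase = [tok if keep else skip for tok, keep in zip(older, mask)]
            phrase.append(newest)
            features.append(" ".join(phrase))
    return features


def hash_feature(feature, buckets=1 << 20):
    """Map a feature string to a hash-table slot (hypothetical scheme)."""
    return zlib.crc32(feature.encode("utf-8")) % buckets


if __name__ == "__main__":
    feats = sbph_features("click here for free money".split())
    print(len(feats))  # 1 + 2 + 4 + 8 + 16 = 31
    print(feats[-1])   # the sub-phrase that keeps all five tokens
```

A classifier built on this would count how often each hashed slot shows
up in spam versus nonspam and combine those counts when scoring a new
message; that part is left out here.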