[Spambayes] Numeric python store, hammiefilter extension and mutt
Tim Stone - Four Stones Expressions
Fri Nov 22 00:42:57 2002
Sounds really good, Adam. Neale Pickett and I have been working on this kind
of stuff in a branch named hammie-playground. There have been some
substantial changes made there that will be merged into the main thread soon.
You might want to check there and see how your changes would fit in... I
really like your results. Size and speed have been consistent challenges for
11/21/2002 6:27:13 PM, Adam Hupp <firstname.lastname@example.org> wrote:
>I've been working on a store for spambayes that uses the Numeric
>python extension. It's substantially faster than PersistentBayes and
>the database is about half the size. A comparison, training on 992 messages:
>score 1 msg: .45s
>score 6156 msgs: 58s
>score 1 msg: .59s
>score 6156 msgs: 49s
>There are no modifications to classifier.Bayes, it just uses a new
>WordInfo class with properties.
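
For anyone following along, here is a rough sketch of what "a WordInfo class with properties" over an array-backed store could look like. This is not Adam's code: the stdlib array module stands in for Numeric, and ArrayStore, row_for, and the spamcount/hamcount field names are assumptions made for illustration only.

```python
from array import array


class ArrayStore:
    """Hypothetical store: per-word counts live in flat typed arrays."""

    def __init__(self):
        self.index = {}          # word -> row number
        self.spam = array("l")   # spam count for each row
        self.ham = array("l")    # ham count for each row

    def row_for(self, word):
        # Allocate a zeroed row the first time a word is seen.
        if word not in self.index:
            self.index[word] = len(self.spam)
            self.spam.append(0)
            self.ham.append(0)
        return self.index[word]

    def wordinfo(self, word):
        return WordInfo(self, self.row_for(word))


class WordInfo:
    """Behaves like a plain record, but every read and write hits the arrays."""

    __slots__ = ("_store", "_row")

    def __init__(self, store, row):
        self._store = store
        self._row = row

    @property
    def spamcount(self):
        return self._store.spam[self._row]

    @spamcount.setter
    def spamcount(self, value):
        self._store.spam[self._row] = value

    @property
    def hamcount(self):
        return self._store.ham[self._row]

    @hamcount.setter
    def hamcount(self, value):
        self._store.ham[self._row] = value


store = ArrayStore()
info = store.wordinfo("viagra")
info.spamcount += 3   # the caller sees an ordinary attribute
```

Because the caller only ever touches attributes, code written against the record interface would not need to change, which fits the remark that classifier.Bayes is unmodified.
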
>I also modified hammiefilter to do untraining, retraining, and
>training on filter results. For example:
>hammiefilter.py --filter --train
>The incoming message is scored and filtered. If the result is not
>"Unsure" the classifier will be trained on it.
>hammiefilter.py --reverse --good --train
>The incoming message has previously been incorrectly marked as ham.
>--reverse will untrain the classifier and --train will retrain it on
>the message as spam.
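
To make sure I read the two modes correctly, here is a minimal sketch of the logic as described. It is not the real hammiefilter code: ToyClassifier, label, filter_and_train, reverse_and_retrain, and the two cutoff values are all made up for illustration, and the toy classifier only counts messages instead of tokenizing them.

```python
class ToyClassifier:
    """Hypothetical stand-in that only counts trained messages."""

    def __init__(self):
        self.nspam = 0
        self.nham = 0

    def learn(self, msg, is_spam):
        if is_spam:
            self.nspam += 1
        else:
            self.nham += 1

    def unlearn(self, msg, is_spam):
        if is_spam:
            self.nspam -= 1
        else:
            self.nham -= 1


def label(score, ham_cutoff=0.2, spam_cutoff=0.9):
    # Cutoffs are illustrative assumptions, not values from the post.
    if score >= spam_cutoff:
        return "Spam"
    if score <= ham_cutoff:
        return "Ham"
    return "Unsure"


def filter_and_train(clf, msg, score):
    """Like --filter --train: train only when the verdict is not Unsure."""
    verdict = label(score)
    if verdict != "Unsure":
        clf.learn(msg, is_spam=(verdict == "Spam"))
    return verdict


def reverse_and_retrain(clf, msg, was_marked_spam):
    """Like --reverse ... --train: undo the wrong label, then learn the right one."""
    clf.unlearn(msg, is_spam=was_marked_spam)
    clf.learn(msg, is_spam=not was_marked_spam)


clf = ToyClassifier()
filter_and_train(clf, "message one", 0.95)    # confident spam: trained
filter_and_train(clf, "message two", 0.50)    # unsure: left alone
filter_and_train(clf, "message three", 0.05)  # confident ham: trained
reverse_and_retrain(clf, "message three", was_marked_spam=False)
```

The point of untraining before retraining is that the wrong label's contribution is removed first, so the message does not end up counted on both sides.
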
>With these tools it's straightforward to set up macros in mutt to
>manage false negatives/positives and classify "Unsure" messages.
>The modified files can be found at:
>hammiefilter requires Optik and the NumericBayes store requires
>Numeric and MaskedArray (an optional part of Numeric).