[Spambayes] proposed changes to hammie & co.
Thu Nov 21 06:08:00 2002
So then, "T. Alexander Popiel" <firstname.lastname@example.org> is all like:
> In message: <email@example.com>
> Neale Pickett <firstname.lastname@example.org> writes:
> >I'm currently entwined with mucking the heck out of WordInfo. I've got
> >a neato scheme based on Alex's patch and comments where the WordInfo
> >classes still compute their own probabilities, but also keep a revision
> >number which is compared against a MetaInfo class.
> Eww, do we gotta? I thought I was trying to make the DB smaller. ;-)
Ah, but the only thing *stored* is (spamcount, hamcount). The
probability is calculated the first time you ask for it. If you don't
update nspam or nham, the next time you ask for it, it gives the cached
value. So the database is small, but you still get the in-memory
probability caching if you're using a pickle or ZODB.
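
A minimal sketch of that scheme, for illustration only and not the actual patch. The method names, the revision-bumping rule, and the pickling hooks are all assumptions, and the probability formula is a plain ratio standing in for whatever the classifier really uses:

```python
class MetaInfo:
    """Corpus-wide totals plus a revision counter (hypothetical layout)."""

    def __init__(self, nspam=0, nham=0):
        self.nspam = nspam
        self.nham = nham
        self.revision = 0

    def update(self, nspam, nham):
        # Any change to the totals invalidates every cached probability.
        self.nspam = nspam
        self.nham = nham
        self.revision += 1


class WordInfo:
    """Per-word counts; the probability is derived lazily and cached."""

    def __init__(self, spamcount=0, hamcount=0):
        self.spamcount = spamcount
        self.hamcount = hamcount
        self._prob = None       # cached value, never stored
        self._revision = None   # MetaInfo revision the cache was built for

    def __getstate__(self):
        # Only the two counts are persisted (assumed storage format).
        return (self.spamcount, self.hamcount)

    def __setstate__(self, state):
        self.spamcount, self.hamcount = state
        self._prob = None
        self._revision = None

    def probability(self, meta):
        # Recompute on first use, or when the totals have changed since.
        if self._prob is None or self._revision != meta.revision:
            self._prob = self._compute(meta)
            self._revision = meta.revision
        return self._prob

    def _compute(self, meta):
        # Stand-in formula: relative spam frequency, 0.5 when unknown.
        spamratio = self.spamcount / meta.nspam if meta.nspam else 0.0
        hamratio = self.hamcount / meta.nham if meta.nham else 0.0
        total = spamratio + hamratio
        return spamratio / total if total else 0.5
```

A real version would also need to drop the cached value whenever a word's own spamcount or hamcount changes; this sketch leaves that out.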
But now that words are computing their own probabilities, the Bayes
class no longer does anything Bayesian. I guess it's time to rename
that class to Classifier.
> >The neato thing here, at least from the perspective of DBDict, is
> >that all the meta information is now bundled up in a handy object.
> This is unalloyed good.
"Unalloyed" is a superb word, Alex. It reminds me that I should be
studying for the GRE instead of hacking spam classifier code :)