
Aug. 21, 2002
8:23 a.m.
Paul Prescod <paul@prescod.net>:
Some perhaps relevant links (with no off-topic discussion):
I'm in the process of speed-tuning this now. I intend for it to be blazingly fast, usable for sites that process 100K mails a day, and I think I know how to do that. This is not a natural application for Python :-).
"""My finding is that it is _nowhere_ near sufficient to have two populations, "spam" versus "not spam." """
Well, except it seems to work quite well. The Nigerian trigger-word population is distinct from the penis-enlargement population, but they both show up under Bayesian analysis. -- <a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>
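
To illustrate the point about two populations, here is a minimal sketch of a two-class, naive-Bayes-style scorer. Everything in it is an assumption made for illustration: the toy corpora, the whitespace tokenizer, the add-one smoothing, and the names `train`, `spam_log_odds`, and `score` are hypothetical and do not come from any real filter. Both kinds of spam sit in a single "spam" corpus, and each still gets a positive score because its own trigger words are rare in the legitimate mail.

```python
import math
from collections import Counter

def train(messages):
    """Count how many messages in a corpus contain each token."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return counts

def spam_log_odds(text, spam_counts, ham_counts, n_spam, n_ham):
    """Sum per-token log-likelihood ratios, using add-one smoothing."""
    total = 0.0
    for token in set(text.lower().split()):
        p_spam = (spam_counts[token] + 1) / (n_spam + 2)
        p_ham = (ham_counts[token] + 1) / (n_ham + 2)
        total += math.log(p_spam / p_ham)
    return total

# Toy corpora (made up): one "spam" population holding two distinct kinds.
spam = [
    "urgent transfer of funds from nigeria bank",
    "nigeria minister needs urgent bank transfer",
    "enlargement pills guaranteed results",
    "guaranteed enlargement order pills now",
]
ham = [
    "patch for the python tokenizer attached",
    "meeting notes from the python list",
    "please review the attached patch",
    "lunch on friday with the list",
]
spam_counts, ham_counts = train(spam), train(ham)

def score(text):
    """Positive means the text looks more like the spam corpus."""
    return spam_log_odds(text, spam_counts, ham_counts, len(spam), len(ham))
```

With these corpora, `score("urgent bank transfer nigeria")` and `score("enlargement pills guaranteed")` are both positive, while `score("python patch attached")` is negative, even though the two spam messages share no tokens with each other.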