[Spambayes] How do you classify text?
tim.one at comcast.net
Wed Apr 23 12:19:12 EDT 2003
> I'm working on a project that must classify a paragraph as one of
> N subjects. I would like to know exactly how you take a paragraph and
> classify it, and how you train the filter.
> I would like to apply Bayesian rules to distinguish among the N
> different subjects a paragraph might be about.
> I hope you can help, because it's an important project for me.
The spambayes project doesn't (despite its name <wink>) do Bayesian
classification, or N-way classification. A good paper on a good system that
does both is Jason Rennie's
"ifile: An Application of Machine Learning to E-Mail Filtering"
The paper summarizes the classic Bayesian classification approach. Do learn
how to use citeseer: it's a great way to find papers on tech subjects! The
citeseer record for the paper above is:
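
For a rough idea of what the classic approach looks like in code, here is a
minimal sketch of a multinomial naive Bayes classifier with add-one (Laplace)
smoothing. It is not ifile's code and has nothing to do with what spambayes
does; the NaiveBayes class, the tokenize() helper and the training sentences
are all made up for illustration, and a real system would want a better
tokenizer and far more training data.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical tokenizer: lowercase runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes over N subjects, with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # subject -> word -> count
        self.doc_counts = Counter()              # subject -> paragraphs seen
        self.vocab = set()                       # every word seen in training

    def train(self, subject, text):
        words = tokenize(text)
        self.word_counts[subject].update(words)
        self.doc_counts[subject] += 1
        self.vocab.update(words)

    def scores(self, text):
        """Return log P(subject) + sum of log P(word | subject) per subject."""
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        result = {}
        for subject, n_docs in self.doc_counts.items():
            counts = self.word_counts[subject]
            # Add-one smoothing: every vocabulary word gets one phantom count.
            denom = sum(counts.values()) + len(self.vocab)
            logp = math.log(n_docs / total_docs)
            for w in words:
                logp += math.log((counts[w] + 1) / denom)
            result[subject] = logp
        return result

    def classify(self, text):
        # Pick the subject with the highest log score.
        scores = self.scores(text)
        return max(scores, key=scores.get)


nb = NaiveBayes()
nb.train("cooking", "simmer the onions and garlic in olive oil")
nb.train("cooking", "bake the bread until the crust is golden")
nb.train("sailing", "trim the sail and steer into the wind")
nb.train("sailing", "the boat heeled as the wind rose")
nb.train("chess", "the knight forks the king and the rook")
nb.train("chess", "castle early and develop the bishop")
print(nb.classify("add garlic to the oil"))  # prints "cooking"
```

Training just counts words per subject; classifying adds up log probabilities
and picks the biggest. Working in logs avoids underflow when a paragraph has
many words.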