[spambayes-dev] An unrelated idea: categorization / cluster
analysis of text files for FAQ generating
Anssi Porttikivi
anssi.porttikivi at teleware.fi
Thu Sep 25 09:05:30 EDT 2003
Sorry to bother you, but I would like to know whether anyone here has
any knowledge of technologies along the lines of the following idea:
Could you automatically categorize a set of messages into an optimum
number of clusters, where the messages inside each cluster would be
similar to each other in Bayesian filtering terms? If this could be
done without manually selecting the categories a priori, it could be
used for automated maintenance of a "frequently asked questions" list.
Automatic categorization of incoming mail without manually choosing
any criteria beforehand would also be interesting.
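
To make the idea concrete, here is a rough sketch of one way it might be
approached. It is not Bayesian and it does not use any spambayes code: it
swaps in plain word-count vectors, cosine similarity, and a greedy
single-pass ("leader") clustering, so that the number of clusters falls out
of a similarity threshold instead of being chosen beforehand. The names
tokenize, cosine, cluster_messages and the threshold value 0.3 are all
made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_messages(messages, threshold=0.3):
    """Greedy one-pass clustering; returns lists of message indices.

    Each message joins the most similar existing cluster if the
    similarity reaches `threshold` (an assumed, hand-picked value),
    otherwise it starts a new cluster.
    """
    clusters = []  # each entry: {"centroid": Counter, "members": [int]}
    for index, text in enumerate(messages):
        vector = Counter(tokenize(text))
        best, best_score = None, threshold
        for cluster in clusters:
            score = cosine(vector, cluster["centroid"])
            if score >= best_score:
                best, best_score = cluster, score
        if best is None:
            clusters.append({"centroid": Counter(vector), "members": [index]})
        else:
            # The centroid is the summed word counts of the members;
            # cosine similarity ignores the overall scale.
            best["centroid"].update(vector)
            best["members"].append(index)
    return [cluster["members"] for cluster in clusters]
```

Calling cluster_messages() on a list of message bodies returns groups of
indices into that list; each group could then be reviewed as a candidate
FAQ entry. The result depends on the order of the messages and on the
threshold, so this is only a starting point for experiments, not an
"optimum" clustering.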