[scikit-learn] Naive Bayes - Multinomial Naive Bayes tf-idf
Andy
t3kcit at gmail.com
Fri Nov 4 10:43:36 EDT 2016
On 11/04/2016 05:45 AM, Marcin Mirończuk wrote:
> Hi,
> In our experiments, we use Multinomial Naive Bayes (MNB). The
> traditional MNB assumes TF weighting of the words. We read in the
> documentation, http://scikit-learn.org/stable/modules/naive_bayes.html,
> which describes Multinomial Naive Bayes, that "... where the data are
> typically represented as word vector counts, although tf-idf vectors
> are also known to work well in practice". The "word vector counts"
> are TF, which is well known. Our question concerns the "tf-idf
> vectors". In the tf-idf case, was the approach of J. D. M. Rennie et
> al., "Tackling the Poor Assumptions of Naive Bayes Text Classifiers",
> implemented? The documentation does not cite this work.
No, I think that paper implements something slightly different. The
documentation says that you can apply the TfidfVectorizer instead of
CountVectorizer and it can still work.
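A minimal sketch of that swap, in case it helps. The toy documents,
labels and variable names below are made up for illustration, and this
only shows tf-idf features fed into the standard MultinomialNB; it is
not the method from the Rennie et al. paper.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy corpus and labels, invented purely for illustration.
docs = [
    "cheap pills buy now",
    "buy cheap watches now",
    "meeting agenda for monday",
    "monday project meeting notes",
]
labels = ["spam", "spam", "ham", "ham"]

# Same classifier, two different feature extractors:
# raw term counts (TF) versus tf-idf weighted vectors.
count_clf = make_pipeline(CountVectorizer(), MultinomialNB())
tfidf_clf = make_pipeline(TfidfVectorizer(), MultinomialNB())

count_clf.fit(docs, labels)
tfidf_clf.fit(docs, labels)

# MultinomialNB accepts the fractional tf-idf values as if they were counts.
new_docs = ["buy cheap pills", "project meeting on monday"]
count_pred = count_clf.predict(new_docs)
tfidf_pred = tfidf_clf.predict(new_docs)
print(count_pred, tfidf_pred)
```

Whether tf-idf actually helps over plain counts depends on the data, so
it is worth comparing both pipelines with cross-validation.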