[scikit-learn] why the modification in the tf-idf formula?

Sole Galli solegalli at protonmail.com
Tue May 28 10:40:58 EDT 2024


Hi guys,

I'd like to understand why sklearn's implementation of tf-idf differs from the standard textbook notation, as described in the docs: https://scikit-learn.org/stable/modules/feature_extraction.html#tfidf-term-weighting

Do you have any references that I could take a look at? I couldn't find any in the docs, but maybe I missed something.
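
To make the question concrete, here is a minimal sketch of the difference as I read it from the docs page linked above. The function names are my own, made up for illustration, and the formulas are my reading of the docs (natural log, n = number of documents, df = number of documents containing the term), not code taken from sklearn itself.

```python
import math

# Hypothetical helper names; formulas as I understand them from the docs page.

def idf_textbook(n, df):
    # "Standard textbook notation" according to the docs: log(n / (df + 1))
    return math.log(n / (df + 1))

def idf_sklearn_smooth(n, df):
    # Docs formula for smooth_idf=True (the default): log((1 + n) / (1 + df)) + 1
    return math.log((1 + n) / (1 + df)) + 1

def idf_sklearn_unsmoothed(n, df):
    # Docs formula for smooth_idf=False: log(n / df) + 1
    return math.log(n / df) + 1

# A term that occurs in all 3 of 3 documents
n, df = 3, 3
print(idf_textbook(n, df))            # negative under the textbook form
print(idf_sklearn_smooth(n, df))      # exactly 1 under the smoothed sklearn form
print(idf_sklearn_unsmoothed(n, df))  # also 1 under the unsmoothed sklearn form
```

So a term present in every document is not zeroed out (or pushed negative) by the sklearn variants, which is the modification I'd like to find a reference for.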

Thank you!

Best wishes
Sole