[scikit-learn] What are the stopwords used by CountVectorizer?

Jonathan Cusick jonathan.cusick09 at gmail.com
Mon Jan 27 15:53:08 EST 2020


Hi Peng,

I believe the set of English stop words used across all token vectorizers
can be found in
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/feature_extraction/_stop_words.py.
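
If you would rather check from Python than read the source, here is a minimal sketch. It assumes a scikit-learn install that exposes ENGLISH_STOP_WORDS from sklearn.feature_extraction.text and the get_stop_words() method on the vectorizer; the variable names are only illustrative.

```python
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS, CountVectorizer

# Build a vectorizer that uses the built-in English list.
vectorizer = CountVectorizer(stop_words="english")

# get_stop_words() returns the effective stop word collection
# (it would return None if stop_words had been left as None).
stop_words = vectorizer.get_stop_words()

# Inspect the size and a small alphabetical sample of the list.
print(len(stop_words))
print(sorted(stop_words)[:10])

# Assumption being checked: the 'english' option resolves to the
# ENGLISH_STOP_WORDS set that is importable from the same module.
same_list = stop_words == ENGLISH_STOP_WORDS
print(same_list)
```

Calling get_stop_words() on the vectorizer you actually use has the advantage of reporting whatever list your installed version applies, which may differ from the file on the master branch.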


Cheers,
Jon

On Mon, Jan 27, 2020 at 3:33 PM Peng Yu <pengyu.ut at gmail.com> wrote:

> Hi,
>
> I don't see what stopwords are used by CountVectorizer with
> stop_words='english'.
>
>
> https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html
>
> Is there a way to figure it out? Thanks.
>
> --
> Regards,
> Peng