<div dir="ltr"><div dir="ltr"><div>Hi Peng,</div><div><br></div><div>I believe the set of English stop words used across all token vectorizers can be found in <a href="https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/feature_extraction/_stop_words.py">https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/feature_extraction/_stop_words.py</a>.<br></div><div><br></div><div>Cheers,<br></div><div>Jon<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Jan 27, 2020 at 3:33 PM Peng Yu &lt;<a href="mailto:pengyu.ut@gmail.com">pengyu.ut@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<br>
<br>
I don't see which stop words CountVectorizer uses when<br>
stop_words='english'.<br>
<br>
<a href="https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html" rel="noreferrer" target="_blank">https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html</a><br>
<br>
Is there a way to figure it out? Thanks.<br>
<br>
-- <br>
Regards,<br>
Peng<br>
</blockquote></div></div>
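<div dir="ltr">As a follow-up, the list can also be inspected from Python without opening the source file. The snippet below is a minimal sketch, assuming a reasonably recent scikit-learn in which <code>ENGLISH_STOP_WORDS</code> is importable from <code>sklearn.feature_extraction.text</code> and <code>CountVectorizer.get_stop_words()</code> is available; the variable names are only illustrative.</div>

```python
from sklearn.feature_extraction.text import CountVectorizer, ENGLISH_STOP_WORDS

# ENGLISH_STOP_WORDS is the frozenset that backs stop_words='english'.
print(len(ENGLISH_STOP_WORDS))
print(sorted(ENGLISH_STOP_WORDS)[:10])

# The same set can be obtained from a configured vectorizer;
# no fitting is needed to read the stop word list.
vectorizer = CountVectorizer(stop_words="english")
stop_words = vectorizer.get_stop_words()
print(stop_words == ENGLISH_STOP_WORDS)
```

<div dir="ltr">Going through <code>get_stop_words()</code> has the advantage that it reports whatever list the vectorizer is actually configured with, including a custom list passed via <code>stop_words</code>.</div>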