[scikit-learn] How to make sure stop words are matched when lowercase=False?

Peng Yu pengyu.ut at gmail.com
Tue Jan 28 09:47:13 EST 2020


Hi,

https://github.com/scikit-learn/scikit-learn/blob/002f891a33b612be389d9c488699db5689753ef4/sklearn/feature_extraction/text.py#L587

The default of lowercase is True, and the built-in stop words are all
lower case. When lowercase=False, where is the code that makes sure
stop words are still removed from tokens that are not in lower case?

https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/feature_extraction/_stop_words.py
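
To make the question concrete, here is a minimal pure-Python sketch of the
order of operations as I understand it: optional lowercasing, then
tokenization, then an exact-match stop word filter. It is not the
scikit-learn code itself. The names analyze and STOP_WORDS are hypothetical,
and the three-word stop list is a stand-in for the real one.

```python
import re

# Hypothetical, tiny stand-in for an all-lowercase stop word list.
STOP_WORDS = frozenset({"the", "is", "on"})

# Same shape as the default token_pattern documented for CountVectorizer:
# tokens of two or more word characters.
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")


def analyze(doc, lowercase=True, stop_words=STOP_WORDS):
    """Optionally lowercase, tokenize, then drop exact stop word matches."""
    if lowercase:
        doc = doc.lower()
    tokens = TOKEN_PATTERN.findall(doc)
    # The comparison is case-sensitive, so "The" does not match "the".
    return [tok for tok in tokens if tok not in stop_words]


doc = "The cat is on THE mat"
print(analyze(doc, lowercase=True))   # ['cat', 'mat']
print(analyze(doc, lowercase=False))  # ['The', 'cat', 'THE', 'mat']
```

If this sketch matches the real behaviour, then with lowercase=False the
capitalized stop words survive unless the stop_words argument is given a
list that also contains the cased variants.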

-- 
Regards,
Peng

