[scikit-learn] help-Renaming features in scikit-learn's CountVectorizer()
Ranjana Girish
ranjanagirish30 at gmail.com
Mon Mar 5 09:18:26 EST 2018
Hi all,
I have a very large pandas dataframe. Below is a sample:
Id  description
1 switvch for air conditioner transformer..............
2 control tfrmr...........
3 coling pad.................
4 DRLG machine
5 hair smothing kit...............
For further processing, I will construct a document-term matrix from the
above data using scikit-learn's CountVectorizer:
from sklearn.feature_extraction.text import CountVectorizer
countvec = CountVectorizer()
documenttermmatrix = countvec.fit_transform(dataset['description'])
I have to correct the misspelled words in the descriptions. Replacing each
wrongly spelled word with its correct spelling across such a large dataset
is taking too much time.
So I thought of correcting the features instead, using the feature list
that the count vectorizer returns:
features_names = countvec.get_feature_names()
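To illustrate the idea, here is a rough sketch of building a correction table from the feature list alone, so that each distinct token is checked once instead of once per row. It is only an assumption of how this could look: `feature_names` is a hand-written stand-in for the list returned by the vectorizer, `known_words` is a hypothetical list of correctly spelled terms, and fuzzy matching with the standard library's `difflib` is my own choice, not something scikit-learn provides.

```python
import difflib

# Stand-in for the feature list returned by the fitted vectorizer.
feature_names = ["air", "coling", "conditioner", "cooling", "switch", "switvch"]

# Hypothetical reference list of correctly spelled domain terms.
known_words = ["air", "conditioner", "cooling", "switch"]

# Map each unknown feature to its closest known word, if one is close enough.
corrections = {}
for name in feature_names:
    if name in known_words:
        continue  # already spelled correctly
    match = difflib.get_close_matches(name, known_words, n=1, cutoff=0.8)
    if match:
        corrections[name] = match[0]
```

Abbreviations such as tfrmr or DRLG are unlikely to be caught by a similarity cutoff like this, so entries for them would probably have to be added to the table by hand.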
*Is it possible to rename features using the above list and then use the
result for the classification process?*
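To make the question concrete, below is a minimal sketch of what I mean by renaming. It assumes a hypothetical `corrections` dictionary (misspelled feature to corrected feature) and a few toy documents; I do not know of a built-in rename method on CountVectorizer, so the sketch merges columns of the document-term matrix by multiplying it with a sparse mapping matrix. The names `M`, `new_index` and `corrected_dtm` are mine.

```python
import numpy as np
from scipy import sparse
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "switvch for air conditioner transformer",
    "control tfrmr",
    "coling pad",
    "switch for cooling pad",
]

# Hypothetical correction table: misspelled feature -> corrected feature.
corrections = {"switvch": "switch", "coling": "cooling", "tfrmr": "transformer"}

countvec = CountVectorizer()
dtm = countvec.fit_transform(docs)

# Old feature names in column order, taken from the fitted vocabulary.
old_names = sorted(countvec.vocabulary_, key=countvec.vocabulary_.get)

# New feature names after applying the corrections (duplicates collapse).
new_names = sorted({corrections.get(name, name) for name in old_names})
new_index = {name: j for j, name in enumerate(new_names)}

# Mapping matrix M: M[i, j] = 1 if old column i is renamed to new column j.
rows = np.arange(len(old_names))
cols = np.array([new_index[corrections.get(name, name)] for name in old_names])
M = sparse.csr_matrix(
    (np.ones(len(old_names), dtype=dtm.dtype), (rows, cols)),
    shape=(len(old_names), len(new_names)),
)

# Columns that map to the same corrected name have their counts summed.
corrected_dtm = dtm @ M
```

If something like this is valid, I assume new documents at prediction time would need `countvec.transform(...)` followed by the same multiplication with `M` before being passed to the classifier.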
Thanks
Ranjana