[scikit-learn] How to get the most important features from a RF efficiently
drraph at gmail.com
Thu Jul 21 11:22:09 EDT 2016
I have a set of feature vectors associated with binary class labels,
each of which has about 40,000 features. I can train a random forest
classifier in sklearn which works well. I would however like to see
the most important features.
I tried simply printing out forest.feature_importances_, but this takes
about 1 second per feature, or about 40,000 seconds overall. This is
much, much longer than the time needed to train the classifier in the
first place.
Is there a more efficient way to find out which features are most important?
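
For reference, here is a minimal sketch of one way to pull out the top
features. It assumes the slowdown comes from reading
forest.feature_importances_ repeatedly (for example, once per feature
inside a loop), since the attribute is recomputed from all trees on each
access. The synthetic data, the forest settings, and the names
importances, top_k and top_idx are illustrative assumptions, not taken
from my actual setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Small synthetic stand-in for the real data (assumption: binary labels).
X, y = make_classification(n_samples=200, n_features=50,
                           n_informative=5, random_state=0)
forest = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)

# Read the attribute once and keep the resulting array;
# each access aggregates the importances over every tree.
importances = forest.feature_importances_

# Indices of the top_k features, most important first.
top_k = 10
top_idx = np.argsort(importances)[::-1][:top_k]

for i in top_idx:
    print("feature %d: %.4f" % (i, importances[i]))
```

If the attribute is already read only once and is still slow, the cause
lies elsewhere and this sketch will not help.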
On 21 July 2016 at 15:58, Nelson Liu <nfliu at uw.edu> wrote:
> If I remember correctly, scikit-learn.org is hosted on GitHub Pages (so the
> maintainers don't have control over downtime and issues like the one you're
> having). Can you connect to GitHub, or any site on GitHub Pages?
> On Thu, Jul 21, 2016, 07:52 Rahul Ahuja <rahul.ahuja at live.com> wrote:
>> Hi there,
>> The sklearn website has been down for a couple of days. Please look into it.
>> I reside in Karachi, Pakistan.
>> Kind regards,
>> Rahul Ahuja
>> scikit-learn mailing list
>> scikit-learn at python.org