[scikit-learn] is RandomForest random samples or random features?

斌洪 hongbinopen at gmail.com
Tue Sep 13 09:34:08 EDT 2016


Thanks to all of you. I think I have got the point.  ^_^

2016-09-13 20:30 GMT+08:00 Dale T Smith <Dale.T.Smith at macys.com>:

> Wrong! Apologies, I had a double loop in there.
>
>
>
> For i = 1 to n_estimators:
>
>                 Get a random sample of the training data (a bootstrap
> sample, drawn with replacement).
>
>                 Build a tree on that sample. At each node this involves a
> *random sample of features* and a threshold search for each sampled
> feature.
>
>                 Use the rest of the training data, not in the sample, to
> calculate the out-of-bag score.
>
>
>
> I also edited a bit for clarity. Refer to Gilles Louppe's dissertation for
> details.
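
A minimal sketch of the loop described above, for illustration only. It is not scikit-learn's internal implementation: the toy dataset, `n_estimators = 25`, and the simple vote-counting out-of-bag estimate are all assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Toy binary classification data (assumed for illustration).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
rng = np.random.RandomState(0)
n_samples = X.shape[0]
n_estimators = 25  # illustrative value

trees = []
# Per-class vote counts, filled only by trees that did NOT train on the row.
oob_votes = np.zeros((n_samples, 2))

for i in range(n_estimators):
    # Random samples: draw row indices with replacement (the "bag").
    bag = rng.randint(0, n_samples, n_samples)
    in_bag = np.zeros(n_samples, dtype=bool)
    in_bag[bag] = True

    # Random features: max_features="sqrt" makes the tree consider a
    # random subset of features at every split.
    tree = DecisionTreeClassifier(max_features="sqrt", random_state=i)
    tree.fit(X[bag], y[bag])
    trees.append(tree)

    # Out-of-bag rows are the ones this tree never saw.
    oob_idx = np.flatnonzero(~in_bag)
    pred = tree.predict(X[oob_idx])
    oob_votes[oob_idx, pred] += 1

# Score only rows that were out-of-bag for at least one tree.
has_vote = oob_votes.sum(axis=1) > 0
oob_pred = oob_votes[has_vote].argmax(axis=1)
oob_score = np.mean(oob_pred == y[has_vote])
print(f"OOB accuracy (majority vote): {oob_score:.3f}")
```

Both sources of randomness appear inside the loop: the bootstrap draw happens once per tree, and the feature subset is redrawn at every split inside each tree.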
>
>
>
> ____________________________________________________________
> ______________________________
> *Dale Smith* | Macy's Systems and Technology | IFS eCommerce | Data
> Science
> 770-658-5176 | 5985 State Bridge Road, Johns Creek, GA 30097 |
> dale.t.smith at macys.com
>
>
>
> *From:* scikit-learn [mailto:scikit-learn-bounces+dale.t.smith=
> macys.com at python.org] *On Behalf Of *Dale T Smith
> *Sent:* Tuesday, September 13, 2016 8:24 AM
> *To:* Scikit-learn user and developer mailing list
> *Subject:* Re: [scikit-learn] is RandomForest random samples or random
> features?
>
>
>
> Each tree is built using a random sample with replacement from the
> provided training data. The data not in the sample is used to calculate the
> out-of-bag score. The “bag” is the sampled data.
>
>
>
> The “random” refers to several aspects of the algorithm, including random
> sampling of features.
>
>
>
> So for each tree
>
>                 Get a random sample of the training data
>
>                 For I to n_estimators:
>
>                                 Build a tree – this involves a *random
> sample of features* and thresholds for each feature in the sample at each
> node.
>
>                                 Use the rest of the training data, not in
> the sample, to calculate the out-of-bag score
>
>
>
> Random Forest already incorporates “random features”.
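
As a hedged illustration of that point, the sketch below sets the two relevant `RandomForestClassifier` parameters explicitly: `bootstrap` controls the random sampling of training rows and `max_features` controls the random sampling of features at each split. The toy dataset and the chosen parameter values are assumptions for this example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy data (assumed for illustration).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

forest = RandomForestClassifier(
    n_estimators=100,
    bootstrap=True,        # random samples: each tree gets a bootstrap sample
    max_features="sqrt",   # random features: subset considered at each split
    oob_score=True,        # score each row using trees that did not see it
    random_state=0,
)
forest.fit(X, y)

print("out-of-bag accuracy:", forest.oob_score_)
```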
>
>
>
> https://github.com/glouppe/phd-thesis
>
>
>
>
>
>
> *From:* scikit-learn [mailto:scikit-learn-bounces+
> dale.t.smith=macys.com at python.org] *On Behalf Of* 斌洪
> *Sent:* Tuesday, September 13, 2016 4:16 AM
> *To:* scikit-learn at python.org
> *Subject:* [scikit-learn] is RandomForest random samples or random
> features?
>
>
>
> I have read the scikit-learn user guide's section on random forests:
>
> """
> In random forests (see RandomForestClassifier
> <http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier>
> and RandomForestRegressor
> <http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html#sklearn.ensemble.RandomForestRegressor>
> classes), each tree in the ensemble is built from a sample drawn with
> replacement (i.e., a bootstrap sample) from the training set.
> """
>
> But I prefer to think of RandomForest as:
> """
> features ("attributes", "predictors", "independent variables") are
> randomly sampled
> """
>
> Does RandomForest randomly sample the training samples or the features?
> Where can I find a version of RandomForest that randomly samples features?
>
> Thanks.
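
One possible way to get randomness from features only, sketched under the assumption that turning off row sampling is what is wanted: `bootstrap=False` makes every tree train on the full training set, so `max_features` is the remaining source of variation between trees. The dataset and parameter values are illustrative, and the `samples_only` variant is included only for contrast.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy data (assumed for illustration).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Random features only: no bootstrap, feature subset drawn at each split.
features_only = RandomForestClassifier(
    n_estimators=50, bootstrap=False, max_features="sqrt", random_state=0
).fit(X, y)

# Random samples only (plain bagging of trees): bootstrap rows,
# but consider all features at each split.
samples_only = RandomForestClassifier(
    n_estimators=50, bootstrap=True, max_features=None, random_state=0
).fit(X, y)

print("features only, training accuracy:", features_only.score(X, y))
print("samples only, training accuracy:", samples_only.score(X, y))
```

With `bootstrap=False` there are no out-of-bag rows, so an out-of-bag score is not available for the first model.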
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
>