[scikit-learn] is RandomForest random samples or random features?
Dale T Smith
Dale.T.Smith at macys.com
Tue Sep 13 08:30:02 EDT 2016
Wrong! Apologies, I had a double loop in there.
For i = 1 to n_estimators:
    Get a random sample (with replacement) of the training data.
    Build a tree on that sample. At each node, this involves a random sample of features and a threshold search for each sampled feature.
    Use the rest of the training data, not in the sample, to calculate the out-of-bag score.
I also edited a bit for clarity. Refer to Gilles Louppe’s dissertation for details.
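The loop above can be sketched in plain Python. This is only an illustration of the sampling steps, not scikit-learn's implementation: it assumes each tree draws its own bootstrap sample (as the quoted message below states), and the helper names (bootstrap_indices, out_of_bag_indices, candidate_features, plan_forest) are made up for this example. No tree is actually grown; the sketch only records which rows and which features each tree would see.

```python
import random


def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices with replacement (the "bag")."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def out_of_bag_indices(n_samples, bag):
    """Rows never drawn into the bag; they are used for the out-of-bag score."""
    in_bag = set(bag)
    return [i for i in range(n_samples) if i not in in_bag]


def candidate_features(n_features, max_features, rng):
    """Random subset of feature indices considered at one node."""
    return rng.sample(range(n_features), max_features)


def plan_forest(n_samples, n_features, n_estimators, max_features, seed=0):
    """For each tree, record its bag, its out-of-bag rows, and the
    feature subset that would be considered at the root node."""
    rng = random.Random(seed)
    plan = []
    for _ in range(n_estimators):
        bag = bootstrap_indices(n_samples, rng)
        oob = out_of_bag_indices(n_samples, bag)
        root_features = candidate_features(n_features, max_features, rng)
        plan.append({"bag": bag, "oob": oob, "root_features": root_features})
    return plan


plan = plan_forest(n_samples=10, n_features=6, n_estimators=3, max_features=2)
for t, tree in enumerate(plan):
    print(t, sorted(set(tree["bag"])), tree["oob"], tree["root_features"])
```

In a real forest a fresh feature subset is drawn at every node, not just the root; the sketch shows one draw per tree to keep it short.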
__________________________________________________________________________________________
Dale Smith | Macy's Systems and Technology | IFS eCommerce | Data Science
770-658-5176 | 5985 State Bridge Road, Johns Creek, GA 30097 | dale.t.smith at macys.com
From: scikit-learn [mailto:scikit-learn-bounces+dale.t.smith=macys.com at python.org] On Behalf Of Dale T Smith
Sent: Tuesday, September 13, 2016 8:24 AM
To: Scikit-learn user and developer mailing list
Subject: Re: [scikit-learn] is RandomForest random samples or random features?
Each tree is built using a random sample with replacement from the provided training data. The data not in the sample is used to calculate the out-of-bag score. The “bag” is the sampled data.
The “random” refers to several aspects of the algorithm, including random sampling of features.
So for each tree
Get a random sample of the training data
For I to n_estimators:
Build a tree – this involves a random sample of features and thresholds for each feature in the sample at each node.
Use the rest of the training data, not in the sample, to calculate the out-of-bag score
Random Forest already incorporates “random features”.
https://github.com/glouppe/phd-thesis
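For what it is worth, both kinds of randomness are exposed as parameters of RandomForestClassifier. The following is a minimal sketch, assuming a synthetic dataset from make_classification and arbitrary settings (n_estimators=50, max_features="sqrt") chosen only for illustration. Setting bootstrap=False trains every tree on the full training set, so the only randomness left is the feature subset drawn at each split.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, used here only so the example is self-contained.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Random samples AND random features (the default behaviour).
forest = RandomForestClassifier(
    n_estimators=50,
    bootstrap=True,       # each tree gets its own sample drawn with replacement
    max_features="sqrt",  # random subset of features considered at each split
    oob_score=True,       # score each tree on the rows left out of its sample
    random_state=0,
)
forest.fit(X, y)
print("out-of-bag accuracy:", forest.oob_score_)

# Random features only: every tree sees the whole training set.
features_only = RandomForestClassifier(
    n_estimators=50,
    bootstrap=False,
    max_features="sqrt",
    random_state=0,
)
features_only.fit(X, y)
```

Note that the out-of-bag score is only available with bootstrap=True, since without bootstrapping no rows are left out of a tree's training data.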
From: scikit-learn [mailto:scikit-learn-bounces+dale.t.smith=macys.com at python.org] On Behalf Of ??
Sent: Tuesday, September 13, 2016 4:16 AM
To: scikit-learn at python.org
Subject: [scikit-learn] is RandomForest random samples or random features?
I have read the sklearn user guide section on RandomForest:
"""
In random forests (see RandomForestClassifier<http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier> and RandomForestRegressor<http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html#sklearn.ensemble.RandomForestRegressor> classes), each tree in the ensemble is built from a sample drawn with replacement (i.e., a bootstrap sample) from the training set.
"""
But I prefer to think of RandomForest as:
"""
features ("attributes", "predictors", "independent variables") are randomly sampled
"""
Is RandomForest based on random samples or random features? Where can I find a random-features version of RandomForest?
thx.