<div dir="ltr">Thanks to all of you. I think I have got the point. ^_^<br></div><div class="gmail_extra"><br><div class="gmail_quote">2016-09-13 20:30 GMT+08:00 Dale T Smith <span dir="ltr">&lt;<a href="mailto:Dale.T.Smith@macys.com" target="_blank">Dale.T.Smith@macys.com</a>&gt;</span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">





<div link="blue" vlink="purple" lang="EN-US">
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Wrong! Apologies, I had a double loop in there.<u></u><u></u></span></p><span class="">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">For i = 1 to n_estimators:<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">                Get a random sample (with replacement) of the training data.<u></u><u></u></span></p>
</span><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">                Build a tree on that sample. At each node, this involves a
<b>random sample of features</b>, with a threshold evaluated for each sampled feature on the training data sample.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">                Use the rest of the training data, not in the sample, to calculate the out-of-bag score.<u></u><u></u></span></p>
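<p class="MsoNormal">As a rough sketch of one reading of the steps above (a fresh bootstrap sample per tree, a random feature subset at each node, and out-of-bag rows used for scoring), something like the following could be written. The function name <code>fit_forest_sketch</code> and the vote-averaging scheme are assumptions made for illustration; this is not how scikit-learn implements its forests internally.</p>

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier


def fit_forest_sketch(X, y, n_estimators=25, seed=0):
    """Hypothetical helper: bagged trees with per-node feature subsampling.

    Returns the fitted trees and an out-of-bag accuracy estimate.
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    classes = np.unique(y)
    # Accumulated class probabilities, filled only by trees that did not see the row.
    votes = np.zeros((n_samples, classes.size))
    trees = []
    for i in range(n_estimators):
        # Bootstrap sample: draw n_samples row indices with replacement.
        in_bag = rng.integers(0, n_samples, size=n_samples)
        # Rows never drawn for this tree are its out-of-bag rows.
        oob = np.setdiff1d(np.arange(n_samples), in_bag)
        # max_features="sqrt" makes the tree consider a random feature subset at each node.
        tree = DecisionTreeClassifier(max_features="sqrt", random_state=seed + i)
        tree.fit(X[in_bag], y[in_bag])
        trees.append(tree)
        if oob.size:
            # Map this tree's class columns onto the global class order.
            cols = np.searchsorted(classes, tree.classes_)
            votes[np.ix_(oob, cols)] += tree.predict_proba(X[oob])
    # Score only rows that were out of bag for at least one tree.
    scored = votes.sum(axis=1) != 0
    oob_pred = classes[votes[scored].argmax(axis=1)]
    oob_accuracy = float(np.mean(oob_pred == y[scored]))
    return trees, oob_accuracy


X, y = make_classification(n_samples=200, n_features=10, random_state=0)
trees, oob_accuracy = fit_forest_sketch(X, y)
print("trees:", len(trees), "out-of-bag accuracy:", oob_accuracy)
```

<p class="MsoNormal">Each row is scored only by the trees that never saw it during fitting, which is what makes the out-of-bag estimate behave like a held-out score.</p>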
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">I also edited a bit for clarity. Refer to Gilles Louppe’s dissertation for details.<u></u><u></u></span></p><span class="">
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Arial","sans-serif";color:red;background:white"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Arial","sans-serif";color:red;background:white">______________________________<wbr>______________________________<wbr>______________________________</span><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#212121"><br>
</span><b><span style="font-size:10.0pt;font-family:"Arial","sans-serif";color:#1f497d;background:white">Dale Smith</span></b><span style="font-size:10.0pt;font-family:"Arial","sans-serif";color:#1f497d;background:white"> | Macy's Systems and Technology | IFS
 eCommerce | Data Science<br>
</span><span style="font-size:10.0pt;font-family:"Arial","sans-serif";color:#1f497d">770-658-5176 | 5985 State Bridge Road, Johns Creek, GA 30097 | <a href="mailto:dale.t.smith@macys.com" target="_blank">dale.t.smith@macys.com</a><u></u><u></u></span></p>
</div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
</span><div>
<div style="border:none;border-top:solid #b5c4df 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> scikit-learn [mailto:<a href="mailto:scikit-learn-bounces%2Bdale.t.smith" target="_blank">scikit-learn-bounces+<wbr>dale.t.smith</a>=<a href="mailto:macys.com@python.org" target="_blank">macys.com@python.<wbr>org</a>]
<b>On Behalf Of </b>Dale T Smith<br>
<b>Sent:</b> Tuesday, September 13, 2016 8:24 AM<br>
<b>To:</b> Scikit-learn user and developer mailing list<br>
<b>Subject:</b> Re: [scikit-learn] is RandomForest random samples or random features?<u></u><u></u></span></p>
</div>
</div><div><div class="h5">
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Each tree is built using a random sample with replacement from the provided training data. The data not in the sample is used to calculate the out-of-bag score.
 The “bag” is the sampled data.<u></u><u></u></span></p>
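<p class="MsoNormal">To make the "bag" concrete, here is a minimal standard-library sketch of drawing one bootstrap sample and listing the rows left out of it. The helper name <code>bootstrap_split</code> is hypothetical and used only for illustration.</p>

```python
import random


def bootstrap_split(n_samples, seed=0):
    """Hypothetical helper: return (in_bag, out_of_bag) row indices for one tree."""
    rng = random.Random(seed)
    # Draw n_samples indices with replacement, so some rows repeat and others are never drawn.
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    # Everything that was never drawn is out of bag for this tree.
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag


in_bag, out_of_bag = bootstrap_split(1000)
print(len(set(in_bag)), "distinct rows in the bag;", len(out_of_bag), "rows out of bag")
```

<p class="MsoNormal">For large training sets, roughly 1/e (about 37%) of the rows are expected to be out of bag for any given tree, since each row is missed by a single draw with probability 1 - 1/n.</p>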
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">The “random” refers to several aspects of the algorithm, including the random sampling of features at each split.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">So for each tree<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">                Get a random sample of the training data<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">                For I to n_estimators:<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">                                Build a tree – this involves a
<b>random sample of features</b> and thresholds for each feature in the sample at each node.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">                                Use the rest of the training data, not in the sample, to calculate the out-of-bag score<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Random Forest already incorporates “random features”.<u></u><u></u></span></p>
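<p class="MsoNormal">As an illustrative sketch on synthetic data, both sources of randomness can be controlled from <code>RandomForestClassifier</code> itself: <code>bootstrap</code> toggles the row sampling and <code>max_features</code> sets the size of the feature subset tried at each split. The dataset and parameter values below are assumptions chosen only for the example.</p>

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, used only to have something to fit.
X, y = make_classification(n_samples=300, n_features=12, random_state=0)

# Random samples AND random features: bootstrap rows per tree,
# plus a random feature subset considered at each split.
forest = RandomForestClassifier(
    n_estimators=100,
    max_features="sqrt",
    bootstrap=True,
    oob_score=True,
    random_state=0,
).fit(X, y)
print("out-of-bag accuracy:", forest.oob_score_)

# Random features only: every tree is fitted on all rows,
# but splits still draw a random feature subset.
features_only = RandomForestClassifier(
    n_estimators=100,
    max_features="sqrt",
    bootstrap=False,
    random_state=0,
).fit(X, y)
print("trees fitted:", len(features_only.estimators_))
```

<p class="MsoNormal">With <code>bootstrap=False</code> there are no out-of-bag rows, so <code>oob_score</code> cannot be used in that configuration.</p>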
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><a href="https://github.com/glouppe/phd-thesis" target="_blank">https://github.com/glouppe/<wbr>phd-thesis</a><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> scikit-learn [<a href="mailto:scikit-learn-bounces+dale.t.smith=macys.com@python.org" target="_blank">mailto:scikit-learn-bounces+<wbr>dale.t.smith=macys.com@python.<wbr>org</a>]
<b>On Behalf Of </b>??<br>
<b>Sent:</b> Tuesday, September 13, 2016 4:16 AM<br>
<b>To:</b> <a href="mailto:scikit-learn@python.org" target="_blank">scikit-learn@python.org</a><br>
<b>Subject:</b> [scikit-learn] is RandomForest random samples or random features?<u></u><u></u></span></p>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<div>
<div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt">I have read the scikit-learn user guide on RandomForest:<br>
<br>
"""<br>
In random forests (see <a href="http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier" title="sklearn.ensemble.RandomForestClassifier" target="_blank">
<span><span style="font-size:10.0pt;font-family:"Courier New"">RandomForestClassifier</span></span></a> and
<a href="http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html#sklearn.ensemble.RandomForestRegressor" title="sklearn.ensemble.RandomForestRegressor" target="_blank">
<span><span style="font-size:10.0pt;font-family:"Courier New"">RandomForestRegressor</span></span></a> classes), each tree in the ensemble is built from a sample drawn with replacement (i.e., a bootstrap sample) from the training set.<br>
"""<u></u><u></u></p>
</div>
<p class="MsoNormal" style="margin-bottom:12.0pt">But I think of RandomForest as:<br>
"""<br>
features ("attributes", "predictors", "independent variables") are randomly sampled<br>
"""<u></u><u></u></p>
</div>
<p class="MsoNormal" style="margin-bottom:12.0pt">Does RandomForest use random samples or random features? Where can I find a random-features version of RandomForest?<u></u><u></u></p>
</div>
<p class="MsoNormal">Thanks.<u></u><u></u></p>
</div>
</div></div></div>
</div>

<br></blockquote></div><br></div>