[scikit-learn] Random Forest with Bootstrapping
Ibrahim Dalal
cs14btech11041 at iith.ac.in
Mon Oct 3 14:25:51 EDT 2016
Dear Developers,
From what little I learned last night about Random Forests, each tree is
trained on a sub-sample of the original dataset (usually drawn with
replacement).
(Note: Please do correct me if I am not making any sense.)
RandomForestClassifier has a 'bootstrap' option. The API documentation
states the following:
> The sub-sample size is always the same as the original input sample size
> but the samples are drawn with replacement if bootstrap=True (default).
Now, what I am not able to understand is this: if the entire dataset is used
to train each of the trees, then how does the classifier estimate the OOB
(out-of-bag) error? None of the entries in the dataset would be out-of-bag
for any of the trees.
(Pardon me if all this sounds BS)
Help this mere mortal.
Thanks
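
A minimal sketch of the sampling described in the quoted API text, assuming
NumPy is available. It draws as many row indices as there are rows, with
replacement, and counts how many rows were never drawn. The function name
bootstrap_oob_fraction is hypothetical and this is not scikit-learn's own
code; it only illustrates that a same-size sample drawn with replacement
need not contain every row.

```python
import numpy as np

def bootstrap_oob_fraction(n_samples, seed=0):
    """Draw n_samples indices with replacement; return the fraction never drawn."""
    rng = np.random.default_rng(seed)
    # Same size as the dataset, but with replacement, so duplicates occur.
    drawn = rng.integers(0, n_samples, size=n_samples)
    # Rows that were never drawn are out-of-bag for this one sample.
    oob_mask = np.ones(n_samples, dtype=bool)
    oob_mask[np.unique(drawn)] = False
    return oob_mask.mean()

frac = bootstrap_oob_fraction(10_000)
print(f"fraction of rows never drawn: {frac:.3f}")
```

For large n the expected fraction of rows left out of one such sample
approaches 1/e, roughly 0.37, so each tree would have rows it never saw.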