[scikit-learn] Random forest fitting very well

muhammad waseem m.waseem.ahmad at gmail.com
Thu Jun 23 06:28:17 EDT 2016


Hi Brian,
Thanks for your email. I did try
tree.export_graphviz(model, out_file='tree.dot'), but I got an error:
AttributeError: 'RandomForestRegressor' object has no attribute 'tree_'.
I think this is because the model is a forest rather than a single tree,
which is why I can't visualise it. Is that right?
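
For reference, one way around this error is to export a single member of
the forest rather than the forest itself. The sketch below is only an
illustration: it uses synthetic data from make_regression as a stand-in for
the real dataset, picks max_depth=3 just to keep the plot small, and assumes
a scikit-learn version in which export_graphviz returns the dot source as a
string when out_file=None.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import export_graphviz

# Synthetic stand-in data (assumption): 10 input variables, one output.
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

# max_depth=3 is chosen here only so the exported tree stays readable.
model = RandomForestRegressor(n_estimators=10, max_depth=3, random_state=0)
model.fit(X, y)

# The forest itself has no tree_ attribute, but each fitted member of
# estimators_ is an ordinary decision tree regressor that does.
single_tree = model.estimators_[0]

# With out_file=None the dot source is returned as a string.
dot_source = export_graphviz(single_tree, out_file=None)

# Save it so it can be rendered with Graphviz.
with open("tree.dot", "w") as f:
    f.write(dot_source)
```

If Graphviz is installed, the file can then be rendered with something like
"dot -Tpng tree.dot -o tree.png".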

Also, do you have any comments on the results that I got with default
values?
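
One possible sanity check for results like these is to compare the training
score with a cross-validated score instead of relying on a single
train/test split. The sketch below is a hedged illustration, not a
diagnosis of the actual data: it again uses synthetic make_regression data
as a stand-in, and the choice of 5 folds is arbitrary.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (assumption): 10 input variables, one output.
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

# Default parameters apart from a small forest, as in the original question.
model = RandomForestRegressor(n_estimators=10, random_state=0)

# R^2 on held-out folds; cross_val_score fits a fresh clone for each fold.
cv_scores = cross_val_score(model, X, y, cv=5, scoring="r2")

# R^2 on the same data the model was fitted on, for comparison.
model.fit(X, y)
train_r2 = model.score(X, y)

print("training R^2: %.3f" % train_r2)
print("5-fold CV R^2: %.3f +/- %.3f" % (cv_scores.mean(), cv_scores.std()))
```

A training score that sits somewhat above the held-out score is typical
for fully grown trees. If the gap is small and stable across folds, that
suggests the good fit is not just an artefact of one particular split.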

Regards
Waseem


On Thu, Jun 23, 2016 at 11:05 AM, Brian Holt <bdholt1 at gmail.com> wrote:

> Hi Muhammad,
>
> If you've not yet read the documentation I would highly recommend starting
> with the Decision Tree [1] and working your way through the examples on
> your own data.  You'll find an example [2] of how to generate a graphviz
> compatible dot file and visualise it.
>
> Once you're satisfied that you understand what each tree is doing with your
> dataset as you vary parameters, then it makes sense to try to inject some
> randomness by varying the features used in each tree or the samples (or
> both [3]).
>
> Regards,
> Brian
>
> [1] http://scikit-learn.org/stable/modules/tree.html
> [2]
> http://scikit-learn.org/stable/modules/generated/sklearn.tree.export_graphviz.html#sklearn.tree.export_graphviz
> [3]
> http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.ExtraTreesClassifier.html
>
> On 23 June 2016 at 10:20, muhammad waseem <m.waseem.ahmad at gmail.com>
> wrote:
>
>> Hi All,
>> I am trying to use random forests for a regression problem with 10 input
>> variables and one output variable. I am getting a very good fit even with
>> default parameters and a low n_estimators. Even with n_estimators = 10, I
>> get an R^2 value of 0.95 on the testing dataset (MSE=23) and a value of
>> 0.99 for the training set. I was wondering whether this is common with
>> random forests or whether I am missing something. Could you please share
>> your experience? The total number of samples (training + testing) is 10971.
>> Also, what are the most important parameters (max_depth, bootstrap,
>> max_leaf_nodes, etc.) that I need to play with to tune my model even
>> further? Lastly, is there a way I can visualise a single tree of my
>> forest (just for demonstration purposes)?
>> Please see the figure below, which shows how well it fits with
>> default values.
>>
>>
>>
>> [image: Inline image 1]
>> Thanks
>> Kindest Regards
>> Waseem
>>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: forest fitting.png
Type: image/png
Size: 86146 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/scikit-learn/attachments/20160623/f0eb7d79/attachment-0001.png>