[scikit-learn] How does the random state influence the decision tree splits?

Javier López jlopez at ende.cc
Sat Oct 27 19:16:39 EDT 2018

Hi Sebastian,

I think the random state is used to select the subset of features that is
considered at each split (see the `max_features` parameter).
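
A quick way to check this empirically is sketched below. The iris dataset
and the helper name `split_features` are my own choices for illustration,
not anything from the scikit-learn code base. The sketch fits the same tree
with different seeds and prints which feature index is used at the first few
nodes.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)


def split_features(seed, max_features=None):
    """Fit a tree and return the feature index used at each node.

    Hypothetical helper for this example; leaf nodes show up as
    negative values in `tree_.feature`.
    """
    clf = DecisionTreeClassifier(max_features=max_features, random_state=seed)
    clf.fit(X, y)
    return clf.tree_.feature.copy()


# Same seed, same data: the fitted trees should be identical.
same_seed_a = split_features(0, max_features=1)
same_seed_b = split_features(0, max_features=1)
print("same seed, identical tree:", np.array_equal(same_seed_a, same_seed_b))

# Different seeds with max_features=1: only one randomly drawn feature is
# considered per split, so the chosen features may differ between seeds.
for seed in range(3):
    print("seed", seed, "first nodes split on:", split_features(seed, max_features=1)[:5])
```

As far as I know, the features are also shuffled at each split when
`max_features` is left at None, so ties between equally good splits can be
resolved differently from run to run unless `random_state` is fixed.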


On Sun, Oct 28, 2018 at 12:07 AM Sebastian Raschka <
mail at sebastianraschka.com> wrote:

> Hi all,
> when I was implementing a bagging classifier based on scikit-learn's
> DecisionTreeClassifier, I noticed that the results were not deterministic
> and found that this was due to the random_state in the
> DecisionTreeClassifier (which is set to None by default).
> I am wondering what exactly this random state is used for? I can imagine
> it being used for resolving ties if the information gain for multiple
> features is the same, or it could be that the splits of continuous
> features are different? (I thought the heuristic is to sort the feature
> values and to consider those values next to each other that are associated
> with examples that have different class labels -- but is there maybe some
> random subselection involved?)
> If someone knows more about this, where the random_state is used, I'd be
> happy to hear it :)
> Also, we could then maybe add the info to the DecisionTreeClassifier's
> docstring, which is currently a bit too generic to be useful, I think:
> https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/tree/tree.py
>     random_state : int, RandomState instance or None, optional (default=None)
>         If int, random_state is the seed used by the random number generator;
>         If RandomState instance, random_state is the random number generator;
>         If None, the random number generator is the RandomState instance used
>         by `np.random`.
> Best,
> Sebastian