[scikit-learn] obtaining intervals from the decision tree structure

Guillaume Lemaître g.lemaitre58 at gmail.com
Tue Mar 7 10:41:47 EST 2023


Hi Sole,

You can call `apply` on the training `X` to get the index of the leaf that each sample falls into. A groupby on those leaf indices should then give you the statistic that you want (for example, the min and max of the feature within each leaf).
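
Here is a minimal sketch of that idea, not a definitive recipe. The toy data, the pandas DataFrame, and the column names `leaf` and `feature_0` are assumptions made for illustration. The last part reads the split thresholds from `tree_.threshold` instead, which is an alternative to the groupby and only gives a clean list of limits when the tree uses a single feature.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Toy single-feature data (made up for illustration only).
rng = np.random.RandomState(0)
X = rng.uniform(0, 100, size=(200, 1))
y = ((X[:, 0] > 10) & (X[:, 0] < 60)).astype(int)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# `apply` returns, for each sample, the index of the leaf it ends up in.
leaf_ids = tree.apply(X)

# Group the feature values by leaf to get per-leaf statistics.
# These are the observed min/max in the training data, not the split thresholds.
summary = (
    pd.DataFrame({"leaf": leaf_ids, "feature_0": X[:, 0]})
    .groupby("leaf")["feature_0"]
    .agg(["min", "max", "count"])
    .sort_values("min")
)
print(summary)

# Alternative (assumes a single-feature tree): read the split thresholds directly.
# Internal nodes have feature >= 0; leaves are marked with a negative value.
is_split = tree.tree_.feature >= 0
thresholds = np.sort(tree.tree_.threshold[is_split])
limits = np.concatenate([[-np.inf], thresholds, [np.inf]])
print(limits)
```

The groupby gives the range of training values seen in each leaf, while `limits` gives the interval edges in the form asked for in the question. With more than one feature, the thresholds would need to be collected per feature or per path from root to leaf.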

Cheers,
--
Guillaume Lemaitre
Scikit-learn @ Inria Foundation
https://glemaitre.github.io/

> On 7 Mar 2023, at 15:53, Sole Galli via scikit-learn <scikit-learn at python.org> wrote:
> 
> Hello,
> 
> I would like to obtain the final intervals from the decision tree structure. I am not interested in every node, just the limits that take a sample to a final decision/leaf.
> 
> For example, if the tree structure is this one:
> |--- feature_0 <= 0.08
> |   |--- class: 0
> |--- feature_0 >  0.08
> |   |--- feature_0 <= 8.50
> |   |   |--- feature_0 <= 1.50
> |   |   |   |--- class: 1
> |   |   |--- feature_0 >  1.50
> |   |   |   |--- class: 1
> |   |--- feature_0 >  8.50
> |   |   |--- feature_0 <= 60.25
> |   |   |   |--- class: 0
> |   |   |--- feature_0 >  60.25
> |   |   |   |--- class: 0
> Then, I would like to obtain these limits:
> 0-0.08; 0.08-1.50; 1.50-8.50; 8.50-60.25; >60.25
> 
> Potentially as the following numpy array:
> [-np.inf, 0.08, 1.5, 8.5, 60.25, np.inf]
> 
> Is it possible?
> 
> I have a Stack Overflow question with more details and code here:
> https://stackoverflow.com/questions/75663472/how-to-obtain-the-interval-limits-from-a-decision-tree-with-scikit-learn
> 
> Thank you!
> Sole
> 
