[scikit-learn] Need Corresponding indices array of values in each split of a DecisionTreeClassifier

Nixon Raj nixnmtm at gmail.com
Wed Feb 8 04:43:17 EST 2017


Hi Joel and Jeff,

Thanks for your valuable comments; I got that to work.

On 8 February 2017 at 08:13, Jeff Blackburne <jblackburne at gmail.com> wrote:

> Nixon,
>
> If you are using version 0.18 or later, you can reconstruct the
> information you need using the `decision_path` method:
>
> http://scikit-learn.org/stable/auto_examples/tree/plot_unveil_tree_structure.html
>
> -Jeff
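
A minimal sketch of how the `decision_path` approach could look, for readers of the archive. The synthetic data from `make_classification` and the helper name `samples_at_node` are assumptions for illustration only; they stand in for the real 223-sample dataset, which is not shown in this thread.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in for the real data (223 samples, 3 features).
X, y = make_classification(n_samples=223, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
clf = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)

# Sparse indicator matrix: entry (i, j) is nonzero when sample i passes
# through node j. CSC layout makes per-node (column) lookups cheap.
node_indicator = clf.decision_path(X).tocsc()


def samples_at_node(node_id):
    """Return the sorted row indices of X that pass through node_id."""
    start = node_indicator.indptr[node_id]
    stop = node_indicator.indptr[node_id + 1]
    return np.sort(node_indicator.indices[start:stop])


tree = clf.tree_
split_indices = {}
for node_id in range(tree.node_count):
    left = tree.children_left[node_id]
    right = tree.children_right[node_id]
    if left == right:  # leaf node: both children are -1
        continue
    split_indices[node_id] = (samples_at_node(left), samples_at_node(right))

# Indices sent to the left and right child at the root split (node 0).
left_idx, right_idx = split_indices[0]
print(len(left_idx), len(right_idx))
```

Indexing the labels with these arrays, for example `y[left_idx]`, then gives the per-class breakdown that the dot file reports as `value` for that child.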
>
>
> On Tue, Feb 7, 2017 at 3:21 PM, Joel Nothman <joel.nothman at gmail.com>
> wrote:
>
>> I don't think putting that array of indices in a visualisation is a great
>> idea!
>>
>> If you use my_tree.apply(X) you will be able to determine which leaf each
>> instance in X lands up at, and potentially trace up the tree from there.
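
A minimal sketch of the `apply` route Joel describes, again on synthetic data. The `parent` lookup and the `indices_at` dictionary are hypothetical helpers built here from the tree's `children_left` and `children_right` arrays; they are one possible way to trace up from each leaf, not a built-in scikit-learn feature.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in for the real data (223 samples, 3 features).
X, y = make_classification(n_samples=223, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
my_tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)

# Leaf node id that each row of X ends up in.
leaf_of = my_tree.apply(X)

# Build a child -> parent lookup; the root (node 0) keeps parent -1.
t = my_tree.tree_
parent = np.full(t.node_count, -1, dtype=int)
for node_id in range(t.node_count):
    for child in (t.children_left[node_id], t.children_right[node_id]):
        if child != -1:  # -1 marks "no child", i.e. node_id is a leaf
            parent[child] = node_id

# Walk from each sample's leaf up to the root, recording the sample index
# at every node on the way.
indices_at = {n: [] for n in range(t.node_count)}
for i, leaf in enumerate(leaf_of):
    node = int(leaf)
    while node != -1:
        indices_at[node].append(i)
        node = int(parent[node])

# Sample indices that reach the left child of the root.
print(indices_at[int(t.children_left[0])][:10])
```

Because the loop visits the samples in order, each list in `indices_at` is already sorted by row index.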
>>
>> On 8 February 2017 at 01:26, Nixon Raj <nixnmtm at gmail.com> wrote:
>>
>>>
>>> For example, in the decision tree dot file below, I have 223 samples,
>>> which split into [174, 49] at the first split and [110, 1] at the
>>> second split.
>>>
>>> I would like to get the array of sample indices for the values at each
>>> split, like
>>>
>>> [174, 49] and their corresponding indices (idx), like
>>> [[0, 1, 5, 7, ..., 200, 221], [3, 4, 6, ..., 199, 222, 223]]
>>>
>>> [110, 1] and their corresponding indices (idx), like
>>> [[0, 5, ..., 200, 221], [7]]
>>>
>>> Please help me.
>>>
>>> digraph Tree {
>>> node [shape=box] ;
>>> 0 [label="X[0] <= 13.9191\nentropy = 0.7597\nsamples = 223\nvalue = [174, 49]"] ;
>>> 1 [label="X[1] <= 3.1973\nentropy = 0.0741\nsamples = 111\nvalue = [110, 1]"] ;
>>> 0 -> 1 [labeldistance=2.5, labelangle=45, headlabel="True"] ;
>>> 2 [label="entropy = 0.0\nsamples = 109\nvalue = [109, 0]"] ;
>>> 1 -> 2 ;
>>> 3 [label="entropy = 1.0\nsamples = 2\nvalue = [1, 1]"] ;
>>> 1 -> 3 ;
>>> 4 [label="X[1] <= 3.1266\nentropy = 0.9852\nsamples = 112\nvalue = [64, 48]"] ;
>>> 0 -> 4 [labeldistance=2.5, labelangle=-45, headlabel="False"] ;
>>> 5 [label="X[2] <= -0.4882\nentropy = 0.7919\nsamples = 63\nvalue = [48, 15]"] ;
>>> 4 -> 5 ;
>>> 6 [label="entropy = 0.684\nsamples = 11\nvalue = [2, 9]"] ;
>>> 5 -> 6 ;
>>> 7 [label="X[2] <= 0.5422\nentropy = 0.5159\nsamples = 52\nvalue = [46, 6]"] ;
>>> 5 -> 7 ;
>>> 8 [label="entropy = 0.0\nsamples = 18\nvalue = [18, 0]"] ;
>>> 7 -> 8 ;
>>> 9 [label="X[2] <= 0.6497\nentropy = 0.6723\nsamples = 34\nvalue = [28, 6]"] ;
>>> 7 -> 9 ;
>>> 10 [label="entropy = 0.0\nsamples = 1\nvalue = [0, 1]"] ;
>>> 9 -> 10 ;
>>> 11 [label="X[2] <= 1.887\nentropy = 0.6136\nsamples = 33\nvalue = [28, 5]"] ;
>>> 9 -> 11 ;
>>> 12 [label="entropy = 0.0\nsamples = 12\nvalue = [12, 0]"] ;
>>> 11 -> 12 ;
>>> 13 [label="X[2] <= 2.6691\nentropy = 0.7919\nsamples = 21\nvalue = [16, 5]"] ;
>>> 11 -> 13 ;
>>> 14 [label="entropy = 0.8113\nsamples = 4\nvalue = [1, 3]"] ;
>>> 13 -> 14 ;
>>> 15 [label="entropy = 0.5226\nsamples = 17\nvalue = [15, 2]"] ;
>>> 13 -> 15 ;
>>> 16 [label="X[0] <= 17.3284\nentropy = 0.9113\nsamples = 49\nvalue = [16, 33]"] ;
>>> 4 -> 16 ;
>>> 17 [label="entropy = 0.9183\nsamples = 6\nvalue = [4, 2]"] ;
>>> 16 -> 17 ;
>>> 18 [label="X[2] <= 19.7048\nentropy = 0.8542\nsamples = 43\nvalue = [12, 31]"] ;
>>> 16 -> 18 ;
>>> 19 [label="X[2] <= 5.8511\nentropy = 0.8296\nsamples = 42\nvalue = [11, 31]"] ;
>>> 18 -> 19 ;
>>> 20 [label="X[0] <= 31.8916\nentropy = 0.878\nsamples = 37\nvalue = [11, 26]"] ;
>>> 19 -> 20 ;
>>> 21 [label="X[1] <= 3.3612\nentropy = 0.6666\nsamples = 23\nvalue = [4, 19]"] ;
>>> 20 -> 21 ;
>>> 22 [label="entropy = 0.8905\nsamples = 13\nvalue = [4, 9]"] ;
>>> 21 -> 22 ;
>>> 23 [label="entropy = 0.0\nsamples = 10\nvalue = [0, 10]"] ;
>>> 21 -> 23 ;
>>> 24 [label="entropy = 1.0\nsamples = 14\nvalue = [7, 7]"] ;
>>> 20 -> 24 ;
>>> 25 [label="entropy = 0.0\nsamples = 5\nvalue = [0, 5]"] ;
>>> 19 -> 25 ;
>>> 26 [label="entropy = 0.0\nsamples = 1\nvalue = [1, 0]"] ;
>>> 18 -> 26 ;
>>> }
>>>
>>>
>>> _______________________________________________
>>> scikit-learn mailing list
>>> scikit-learn at python.org
>>> https://mail.python.org/mailman/listinfo/scikit-learn


-- 
Regards

Nixon Raj N
Department of Biological Science and Technology
Institute of Bioinformatics and Systems Biology
National Chiao Tung University
208 Lab Building 1, 75 Bo-Ai St.
Dong District, Hsinchu, Taiwan 30062
(R.O.C.)
Mob:+886-989353921
Office ext: 56997