<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<br>
<br>
<div class="moz-cite-prefix">On 2/14/20 5:47 PM, Paul Chike Ofoche
via scikit-learn wrote:<br>
</div>
<blockquote type="cite"
cite="mid:1478783340.5521193.1581731226647@mail.yahoo.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div class="ydp59a7a7a7yahoo-style-wrap" style="font-family:
Helvetica Neue,Helvetica,Arial,sans-serif; font-size: 13px;"> </div>
<div class="ydp59a7a7a7yahoo-style-wrap" style="font-family:
Helvetica Neue,Helvetica,Arial,sans-serif; font-size: 13px;"
dir="ltr" data-setdir="false">
<p style="text-align:justify">Many thanks Nicolas and Andreas.</p>
<br>
<br>
</div>
<div class="ydp59a7a7a7yahoo-style-wrap" style="font-family:
Helvetica Neue,Helvetica,Arial,sans-serif; font-size: 13px;"
dir="ltr" data-setdir="false">
</div>
<p class="ydp59a7a7a7yahoo-style-wrap" style="text-align:justify"
dir="ltr" data-setdir="false">I was wondering whether this
multioutput
handling capability of the RandomForestRegressor has been added
recently. In order to verify, I went on a fact-finding mission
by re-running the exact same codes I had in 2018 and noticed
quite a number of
changes. I guess that many moons have passed since then!</p>
<div class="ydp59a7a7a7yahoo-style-wrap" style="font-family:
Helvetica Neue,Helvetica,Arial,sans-serif; font-size: 13px;"
dir="ltr" data-setdir="false">
</div>
<p class="ydp59a7a7a7yahoo-style-wrap" style="text-align:justify"
dir="ltr" data-setdir="false">For instance, <span
style="background-image: none; background-repeat: repeat;
background-attachment: scroll; background-size: auto;
background-color: yellow;">sklearn.cross_validation</span>
has been deprecated since I last used it in 2018 (and
replaced by <span style="background-image: none;
background-repeat: repeat; background-attachment: scroll;
background-size: auto; background-color: yellow;">sklearn.model_selection</span>).
Also, such errors as:</p>
<p class="ydp59a7a7a7yahoo-style-wrap" style="text-align:justify"
dir="ltr" data-setdir="false">i. <span style="background-image:
none; background-repeat: repeat; background-attachment:
scroll; background-size: auto; background-color: yellow;">ValueError:
Expected 2D array, got scalar array instead:</span></p>
<div class="ydp59a7a7a7yahoo-style-wrap" style="font-family:
Helvetica Neue,Helvetica,Arial,sans-serif; font-size: 13px;"
dir="ltr" data-setdir="false">
</div>
<p class="ydp59a7a7a7yahoo-style-wrap" style="text-align:justify"
dir="ltr" data-setdir="false"><span style="background-image:
none; background-repeat: repeat; background-attachment:
scroll; background-size: auto; background-color: yellow;">array=6.5.</span></p>
<div class="ydp59a7a7a7yahoo-style-wrap" style="font-family:
Helvetica Neue,Helvetica,Arial,sans-serif; font-size: 13px;"
dir="ltr" data-setdir="false">
</div>
<p class="ydp59a7a7a7yahoo-style-wrap" style="text-align:justify"
dir="ltr" data-setdir="false"><span style="background-image:
none; background-repeat: repeat; background-attachment:
scroll; background-size: auto; background-color: yellow;">Reshape
your data either using array.reshape(-1, 1) if
your data has a single feature or array.reshape(1, -1) if it
contains a single
sample.</span></p>
<div class="ydp59a7a7a7yahoo-style-wrap" style="font-family:
Helvetica Neue,Helvetica,Arial,sans-serif; font-size: 13px;"
dir="ltr" data-setdir="false">
</div>
<p class="ydp59a7a7a7yahoo-style-wrap" style="text-align:justify"
dir="ltr" data-setdir="false">and</p>
<p class="ydp59a7a7a7yahoo-style-wrap" style="text-align:justify"
dir="ltr" data-setdir="false">ii. <span
style="background-image: none; background-repeat: repeat;
background-attachment: scroll; background-size: auto;
background-color: yellow;">DataConversionWarning: A
column-vector y was passed when
a 1d array was expected. Please change the shape of y to
(n_samples,), for
example using ravel().</span></p>
</blockquote>
All of these were already errors in 2018; you might not have had the
most up-to-date version then ;)<br>
cross_validation was deprecated in 2016:<br>
<a
href="https://scikit-learn.org/dev/whats_new/v0.18.html#version-0-18">https://scikit-learn.org/dev/whats_new/v0.18.html#version-0-18</a><br>
<br>
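For reference, a minimal sketch of how both messages can be avoided, following the hints in the error text itself. The data is synthetic and the variable names are placeholders, not taken from your script:<br>
<pre>
```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
# train_test_split now lives in model_selection (formerly cross_validation)
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(50, 1))   # one feature, shape (n_samples, 1)
y = rng.normal(size=(50, 1))           # column vector, shape (n_samples, 1)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

reg = RandomForestRegressor(n_estimators=10, random_state=0)
# ravel() turns y into shape (n_samples,), which avoids DataConversionWarning
reg.fit(X_train, y_train.ravel())

# A single sample must be 2D: reshape the scalar 6.5 to shape (1, 1)
sample = np.array(6.5).reshape(1, -1)
pred = reg.predict(sample)
print(sample.shape, pred.shape)
```
</pre>
Here reshape(1, -1) is the right call because 6.5 is one sample with one feature; reshape(-1, 1) is for a 1D array holding many samples of a single feature.<br>
<br>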
<blockquote type="cite"
cite="mid:1478783340.5521193.1581731226647@mail.yahoo.com">
<div class="ydp59a7a7a7yahoo-style-wrap" style="font-family:
Helvetica Neue,Helvetica,Arial,sans-serif; font-size: 13px;"
dir="ltr" data-setdir="false">
</div>
<p class="ydp59a7a7a7yahoo-style-wrap" style="text-align:justify"
dir="ltr" data-setdir="false">when passing a <b>scalar</b> and
a <b>column-vector y</b> respectively are entirely new since
I last used
Python’s RandomForestRegressor. Previously, they worked just
fine without
throwing out any errors. I know that the “multioutputs” were
handled back in 2018
(I actually tested this capability back then), but I assumed
that the
regressors were fit per target i.e. that there was no
correlation between
targets.</p>
</blockquote>
I can't find a changelog entry, but I'm pretty sure this goes back to
2014 or so. It was definitely present in 2018.<br>
<blockquote type="cite"
cite="mid:1478783340.5521193.1581731226647@mail.yahoo.com">
<div class="ydp59a7a7a7yahoo-style-wrap" style="font-family:
Helvetica Neue,Helvetica,Arial,sans-serif; font-size: 13px;"
dir="ltr" data-setdir="false">
</div>
<p class="ydp59a7a7a7yahoo-style-wrap" style="text-align:justify"
dir="ltr" data-setdir="false">Today, for comparison, I generated
some random target outputs (three columns) and using the same <b>random_state</b>,
I ran
the all-inclusive multioutput prediction (with all three output
targets
simultaneously vs. re-running each output prediction one at a
time). The results are different, implying that some form of
correlation takes place amongst the multioutput targets, when
predicted
together. (For completeness, I display the first 28 predicted
output values from the multioutput prediction as well as the
single-output predictions.)</p>
<div class="ydp59a7a7a7yahoo-style-wrap" style="font-family:
Helvetica Neue,Helvetica,Arial,sans-serif; font-size: 13px;"
dir="ltr" data-setdir="false"><font size="3"
face="Calibri,sans-serif">For my knowledge’s
sake, could you
please inform me about the technique being employed now to
take advantage of
the correlations between targets? Is it the Mahalanobis
distance or some other
metric? In other words, could you please give me a hint as to
the underlying
reason why the single output predictions differ from the
multioutput
predictions? I am curious to know as this would finally fully
quench my appetite
after nearly two years. I will have to retrace my steps and
get back to the good old Python ways (again). Thank you.</font></div>
<br>
</blockquote>
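<font size="3"><font face="Calibri,sans-serif">It
doesn't explicitly use the correlation. The splitting criterion
is the sum of the per-output splitting criteria, so each split
has to work for all targets at once rather than for one target
alone. That is why the multioutput predictions differ from the
single-output ones, and it means there's an implicit
regularization, as the tree structure is shared
between the targets.</font></font><br>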
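<br>
A minimal sketch of the comparison you describe, assuming synthetic random data (the sizes, seeds and variable names are made up for illustration). It fits one forest on all three targets and three forests on one target each, all with the same random_state:<br>
<pre>
```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.RandomState(0)
X = rng.normal(size=(200, 4))          # hypothetical training features
Y = rng.normal(size=(200, 3))          # three random target columns
X_new = rng.normal(size=(28, 4))       # unseen samples to predict on

# One forest for all targets: every tree's splits are shared by the 3 outputs
multi = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, Y)
pred_multi = multi.predict(X_new)      # shape (28, 3)

# Three independent forests, one per target, same random_state
pred_single = np.column_stack([
    RandomForestRegressor(n_estimators=20, random_state=0)
    .fit(X, Y[:, j])
    .predict(X_new)
    for j in range(Y.shape[1])
])

# Largest absolute gap between joint and per-target predictions
max_diff = np.abs(pred_multi - pred_single).max()
print(pred_multi.shape, pred_single.shape, max_diff)
```
</pre>
On data like this the two sets of predictions generally differ, because the joint forest chooses each split using all three targets at once while the separate forests each optimize for a single target.<br>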
</body>
</html>