<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p>Hi Joel,</p>
<p>Thanks a lot for the answer.<br>
</p>
<p>"Each train/test split in cross_val_score holds out test data.
GridSearchCV then splits each train set into (inner-)train and
validation sets."</p>
<p>I know this is what nested CV is supposed to do, but the code
does an excellent job of obscuring it. I'll try to add some
clarifying comments later today.</p>
<p>Cheers,</p>
<p>d<br>
</p>
<p><br>
</p>
<div class="moz-cite-prefix">On 29/11/16 00:07, Joel Nothman wrote:<br>
</div>
<blockquote
cite="mid:CAAkaFLXyUQ1qRwzQR5OZz-=yv3-=VwcS5MZOxFRt1s=WUT78XQ@mail.gmail.com"
type="cite">
<div dir="ltr">If that clarifies, please offer changes to the
example (as a pull request) that make this clearer.</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On 29 November 2016 at 11:06, Joel
Nothman <span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:joel.nothman@gmail.com" target="_blank">joel.nothman@gmail.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">Briefly:
<div><br>
</div>
<div>
<pre style="padding:5px 10px;font-family:monaco,menlo,consolas,&quot;courier new&quot;,monospace;font-size:13px;border-radius:4px;margin-top:0.1em;margin-bottom:0.5em;line-height:1.2em;word-break:break-all;word-wrap:break-word;white-space:pre-wrap;background-color:rgb(248,248,248);border:1px solid rgb(221,221,221);overflow-x:auto;overflow-y:hidden"><span class="m_-1140303242554994897gmail-n">clf</span> <span class="m_-1140303242554994897gmail-o" style="color:rgb(102,102,102)">=</span> <a moz-do-not-send="true" href="http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html#sklearn.model_selection.GridSearchCV" style="color:rgb(40,120,162);word-wrap:break-word" target="_blank"><span class="m_-1140303242554994897gmail-n">GridSearchCV</span></a><span class="m_-1140303242554994897gmail-p">(</span><span class="m_-1140303242554994897gmail-n">estimator</span><span class="m_-1140303242554994897gmail-o" style="color:rgb(102,102,102)">=</span><span class="m_-1140303242554994897gmail-n">svr</span><span class="m_-1140303242554994897gmail-p">,</span> <span class="m_-1140303242554994897gmail-n">param_grid</span><span class="m_-1140303242554994897gmail-o" style="color:rgb(102,102,102)">=</span><span class="m_-1140303242554994897gmail-n">p_grid</span><span class="m_-1140303242554994897gmail-p">,</span> <span class="m_-1140303242554994897gmail-n">cv</span><span class="m_-1140303242554994897gmail-o" style="color:rgb(102,102,102)">=</span><span class="m_-1140303242554994897gmail-n">inner_cv</span><span class="m_-1140303242554994897gmail-p">)</span>
<span class="m_-1140303242554994897gmail-n">nested_score</span> <span class="m_-1140303242554994897gmail-o" style="color:rgb(102,102,102)">=</span> <a moz-do-not-send="true" href="http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.cross_val_score.html#sklearn.model_selection.cross_val_score" style="color:rgb(40,120,162);word-wrap:break-word" target="_blank"><span class="m_-1140303242554994897gmail-n">cross_val_score</span></a><span class="m_-1140303242554994897gmail-p">(</span><span class="m_-1140303242554994897gmail-n">clf</span><span class="m_-1140303242554994897gmail-p">,</span> <span class="m_-1140303242554994897gmail-n">X</span><span class="m_-1140303242554994897gmail-o" style="color:rgb(102,102,102)">=</span><span class="m_-1140303242554994897gmail-n">X_iris</span><span class="m_-1140303242554994897gmail-p">,</span> <span class="m_-1140303242554994897gmail-n">y</span><span class="m_-1140303242554994897gmail-o" style="color:rgb(102,102,102)">=</span><span class="m_-1140303242554994897gmail-n">y_iris</span><span class="m_-1140303242554994897gmail-p">,</span> <span class="m_-1140303242554994897gmail-n">cv</span><span class="m_-1140303242554994897gmail-o" style="color:rgb(102,102,102)">=</span><span class="m_-1140303242554994897gmail-n">outer_cv</span><span class="m_-1140303242554994897gmail-p">)</span></pre>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">Each train/test split in
cross_val_score holds out test data. GridSearchCV then
splits each train set into (inner-)train and
validation sets. There is no leakage of test set
knowledge from the outer loop into the grid search
optimisation, and no leakage of validation set knowledge
into the SVR optimisation. Each outer test fold is
reused as training data in the other outer splits, but
within its own split it is used only to measure
generalisation error.</div>
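<div class="gmail_quote"><br>
As a rough sketch (not the example's actual code), the two lines
above should behave much like the explicit loop below. The KFold
settings, the random_state values, the p_grid entries and the name
manual_scores are assumptions made here for illustration only.</div>
<pre>
```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

X_iris, y_iris = load_iris(return_X_y=True)

# Assumed settings, chosen only for illustration.
p_grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1]}
inner_cv = KFold(n_splits=4, shuffle=True, random_state=0)
outer_cv = KFold(n_splits=4, shuffle=True, random_state=0)
svr = SVC(kernel="rbf")

manual_scores = []  # hypothetical name, not part of the example
for train_idx, test_idx in outer_cv.split(X_iris, y_iris):
    # The outer test fold is set aside before any tuning happens.
    assert set(train_idx).isdisjoint(test_idx)
    X_train, y_train = X_iris[train_idx], y_iris[train_idx]
    X_test, y_test = X_iris[test_idx], y_iris[test_idx]

    # The grid search only ever sees the outer training fold; inner_cv
    # splits it again into (inner-)train and validation sets.
    clf = GridSearchCV(estimator=svr, param_grid=p_grid, cv=inner_cv)
    clf.fit(X_train, y_train)

    # The held-out fold is used once, to score the refitted best model.
    manual_scores.append(clf.score(X_test, y_test))

manual_scores = np.array(manual_scores)

# The compact form from the example, for comparison.
nested_score = cross_val_score(
    GridSearchCV(estimator=svr, param_grid=p_grid, cv=inner_cv),
    X=X_iris, y=y_iris, cv=outer_cv,
)
```
</pre>
<div class="gmail_quote">With fixed random_state values both
versions iterate over the same outer folds, so the per-fold scores
can be compared directly.</div>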
<div class="gmail_quote"><br>
Is that clear?</div>
<div class="gmail_quote"><br>
</div>
<div class="gmail_quote">
<div>
<div class="h5">On 29 November 2016 at 10:30, Daniel
Homola <span dir="ltr">&lt;<a
moz-do-not-send="true"
href="mailto:dani.homola@gmail.com"
target="_blank">dani.homola@gmail.com</a>&gt;</span>
wrote:<br>
</div>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>
<div class="h5">
<div dir="ltr">
<div
class="m_-1140303242554994897m_-6276298983508018080gmail_signature">
<div>
<p
style="margin-top:0px;margin-bottom:0px;color:rgb(0,0,0);font-family:calibri,arial,helvetica,sans-serif;font-size:16px">Dear
all,</p>
<p
style="margin-top:0px;margin-bottom:0px;color:rgb(0,0,0);font-family:calibri,arial,helvetica,sans-serif;font-size:16px"><br>
</p>
<p
style="margin-top:0px;margin-bottom:0px;color:rgb(0,0,0);font-family:calibri,arial,helvetica,sans-serif;font-size:16px">I
was wondering if the following example
code is valid:</p>
<p
style="margin-top:0px;margin-bottom:0px;color:rgb(0,0,0);font-family:calibri,arial,helvetica,sans-serif;font-size:16px"><a
moz-do-not-send="true"
href="http://scikit-learn.org/stable/auto_examples/model_selection/plot_nested_cross_validation_iris.html"
class="m_-1140303242554994897m_-6276298983508018080gmail-m_8935104698084405240gmail-x_OWAAutoLink"
id="m_-1140303242554994897m_-6276298983508018080gmail-m_8935104698084405240gmail-LPlnk436896"
target="_blank">http://scikit-learn.org/stable<wbr>/auto_examples/model_selection<wbr>/plot_nested_cross_validation_<wbr>iris.html</a><br>
<br>
</p>
<p
style="margin-top:0px;margin-bottom:0px;color:rgb(0,0,0);font-family:calibri,arial,helvetica,sans-serif;font-size:16px">My
understanding is that the point of
nested cross-validation is to prevent
any data leakage from the
inner grid-search/parameter-optimization CV
loop into the outer model-evaluation CV
loop. This could be achieved if the
outer CV loop's test data are kept completely
separate from the inner CV loop, as
shown here:</p>
<p
style="margin-top:0px;margin-bottom:0px;color:rgb(0,0,0);font-family:calibri,arial,helvetica,sans-serif;font-size:16px"><a
moz-do-not-send="true"
href="https://mlr-org.github.io/mlr-tutorial/release/html/img/nested_resampling.png"
class="m_-1140303242554994897m_-6276298983508018080gmail-m_8935104698084405240gmail-x_OWAAutoLink"
id="m_-1140303242554994897m_-6276298983508018080gmail-m_8935104698084405240gmail-LPlnk683151"
target="_blank">https://mlr-org.github.io/mlr-<wbr>tutorial/release/html/img/nest<wbr>ed_resampling.png</a></p>
<p
style="margin-top:0px;margin-bottom:0px;color:rgb(0,0,0);font-family:calibri,arial,helvetica,sans-serif;font-size:16px"><span
style="font-family:calibri,arial,helvetica,sans-serif,&quot;apple color emoji&quot;,&quot;segoe ui emoji&quot;,notocoloremoji,&quot;segoe ui symbol&quot;,&quot;android emoji&quot;,emojisymbols"><br>
</span></p>
<p
style="margin-top:0px;margin-bottom:0px;color:rgb(0,0,0);font-family:calibri,arial,helvetica,sans-serif;font-size:16px"><span
style="font-family:calibri,arial,helvetica,sans-serif,&quot;apple color emoji&quot;,&quot;segoe ui emoji&quot;,notocoloremoji,&quot;segoe ui symbol&quot;,&quot;android emoji&quot;,emojisymbols">The code in
the above example, however, doesn't seem
to achieve this in any way.</span><br>
</p>
<p
style="margin-top:0px;margin-bottom:0px;color:rgb(0,0,0);font-family:calibri,arial,helvetica,sans-serif;font-size:16px"><br>
</p>
<p
style="margin-top:0px;margin-bottom:0px;color:rgb(0,0,0);font-family:calibri,arial,helvetica,sans-serif;font-size:16px">Am
I missing something here? </p>
<p
style="margin-top:0px;margin-bottom:0px;color:rgb(0,0,0);font-family:calibri,arial,helvetica,sans-serif;font-size:16px"><br>
</p>
<p
style="margin-top:0px;margin-bottom:0px;color:rgb(0,0,0);font-family:calibri,arial,helvetica,sans-serif;font-size:16px">Thanks
a lot,</p>
<p
style="margin-top:0px;margin-bottom:0px;color:rgb(0,0,0);font-family:calibri,arial,helvetica,sans-serif;font-size:16px">dh</p>
</div>
</div>
</div>
<br>
</div>
</div>
______________________________<wbr>_________________<br>
scikit-learn mailing list<br>
<a moz-do-not-send="true"
href="mailto:scikit-learn@python.org"
target="_blank">scikit-learn@python.org</a><br>
<a moz-do-not-send="true"
href="https://mail.python.org/mailman/listinfo/scikit-learn"
rel="noreferrer" target="_blank">https://mail.python.org/mailma<wbr>n/listinfo/scikit-learn</a><br>
<br>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
</div>
<br>
</div>
<br>
</blockquote>
<br>
</body>
</html>