Is that Jet?!

https://www.youtube.com/watch?v=xAoljeRJ3lU

;)
<div class="moz-cite-prefix">On 6/4/18 11:56 AM, Brown J.B. via
scikit-learn wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAJe_vxBppv_1mbU6BTYiA0kJH+GYVucMq1-Kibwpb2uNt6q+Ew@mail.gmail.com">
<div dir="ltr">
<div>Hello community,</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex"><span class="gmail-">
<blockquote class="gmail_quote" style="margin:0px 0px
0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
I wonder if there's something similar for the binary
class case where,<br>
the prediction is a real value (activation) and from
this we can also<br>
derive<br>
- CMs for all prediction cutoff (or set of cutoffs?)<br>
- scores over all cutoffs (AUC, AP, ...)<br>
</blockquote>
</span>
>>
>> AUC and AP are by definition computed over all cut-offs. And CMs for
>> all cutoffs don't seem like a good idea, because in the general case
>> there will be as many of them as there are samples (n_samples). If
>> you want to specify a set of cutoffs, that would be pretty easy to
>> do. How do you find these cut-offs, though?
<blockquote class="gmail_quote" style="margin:0px 0px
0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<br>
For me, in analyzing (binary class) performance,
reporting scores for<br>
a single cutoff is less useful than seeing how the
many scores (tpr,<br>
ppv, mcc, relative risk, chi^2, ...) vary at various
false positive<br>
rates, or prediction quantiles.<br>
</blockquote>
</span></blockquote>
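>>
>> Sweeping the entire threshold range for such a table is also only a
>> few lines with roc_curve (again an untested sketch on toy data; the
>> relative-risk and chi^2 columns are left out here):
>>
>> import numpy as np
>> from sklearn.metrics import matthews_corrcoef, roc_curve
>>
>> rng = np.random.RandomState(0)
>> y_true = rng.randint(0, 2, size=200)
>> y_score = 0.5 * y_true + rng.rand(200)
>>
>> fpr, tpr, thresholds = roc_curve(y_true, y_score)
>> n_pos, n_neg = y_true.sum(), (y_true == 0).sum()
>> for f, t, thr in zip(fpr, tpr, thresholds):
>>     tp, fp = t * n_pos, f * n_neg        # counts back from the rates
>>     ppv = tp / (tp + fp) if tp + fp > 0 else float("nan")
>>     mcc = matthews_corrcoef(y_true, (y_score >= thr).astype(int))
>>     print(f"fpr={f:.2f}  tpr={t:.2f}  ppv={ppv:.2f}  mcc={mcc:.2f}")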
>
> In terms of finding cut-offs, one could use the idea of metric
> surfaces that I recently proposed
> (https://onlinelibrary.wiley.com/doi/abs/10.1002/minf.201700127)
> and then plot your per-threshold TPR/TNR pairs on the PPV/MCC/etc.
> surfaces to determine which conditions you are willing to accept
> against the background of your prediction problem.
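>
> In case it helps to see the idea concretely, here is roughly how one
> such surface can be drawn and overlaid with a model's operating
> points (a quick sketch of my own, not the code from the paper; the
> prevalence p and the toy y_true/y_score are assumed values):
>
> import matplotlib.pyplot as plt
> import numpy as np
> from sklearn.metrics import roc_curve
>
> p = 0.3     # assumed prevalence of the positive class in your data
> tpr_g, tnr_g = np.meshgrid(np.linspace(0.01, 0.99, 99),
>                            np.linspace(0.01, 0.99, 99))
> # expected confusion-matrix fractions at prevalence p
> tp, fn = p * tpr_g, p * (1 - tpr_g)
> tn, fp = (1 - p) * tnr_g, (1 - p) * (1 - tnr_g)
> mcc = (tp * tn - fp * fn) / np.sqrt(
>     (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
>
> plt.contourf(tnr_g, tpr_g, mcc, levels=20)
> plt.colorbar(label="MCC")
>
> # overlay the model's per-threshold (TNR, TPR) pairs
> rng = np.random.RandomState(0)
> y_true = rng.randint(0, 2, size=200)
> y_score = 0.5 * y_true + rng.rand(200)
> fpr, tpr, _ = roc_curve(y_true, y_score)
> plt.plot(1 - fpr, tpr, "k.-")
> plt.xlabel("TNR"); plt.ylabel("TPR")
> plt.show()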
>
> I use these surfaces (a) to think about the prediction problem before
> any attempt at modeling is made, and (b) to deconstruct results such
> as "Accuracy = 85%" into interpretations in the context of my field
> and the data being predicted.
>
> Hope this contributes a bit of food for thought.
> J.B.
<pre class="moz-quote-pre" wrap="">_______________________________________________
scikit-learn mailing list
<a class="moz-txt-link-abbreviated" href="mailto:scikit-learn@python.org">scikit-learn@python.org</a>
<a class="moz-txt-link-freetext" href="https://mail.python.org/mailman/listinfo/scikit-learn">https://mail.python.org/mailman/listinfo/scikit-learn</a>
</pre>