<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On 12 Mar 2017, at 18:38, Gael Varoquaux <<a href="mailto:gael.varoquaux@normalesup.org" class="">gael.varoquaux@normalesup.org</a>> wrote:</div><div class=""><div class=""><br class="">You can use sample weights to go a bit in this direction. But in general,<br class="">the mathematical meaning of your intuitions will depend on the<br class="">classifier, so they will not be general ways of implementing them without<br class="">a lot of tinkering.<br class=""></div></div></blockquote></div><br class=""><div class="">I see… To be honest, for my purposes it would be enough to bypass the target binarization in</div><div class="">the MLP classifier, so maybe I will just fork my own copy of that class.</div><div class=""><br class=""></div><div class="">The purpose is two-fold. The first is to use the probabilities generated by a very complex</div><div class="">model (e.g. a massive ensemble) to train a simpler one that achieves comparable performance at a</div><div class="">fraction of the cost. Any universal classifier will do (neural networks are the prime example).</div><div class=""><br class=""></div><div class="">The second purpose is to use class probabilities instead of observed classes at training time.</div><div class="">In some problems this helps with model regularization (see section 6 of [1]).</div><div class=""><br class=""></div><div class="">Cheers,</div><div class="">J</div><div class=""><br class=""></div><div class="">[1] <a href="https://arxiv.org/pdf/1503.02531v1.pdf" class="">https://arxiv.org/pdf/1503.02531v1.pdf</a></div><div class=""><br class=""></div><div class=""><br class=""></div></body></html>