<div dir="auto"></div><div class="gmail_extra"><br><div class="gmail_quote">On 11 Mar 2017 22:32,  <<a href="mailto:scikit-learn-request@python.org">scikit-learn-request@python.org</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Send scikit-learn mailing list submissions to<br>

        <a href="mailto:scikit-learn@python.org">scikit-learn@python.org</a><br>

<br>

To subscribe or unsubscribe via the World Wide Web, visit<br>

        <a href="https://mail.python.org/mailman/listinfo/scikit-learn" rel="noreferrer" target="_blank">https://mail.python.org/<wbr>mailman/listinfo/scikit-learn</a><br>

or, via email, send a message with subject or body 'help' to<br>

        <a href="mailto:scikit-learn-request@python.org">scikit-learn-request@python.<wbr>org</a><br>

<br>

You can reach the person managing the list at<br>

        <a href="mailto:scikit-learn-owner@python.org">scikit-learn-owner@python.org</a><br>

<br>

When replying, please edit your Subject line so it is more specific<br>

than "Re: Contents of scikit-learn digest..."<br>

<br>

<br>

Today's Topics:<br>

<br>

   1. Label encoding for classifiers and soft targets<br>

      (Javier López Peña)<br>

   2. issue suggestion - decision trees - GSoC (Konstantinos Katrioplas)<br>

<br>

<br>

------------------------------<wbr>------------------------------<wbr>----------<br>

<br>

Message: 1<br>

Date: Sat, 11 Mar 2017 13:04:57 +0000<br>

From: Javier López Peña <jlopez@ende.cc><br>

To: <a href="mailto:scikit-learn@python.org">scikit-learn@python.org</a><br>

Subject: [scikit-learn] Label encoding for classifiers and soft<br>

        targets<br>

Message-ID: <542B0BDD-F329-4F26-9001-<wbr>9F535426306C@ende.cc><br>

Content-Type: text/plain; charset=utf-8<br>

<br>

Hi there!<br>

<br>

I have recently been experimenting with model regularization through the use of soft targets,<br>

and I'd like to be able to play with that from sklearn.<br>

<br>

The main idea is as follows: imagine I want to fit a (probabilistic) classifier with three possible<br>

targets: 0, 1, and 2.<br>

<br>

If I pass my training set (X, y) to a sklearn classifier, the target vector y gets encoded so that<br>

each target becomes an array, [1, 0, 0], [0, 1, 0], or [0, 0, 1]<br>

<br>

What I would like is to pass the targets directly in the encoded form and avoid<br>

any further encoding. This would allow me, for instance, to pass targets such as [0.9, 0.05, 0.05] if I want to prevent<br>

my classifier from becoming too opinionated in its predicted probabilities.<br>
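<br>
As a minimal sketch of the kind of targets I mean (plain Python, no sklearn; the helper names one_hot and smooth_targets and the eps value are made up for illustration), hard labels can be one-hot encoded and then softened so that each row still sums to 1:<br>
<pre>
```python
def one_hot(labels, n_classes):
    """Encode integer labels as one-hot rows."""
    return [[1.0 if k == label else 0.0 for k in range(n_classes)]
            for label in labels]


def smooth_targets(rows, eps=0.1):
    """Move eps of the probability mass from the true class to the others.

    A one-hot row [1, 0, 0] becomes [1 - eps, eps / (K - 1), eps / (K - 1)],
    where K is the number of classes.
    """
    n_classes = len(rows[0])
    off = eps / (n_classes - 1)
    return [[(1.0 - eps) * p + off * (1.0 - p) for p in row] for row in rows]


y = [0, 1, 2]
encoded_y = one_hot(y, n_classes=3)           # hard, one-hot targets
soft_y = smooth_targets(encoded_y, eps=0.1)   # softened targets
```
</pre>
Here eps controls how much confidence is taken away from the true class; eps=0 gives back the plain one-hot encoding.<br>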

<br>

Ideally I would like to do something like this:<br>

```<br>

clf = SomeClassifier(*parameters, encode_targets=False)<br>

```<br>

<br>

and then call<br>

```<br>

clf.fit(X, encoded_y)<br>

```<br>

<br>

Would it be simple to modify sklearn code to do this, or would it require a lot of tinkering<br>

such as modifying every single classifier under the sun?<br>
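<br>
One workaround that might avoid touching the classifiers at all, assuming the estimator's fit method accepts a sample_weight argument (many do, but not all): repeat every sample once per class and weight each copy by its soft target. The helper name expand_soft_targets is hypothetical, not part of sklearn:<br>
<pre>
```python
def expand_soft_targets(X, soft_y):
    """Turn soft targets into repeated samples with hard labels and weights.

    Each sample is repeated once per class with nonzero probability,
    and that probability becomes the weight of the copy.
    """
    X_rep, y_rep, weights = [], [], []
    for row, probs in zip(X, soft_y):
        for label, p in enumerate(probs):
            if p == 0.0:
                continue  # zero-weight copies contribute nothing
            X_rep.append(row)
            y_rep.append(label)
            weights.append(p)
    return X_rep, y_rep, weights


X = [[0.0, 1.0], [1.0, 0.0]]
soft_y = [[0.9, 0.05, 0.05], [0.0, 0.5, 0.5]]
X_rep, y_rep, w = expand_soft_targets(X, soft_y)
# Assumed usage with a classifier whose fit supports sample weights:
# clf.fit(X_rep, y_rep, sample_weight=w)
```
</pre>
For a model trained with log loss, the weighted objective matches the cross-entropy against the soft targets; other classifiers may treat sample_weight differently, so this is only a sketch.<br>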

<br>

Cheers,<br>

J<br>

<br>

------------------------------<br>

<br>

Message: 2<br>

Date: Sat, 11 Mar 2017 15:29:30 +0200<br>

From: Konstantinos Katrioplas <<a href="mailto:konst.katrioplas@gmail.com">konst.katrioplas@gmail.com</a>><br>

To: <a href="mailto:scikit-learn@python.org">scikit-learn@python.org</a><br>

Subject: [scikit-learn] issue suggestion - decision trees - GSoC<br>

Message-ID: <<a href="mailto:33a3a5bf-37dd-1cad-c4ae-ef4b62294a8c@gmail.com">33a3a5bf-37dd-1cad-c4ae-<wbr>ef4b62294a8c@gmail.com</a>><br>

Content-Type: text/plain; charset=utf-8; format=flowed<br>

<br>

Hello all,<br>

<br>

While I am waiting for the PR that I have submitted to be evaluated<br>

(<a href="https://github.com/scikit-learn/scikit-learn/pull/8563" rel="noreferrer" target="_blank">https://github.com/scikit-<wbr>learn/scikit-learn/pull/8563</a>), would you<br>

suggest another (easy) issue for me to work on? Ideally something for<br>

which I would write some substantial code, so that I can present it in my<br>

application for GSoC.<br>

<br>

Would anyone be interested in mentoring me on the parallelization of decision<br>

trees? I admit I am not yet really familiar with the current tree code<br>

(although I have been using the method for regression on a research<br>

project), but I am very much intrigued by the idea and willing to learn<br>

all about it before the summer.<br>

<br>

Regards,<br>

Konstantinos<br>

<br>

<br>

------------------------------<br>

<br>

End of scikit-learn Digest, Vol 12, Issue 18<br>

******************************<wbr>**************<br>

</blockquote></div></div>