Re: [scikit-learn] scikit-learn Digest, Vol 12, Issue 18
On 11 Mar 2017 22:32, <scikit-learn-request@python.org> wrote:
Send scikit-learn mailing list submissions to
	scikit-learn@python.org

To subscribe or unsubscribe via the World Wide Web, visit
	https://mail.python.org/mailman/listinfo/scikit-learn
or, via email, send a message with subject or body 'help' to
	scikit-learn-request@python.org

You can reach the person managing the list at
	scikit-learn-owner@python.org
When replying, please edit your Subject line so it is more specific than "Re: Contents of scikit-learn digest..."
Today's Topics:
   1. Label encoding for classifiers and soft targets (Javier López Peña)
   2. issue suggestion - decision trees - GSoC (Konstantinos Katrioplas)
----------------------------------------------------------------------
Message: 1
Date: Sat, 11 Mar 2017 13:04:57 +0000
From: Javier López Peña <jlopez@ende.cc>
To: scikit-learn@python.org
Subject: [scikit-learn] Label encoding for classifiers and soft targets
Message-ID: <542B0BDD-F329-4F26-9001-9F535426306C@ende.cc>
Content-Type: text/plain; charset=utf-8
Hi there!
I have recently been experimenting with model regularization through the use of soft targets, and I'd like to be able to play with that from sklearn.
The main idea is as follows: imagine I want to fit a (probabilistic) classifier with three possible targets: 0, 1, and 2.
If I pass my training set (X, y) to an sklearn classifier, the target vector y gets one-hot encoded, so that each target becomes an array: [1, 0, 0], [0, 1, 0], or [0, 0, 1].
What I would like is to be able to pass the targets directly in this encoded form and skip any further encoding. This would allow me, for instance, to pass soft targets such as [0.9, 0.05, 0.05] if I want to prevent my classifier from becoming too opinionated about its predicted probabilities.
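To make the idea concrete, here is a minimal sketch of how such soft targets could be built from integer labels with scikit-learn's LabelBinarizer; the toy labels and the smoothing value eps are just illustrative assumptions, not part of any existing sklearn API:

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer

# Hard integer targets for a three-class problem (example data)
y = np.array([0, 1, 2, 1, 0])

# One-hot encode: 0 -> [1, 0, 0], 1 -> [0, 1, 0], 2 -> [0, 0, 1]
onehot = LabelBinarizer().fit_transform(y)

# Soften the targets: the true class keeps 1 - eps of the probability mass
# and the rest is split evenly over the other classes, e.g. [0.9, 0.05, 0.05]
eps = 0.1  # smoothing strength, an arbitrary choice for this example
n_classes = onehot.shape[1]
soft_y = onehot * (1 - eps) + (1 - onehot) * eps / (n_classes - 1)
```

The question is then whether a classifier could accept a 2D float array like soft_y in place of y.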
Ideally I would like to do something like this:
```
clf = SomeClassifier(*parameters, encode_targets=False)
```
and then call
```
clf.fit(X, encoded_y)
```
Would it be simple to modify sklearn code to do this, or would it require a lot of tinkering such as modifying every single classifier under the sun?
Cheers, J
------------------------------
Message: 2
Date: Sat, 11 Mar 2017 15:29:30 +0200
From: Konstantinos Katrioplas <konst.katrioplas@gmail.com>
To: scikit-learn@python.org
Subject: [scikit-learn] issue suggestion - decision trees - GSoC
Message-ID: <33a3a5bf-37dd-1cad-c4ae-ef4b62294a8c@gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Hello all,
While I am waiting for the PR I submitted to be reviewed (https://github.com/scikit-learn/scikit-learn/pull/8563), could you suggest another (easy) issue for me to work on? Ideally something for which I would write some substantial code, so that I can present it in my GSoC application.
Is anyone interested in mentoring me on the parallelization of decision trees? I admit I am not yet really familiar with the current tree code (although I have used the method for regression on a research project), but I am very much intrigued by the idea and willing to learn all about it between now and the summer.
Regards, Konstantinos
------------------------------
Subject: Digest Footer
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
------------------------------
End of scikit-learn Digest, Vol 12, Issue 18
********************************************