[scikit-learn] GSoC 2017 : "Parallel Decision Tree Building"

Aman Pratik amanpratik10 at gmail.com
Wed Mar 22 03:55:45 EDT 2017


Hello Developers,

This is Aman Pratik. I am currently pursuing my B.Tech at the Indian
Institute of Technology, Varanasi. I have done some research and found
some material on decision trees and parallelization, so I would like to
propose my first draft for the project "Parallel Decision Tree Building"
for GSoC 2017.

Proposal : First Draft
<https://github.com/amanp10/scikit-learn/wiki/GSoC-2017-:-Parallel-Decision-Tree-Building>

Why me?

I have been working in Python for the past 2 years and have a good
understanding of machine learning algorithms. I am quite familiar with
scikit-learn, both as a user and as a developer.

These are the issues/PRs I have worked on, or am still working on, over the
past few months.

[MRG+1] Issue#5803 : Regression Test added #8112
<https://github.com/scikit-learn/scikit-learn/pull/8112>

[MRG] Issue#6673:Make a wrapper around functions that score an individual
feature #8038 <https://github.com/scikit-learn/scikit-learn/pull/8038>

[MRG] Issue #7987: Embarrassingly parallel "n_restarts_optimizer" in
GaussianProcessRegressor #7997
<https://github.com/scikit-learn/scikit-learn/pull/7997>

My GitHub Profile: amanp10 <https://www.github.com/amanp10>

I have worked with parallelization in one of my PRs, so I am not new to it.
I have used Cython a couple of times, though only as a beginner. I have not
used decision trees much, but I am familiar with the theory and the
algorithm. I am also familiar with benchmark tests, unit tests and the other
technical skills I would need for this project.

Meanwhile, I have started studying the subject and am gaining experience
with Cython. I look forward to guidance from potential mentors or
anyone willing to help.

Thank you