GSoC, 2017 - Parallel Decision Tree Building
Hello everyone,
I am a pre-final year student studying Electronics & Communication Engineering at IIT Guwahati. I am a member of Prof. Amit Sethi <http://www.iitg.ernet.in/amitsethi/teaching.html>'s research group, where I work on cancer recurrence prediction using deep learning, and I have also started working with Prof. Ashish Anand on using NLP for genome sequencing. I want to contribute to scikit-learn by working on the project 'Parallel Decision Tree Building' for GSoC 2017. I have been contributing to scikit-learn for the past few weeks, working on issues across different modules. Although I am familiar with tree building algorithms, I have not worked much on the tree module of scikit-learn, so I am trying to familiarize myself with it by working on these issues:
https://github.com/scikit-learn/scikit-learn/issues/4225
https://github.com/scikit-learn/scikit-learn/issues/6557
Please let me know what next steps I should follow to build a good proposal.
Thank you, Aman Dalmia, Pre-final year student, Electronics & Communication Engineering, IIT Guwahati, +91-8011492025
Hi Aman,
I responded to your other email, but I'm not sure if it actually went through. Thanks for your interest in the project, and for your current PRs.
If you're looking to apply, you should write a gist which follows the format that nelson-liu used here: https://github.com/scikit-learn/scikit-learn/wiki/GSoC-2016-Proposal:-Addition-of-various-enhancements-to-the-tree-module-by-completing-stalled-pull-requests
The goal of this project is to parallelize the building of single decision trees, likely by parallelizing the task of finding the optimal split at each node.
You should put as much detail as possible into this proposal. As Gael mentioned in the other thread, the limiting factor for GSoC this year is mentor time, and the most successful students will be those who can operate independently. A detailed proposal outlining exactly what needs to be done will go a long way in showing us that you understand the problem and the codebase well enough to set achievable goals for the summer.
In addition, we want to ensure that you have the requisite background in Python, Cython, parallel processing, and tree building required for the project, so you should emphasize those skills and any previous work you've done that uses them.
Let me know if you have any further questions, and I look forward to seeing your proposal!
Jacob
On Tue, Feb 28, 2017 at 5:06 AM, Aman Dalmia <amandalmia18@gmail.com> wrote:
Hello everyone,
I am a pre-final year student studying Electronics & Communication Engineering at IIT Guwahati. I am a member of Prof. Amit Sethi <http://www.iitg.ernet.in/amitsethi/teaching.html>'s research group, where I work on cancer recurrence prediction using deep learning, and I have also started working with Prof. Ashish Anand on using NLP for genome sequencing. I want to contribute to scikit-learn by working on the project 'Parallel Decision Tree Building' for GSoC 2017. I have been contributing to scikit-learn for the past few weeks, working on issues across different modules. Although I am familiar with tree building algorithms, I have not worked much on the tree module of scikit-learn, so I am trying to familiarize myself with it by working on these issues:
https://github.com/scikit-learn/scikit-learn/issues/4225 https://github.com/scikit-learn/scikit-learn/issues/6557
Please let me know as to what should be the next steps that I need to follow for building a good proposal.
Thank you, Aman Dalmia, Pre-final year student, Electronics & Communication Engineering, IIT Guwahati, +91-8011492025
Participants (2): Aman Dalmia, Jacob Schreiber