<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p>Hi Andreas,</p>
<br>
<div class="moz-cite-prefix">On Tuesday 23 January 2018 09:12 PM,
Gaurav Dhingra wrote:<br>
</div>
<blockquote type="cite"
cite="mid:4be189c1-5e9b-8421-b12d-8648833ce921@gmail.com">
<p><br>
</p>
<div class="moz-forward-container"><br>
<br>
-------- Forwarded Message --------
<table class="moz-email-headers-table" cellspacing="0"
cellpadding="0" border="0">
<tbody>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">Subject:
</th>
<td>Re: [scikit-learn] Topic for thesis work on scikit
learn</td>
</tr>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">Date:
</th>
<td>Tue, 23 Jan 2018 10:16:36 -0500</td>
</tr>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">From:
</th>
<td>Andreas Mueller <a class="moz-txt-link-rfc2396E"
href="mailto:t3kcit@gmail.com" moz-do-not-send="true">&lt;t3kcit@gmail.com&gt;</a></td>
</tr>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">To: </th>
<td>Gaurav Dhingra <a class="moz-txt-link-rfc2396E"
href="mailto:gauravdhingra.gxyd@gmail.com"
moz-do-not-send="true">&lt;gauravdhingra.gxyd@gmail.com&gt;</a></td>
</tr>
</tbody>
</table>
<br>
<br>
<p>Hi Gaurav.</p>
<p>Is your mentor experienced in contributing to sklearn?</p>
</div>
</blockquote>
<br>
No, she isn't.<br>
<br>
<blockquote type="cite"
cite="mid:4be189c1-5e9b-8421-b12d-8648833ce921@gmail.com">
<div class="moz-forward-container">
<p>Will they be able to review your code to the scikit-learn
standards?<br>
</p>
</div>
</blockquote>
<br>
No.<br>
<br>
<blockquote type="cite"
cite="mid:4be189c1-5e9b-8421-b12d-8648833ce921@gmail.com">
<div class="moz-forward-container">
<p> </p>
<p>Have you worked on any other pull requests so far?</p>
</div>
</blockquote>
<br>
I've worked on a few. Please have a look at <a moz-do-not-send="true"
href="https://github.com/scikit-learn/scikit-learn/pulls/gxyd">https://github.com/scikit-learn/scikit-learn/pulls/gxyd</a>;
in fact, I expect that 3 of the open PRs will be merged soon.<br>
<br>
<blockquote type="cite"
cite="mid:4be189c1-5e9b-8421-b12d-8648833ce921@gmail.com">
<div class="moz-forward-container">
<p>Getting anything into scikit-learn without close
collaboration with the community is quite tricky.</p>
</div>
</blockquote>
<blockquote type="cite"
cite="mid:4be189c1-5e9b-8421-b12d-8648833ce921@gmail.com">
<div class="moz-forward-container">
<p>Having a faster K-means implementation based on recent
research in the area would be interesting.<br>
There's also interest in adding Robust PCA, probabilistic
inference trees, and improving the Latent Dirichlet
Allocation code.</p>
</div>
</blockquote>
<br>
I tried to look into what the <i>scikit-learn community</i><i>/</i><i>devs</i>
consider a priority for the code base (instead of
looking only for topics I like). When I looked, I thought of <a
moz-do-not-send="true"
href="https://github.com/scikit-learn/scikit-learn/issues/8337">https://github.com/scikit-learn/scikit-learn/issues/8337</a>
or <a moz-do-not-send="true"
href="https://github.com/scikit-learn/scikit-learn/issues/6557">https://github.com/scikit-learn/scikit-learn/issues/6557</a>
as possible topics. But since I'm aware that your availability
can be an issue (you are busy with teaching), I
simultaneously looked for other options. I had a conversation with
Joel (he was kind enough to PM me); this is what he said (only the
important part of the conversation):<br>
<br>
| Tricky things we’ve been trying to do for years:<br>
| * estimator tags<br>
| * sample props<br>
| tools for optimising cluster parameters (e.g. #6948)<br>
| sample props == #4497 and associated<br>
| related to clusterer parameters, #6160<br>
| estimator tags relates to #6715<br>
| #6777 looks tricky from an ML perspective.<br>
<br>
I'm thinking of choosing <a moz-do-not-send="true"
href="https://github.com/scikit-learn/scikit-learn/pull/6948">https://github.com/scikit-learn/scikit-learn/pull/6948</a>
(ENH optimal n_clusters value), i.e. completing that PR. If you are
available to review my PRs (if I do open them), then
I'd be glad to work with you on either <i>Conditional inference trees
</i>or <i>adding post-pruning for decision trees</i>. <br>
<br>
I'm aware that, as Joel put it earlier, <i>Andreas has escaped into the
teaching world</i>. Anyway, I don't expect my guide to give me
feedback on scikit-learn code, though she will definitely be able to
answer my theoretical questions. Also, since we
can have a co-guide (apart from the local guide), I would
definitely consider that an option for someone from scikit-learn,
whether that is you or maybe Joel. But Joel is also expected to get
back to the academic world.<br>
<br>
If things don't work out (if neither you, Joel, nor
anyone else from the scikit-learn community is available), it will
take me a little longer, but I'll probably get there eventually.<br>
<br>
<blockquote type="cite"
cite="mid:4be189c1-5e9b-8421-b12d-8648833ce921@gmail.com">
<div class="moz-forward-container">
<p>You can find issues on any of these in the issue tracker,
which also has many more feature requests.</p>
<p>Andy<br>
</p>
<br>
<div class="moz-cite-prefix">On 12/31/2017 05:46 AM, Gaurav
Dhingra wrote:<br>
</div>
<blockquote type="cite"
cite="mid:3769babd-294d-b3cb-6ae5-3552e5f6db1e@gmail.com">
<p>Hi Andreas,</p>
<p>I think I'll get access to a local mentor from my college,
so I think that issue is ruled out. For the technical side,
though, I would still <i>like</i> to rely more on feedback
from the scikit-learn community, since my aim isn't to
make something for my own use but rather something
useful to the scikit-learn community, so that
it eventually gets merged into master.</p>
<p>I'm currently looking for a topic that I can take up. I tried
looking into the scikit-learn wiki, but it doesn't mention
what I'm looking for (no topic is listed). Do you have
some topic in mind that could be a useful addition to
scikit-learn? Even if you could direct me to appropriate
links, I would be happy to look into those.<br>
</p>
<br>
<div class="moz-cite-prefix">On Wednesday 01 November 2017
01:43 AM, Andreas Mueller wrote:<br>
</div>
<blockquote type="cite"
cite="mid:9641a578-194f-183c-fa2c-22cb45a7c76d@gmail.com">Hi
Gaurav. <br>
<br>
Do you have a local mentor? I think having a mentor that can
guide you during a thesis is very important. <br>
You could get some feedback from the community for a
contribution, but that can be slow, <br>
and is entirely on a volunteer basis, so there is no guarantee
that you'll get the necessary feedback in time <br>
to finish your thesis. <br>
<br>
Mentoring a thesis - in particular without knowing you - is
a serious commitment, so I'm not sure someone <br>
from inside the project will want to do this. I saw you
already made a contribution in <a
class="moz-txt-link-freetext"
href="https://github.com/scikit-learn/scikit-learn/pull/10005"
moz-do-not-send="true">https://github.com/scikit-learn/scikit-learn/pull/10005</a>
<br>
but that's a very different scope than doing what I expect
would be several months of work.</blockquote>
<br>
In this regard, I've made a few more contributions; here
is the link: <a moz-do-not-send="true"
href="https://github.com/scikit-learn/scikit-learn/pulls/gxyd">https://github.com/scikit-learn/scikit-learn/pulls/gxyd</a>.
I know none of them is a big contribution. If you think
I should work on a bigger PR, could you please suggest
an issue for that?<br>
<br>
Thanks.<br>
<br>
<blockquote type="cite"
cite="mid:9641a578-194f-183c-fa2c-22cb45a7c76d@gmail.com"> <br>
<br>
Best, <br>
Andy <br>
<br>
On 10/31/2017 03:31 PM, Gaurav Dhingra wrote: <br>
<blockquote type="cite">Hi everyone, <br>
<br>
I am a final year (5th year) undergraduate Applied
Mathematics student in India. I am thinking of doing my
final year thesis by doing some work (the coding part) on
scikit-learn, so I was wondering if anyone could tell me whether
there are available topics (not necessarily the names of those
topics) that I could work on as an undergraduate
student. I would like to expand upon this in December, when
my exams are over. In the meantime, I would like to
take a step in that direction by finding out whether there
will be available topics that I could work on. <br>
<br>
It could be that the available topics are not easy
for an undergraduate; even in that case, I would like to
do some research on the topics first. <br>
<br>
</blockquote>
<br>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
Gaurav Dhingra
(sent from Thunderbird email client)
</pre>
</blockquote>
<br>
</div>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
Gaurav Dhingra
(sent from Thunderbird email client)
</pre>
</body>
</html>