<div dir="ltr">This is not a strongly-held suggestion - but what about adopting YellowBrick as the plotting API for sklearn? Not sure how exactly the interaction would work - could be PRs to their library, or ask them to integrate into sklearn, or do a lock-step dance with versions but maintain separate teams? (I know it raises more questions than answers, but wanted to put it out there.)</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Apr 3, 2019 at 4:07 PM Joel Nothman <<a href="mailto:joel.nothman@gmail.com" target="_blank">joel.nothman@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">With option 1, sklearn.plot is likely to import large chunks of the<br>
library (particularly, but not exclusively, if the plotting function<br>
"does the work" as Andy suggests). This is under the assumption that<br>
one plot function will want to import trees, another GPs, etc. Unless<br>
we move to lazy imports, that would be against the current convention<br>
that importing sklearn is fairly minimal.<br>
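<br>
To illustrate the lazy-import idea, here is a minimal sketch, not sklearn code. The function name plot_summary_sketch is hypothetical, and the stdlib statistics module stands in for a heavier submodule such as trees or GPs. The dependency is imported inside the function, so importing the plotting module stays cheap:<br>

```python
import importlib
import sys


def plot_summary_sketch(values):
    """Hypothetical plotting helper that defers its heavy import."""
    # Deferred import: the dependency is loaded when the function is called,
    # not when the module that defines this function is imported.
    # "statistics" is only a stand-in for a heavier estimator submodule.
    stats = importlib.import_module("statistics")
    return stats.mean(values)


if __name__ == "__main__":
    print(plot_summary_sketch([1, 2, 3]))
    # After the first call the dependency is loaded and cached.
    print("statistics" in sys.modules)
```

The trade-off is that a missing or broken dependency only shows up when the function is first called rather than at import time.<br>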
<br>
I do like Andy's idea of framing this discussion more clearly around<br>
likely candidates.<br>
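<br>
To make the "does the work" distinction concrete, here is a rough sketch using the confusion matrix as the candidate. All three function names are hypothetical, and a plain-text rendering stands in for a real matplotlib plot. The point is only that the convenience function wraps a separate "work" function, which could live in another module:<br>

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """The 'work' function: count (true, predicted) label pairs."""
    return Counter(zip(y_true, y_pred))


def render_confusion(counts, labels):
    """Plot-only function: takes precomputed counts, returns text rows."""
    return [" ".join(str(counts[(t, p)]) for p in labels) for t in labels]


def plot_confusion(y_true, y_pred):
    """Convenience function that 'does the work' and then renders."""
    labels = sorted(set(y_true) | set(y_pred))
    return render_confusion(confusion_counts(y_true, y_pred), labels)


if __name__ == "__main__":
    for row in plot_confusion([0, 1, 1, 0], [0, 1, 0, 0]):
        print(row)
```

For an expensive computation such as learning curves, only the render-style function would be exposed, and the caller would pass in results computed elsewhere.<br>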
<br>
On Thu, 4 Apr 2019 at 00:10, Andreas Mueller <<a href="mailto:t3kcit@gmail.com" target="_blank">t3kcit@gmail.com</a>> wrote:<br>
><br>
> I think what was not clear from the question is that there are actually<br>
> quite different kinds of plotting functions, and many of these are tied<br>
> to existing code.<br>
><br>
> Right now we have some that are specific to trees (plot_tree) and to<br>
> gradient boosting (plot_partial_dependence).<br>
><br>
> I think we want more general functions, and plot_partial_dependence has<br>
> been extended to general estimators.<br>
><br>
> However, the plotting functions might be generic wrt the estimator, but<br>
> relate to a specific function, say plotting results of GridSearchCV.<br>
> Then one might argue that having the plotting function close to<br>
> GridSearchCV might make sense.<br>
> Similarly for plotting partial dependence plots and feature importances,<br>
> it might be a bit strange to have the plotting functions not next to the<br>
> functions that compute these.<br>
> Another question is whether the plotting functions also "do the<br>
> work" in some cases:<br>
> Do we want plot_partial_dependence also to compute the partial<br>
> dependence? (I would argue yes but either way the result is a bit strange).<br>
> In that case you have somewhat of the same functionality in two<br>
> different modules, unless you also put the "compute partial dependence"<br>
> function in the plotting module as well,<br>
> which is a bit strange.<br>
><br>
> Maybe we could inform this discussion by listing candidate plotting<br>
> functions, and also considering whether they "do the work" and where the<br>
> "work" function is.<br>
><br>
> Other examples are plotting the confusion matrix, which probably should<br>
> also compute the confusion matrix (it's fast and so that would be<br>
> convenient), and so it would "duplicate" functionality from the metrics<br>
> module.<br>
><br>
> Plotting learning curves and validation curves should probably not do<br>
> the work as it's pretty involved, and so someone would need to import<br>
> the learning and validation curves from model selection, and then the<br>
> plotting functions from a plotting module.<br>
><br>
> Calibration curves, P/R curves and ROC curves are also pretty fast<br>
> to compute (and passing around the arguments is somewhat error prone) so<br>
> I would say the plotting functions for these should do the work as well.<br>
><br>
> Anyway, you can see that many plotting functions are actually associated<br>
> with functions in existing modules and the interactions are a bit unclear.<br>
><br>
> The only plotting functions I haven't mentioned so far that I thought<br>
> about in the past are "2d scatter" and "plot decision function". These<br>
> would be kind of generic, but mostly used in the examples.<br>
> Though having a discrete 2d scatter function would be pretty nice<br>
> (plt.scatter doesn't allow legends and makes it hard to use qualitative<br>
> color maps).<br>
><br>
><br>
> I think I would vote for option (1), "sklearn.plot.plot_zzz" but the<br>
> case is not really that clear.<br>
><br>
> Cheers,<br>
><br>
> Andy<br>
><br>
> On 4/3/19 7:35 AM, Roman Yurchak via scikit-learn wrote:<br>
> > +1 for option 1 and +0.5 for 3. Do we anticipate that many plotting<br>
> > functions will be added? If it's just a dozen or less, putting them all<br>
> > into a single namespace sklearn.plot might be easier.<br>
> ><br>
> > This also would avoid discussion about where to put some generic<br>
> > plotting functions (e.g.<br>
> > <a href="https://github.com/scikit-learn/scikit-learn/issues/13448#issuecomment-478341479" rel="noreferrer" target="_blank">https://github.com/scikit-learn/scikit-learn/issues/13448#issuecomment-478341479</a>).<br>
> ><br>
> > Roman<br>
> ><br>
> > On 03/04/2019 12:06, Trevor Stephens wrote:<br>
> >> I think #1 if any of these... Plotting functions should hopefully be as<br>
> >> general as possible, so tagging with a specific type of estimator will,<br>
> >> in some scikit-learn utopia, be unnecessary.<br>
> >><br>
> >> If a general plotter is built, where does it live in other<br>
> >> estimator-specific namespace options? Feels awkward to put it under<br>
> >> every estimator's namespace.<br>
> >><br>
> >> Then again, there might be a #4 where there is no plot module and<br>
> >> plotting classes live under groups of utilities like introspection,<br>
> >> cross-validation or something?...<br>
> >><br>
> >> On Wed, Apr 3, 2019 at 8:54 PM Andrew Howe <<a href="mailto:ahowe42@gmail.com" target="_blank">ahowe42@gmail.com</a>> wrote:<br>
> >><br>
> >> My preference would be for (1). I don't think the sub-namespace in<br>
> >> (2) is necessary, and don't like (3), as I would prefer the plotting<br>
> >> functions to be all in the same namespace sklearn.plot.<br>
> >><br>
> >> Andrew<br>
> >><br>
> >> <~~~~~~~~~~~~~~~~~~~~~~~~~~~><br>
> >> J. Andrew Howe, PhD<br>
> >> LinkedIn Profile <<a href="http://www.linkedin.com/in/ahowe42" rel="noreferrer" target="_blank">http://www.linkedin.com/in/ahowe42</a>><br>
> >> ResearchGate Profile <<a href="http://www.researchgate.net/profile/John_Howe12/" rel="noreferrer" target="_blank">http://www.researchgate.net/profile/John_Howe12/</a>><br>
> >> Open Researcher and Contributor ID (ORCID)<br>
> >> <<a href="http://orcid.org/0000-0002-3553-1990" rel="noreferrer" target="_blank">http://orcid.org/0000-0002-3553-1990</a>><br>
> >> Github Profile <<a href="http://github.com/ahowe42" rel="noreferrer" target="_blank">http://github.com/ahowe42</a>><br>
> >> Personal Website <<a href="http://www.andrewhowe.com" rel="noreferrer" target="_blank">http://www.andrewhowe.com</a>><br>
> >> I live to learn, so I can learn to live. - me<br>
> >> <~~~~~~~~~~~~~~~~~~~~~~~~~~~><br>
> >><br>
> >><br>
> >> On Tue, Apr 2, 2019 at 3:40 PM Hanmin Qin <<a href="mailto:qinhanmin2005@sina.com" target="_blank">qinhanmin2005@sina.com</a>> wrote:<br>
> >><br>
> >> See <a href="https://github.com/scikit-learn/scikit-learn/issues/13448" rel="noreferrer" target="_blank">https://github.com/scikit-learn/scikit-learn/issues/13448</a><br>
> >><br>
> >> We've introduced several plotting functions (e.g., plot_tree and<br>
> >> plot_partial_dependence) and will introduce more (e.g.,<br>
> >> plot_decision_boundary) in the future. Consequently, we need to<br>
> >> decide where to put these functions. Currently, there are 3<br>
> >> proposals:<br>
> >><br>
> >> (1) sklearn.plot.plot_YYY (e.g., sklearn.plot.plot_tree)<br>
> >><br>
> >> (2) sklearn.plot.XXX.plot_YYY (e.g., sklearn.plot.tree.plot_tree)<br>
> >><br>
> >> (3) sklearn.XXX.plot.plot_YYY (e.g.,<br>
> >> sklearn.tree.plot.plot_tree, note that we won't support from<br>
> >> sklearn.XXX import plot_YYY)<br>
> >><br>
> >> Joel Nothman, Gael Varoquaux and I decided to post it on the<br>
> >> mailing list to invite opinions.<br>
> >><br>
> >> Thanks<br>
> >><br>
> >> Hanmin Qin<br>
> >> _______________________________________________<br>
> >> scikit-learn mailing list<br>
> >> <a href="mailto:scikit-learn@python.org" target="_blank">scikit-learn@python.org</a> <mailto:<a href="mailto:scikit-learn@python.org" target="_blank">scikit-learn@python.org</a>><br>
> >> <a href="https://mail.python.org/mailman/listinfo/scikit-learn" rel="noreferrer" target="_blank">https://mail.python.org/mailman/listinfo/scikit-learn</a><br>
> >><br>
> ><br>
</blockquote></div>