[scikit-learn] Sprint discussion points?

Andreas Mueller t3kcit at gmail.com
Wed Feb 13 20:54:54 EST 2019


Do you have a reference for the logistic regression stability? Is it 
convergence warnings?

Happy to discuss the other two issues, though I feel they seem easier 
than most of what's on my list.

I have no idea what's going on with OPTICS tbh, and I'll leave it up to 
you and the others to decide whether that's something we should discuss.
I can try to read up and weigh in but that might not be the most 
effective way to do it.

The sample props is something I left out because I personally don't feel 
it's a priority compared to all the other things.
My students have basically no way to figure out what features the 
coefficients in their linear model correspond to; that seems a bit more 
important to me.

We can put it on the discussion list again, but I'm not super 
enthusiastic about it.

How should we prioritize things?


On 2/13/19 8:08 PM, Joel Nothman wrote:
> Yes, I was thinking the same. I think there are some other core issues 
> to solve, such as:
>
> * euclidean_distances numerical issues
> * commitment to ARM testing and debugging
> * logistic regression stability
>
> We should also nut out OPTICS issues or remove it from 0.21. I'm still 
> keen on trying to work out sample props (supporting weighted scoring 
> at least), but perhaps I'm being persuaded this will never be a 
> top-priority requirement, and the solutions add much complexity.
>
> On Thu, 14 Feb 2019 at 07:39, Andreas Mueller <t3kcit at gmail.com> wrote:
>
>     Hey all.
>
>     Should we collect some discussion points for the sprint?
>
>     There's an unusually large number of core devs present and I think we
>     should seize the opportunity.
>     Maybe we should create a page in the wiki or add it to the sprint
>     page?
>
>     Things that are high on my list of priorities are:
>
>       * slicing pipelines
>       * add get_feature_names to pipelines
>       * freezing estimator
>       * faster multi-metric scoring
>       * fit_transform doing something other than fit.transform
>       * imbalance-learn interface / subsampling in pipelines
>       * Specifying search spaces and valid hyper parameters
>         (https://github.com/scikit-learn/scikit-learn/issues/13031).
>       * allowing EstimatorCV-style speed-up in GridSearches
>       * storing pandas column names and using them as feature names
>
>
>     Trying to discuss all of these might be too much, but maybe we can
>     figure out a subset and make sure we have SLEPs to discuss?
>     Most of these issues are on the roadmap; issue 13031 is related to
>     #18 but not directly on the roadmap.
>
>     Thanks,
>     Andy
>     _______________________________________________
>     scikit-learn mailing list
>     scikit-learn at python.org <mailto:scikit-learn at python.org>
>     https://mail.python.org/mailman/listinfo/scikit-learn
>
>