[scikit-learn] Vote on SLEP010: n_features_in_ attribute

Andreas Mueller t3kcit at gmail.com
Wed Dec 4 11:05:57 EST 2019

On 12/4/19 5:05 AM, Trevor Stephens wrote:
> Makes sense Joel, wasn't mentioned in the docs, so was a bit strange. 
> Still feels a bit weird but I'm sure I'll adapt_in and thrive_out.
Indeed, and as Joel said, we'll have n_features_out_ added soon.
Having both is quite helpful in many situations.
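
As a rough illustration of why both counts are useful, here is a minimal
sketch. It assumes scikit-learn >= 0.23 (where n_features_in_ is set during
fit) and picks PolynomialFeatures purely as an example transformer. Because
n_features_out_ is only planned at this point, the output width is read from
the shape of the transformed array instead.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Three samples with two input features each.
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

poly = PolynomialFeatures(degree=2).fit(X)

# Number of features seen during fit (requires scikit-learn >= 0.23).
n_in = poly.n_features_in_

# No n_features_out_ is assumed here; measure the output width directly.
# Degree-2 expansion of (a, b) with bias: 1, a, b, a^2, a*b, b^2.
n_out = poly.transform(X).shape[1]

print(n_in, n_out)
```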

The naming is also meant to be analogous to the future "feature_names_in_" 
and "feature_names_out_" attributes. Right now we have "get_feature_names()", 
which actually refers to the output features.

That's a whole lot of new attributes, but after quite a lot of 
deliberation that's the solution we came up with,
as there were major flaws in all other proposals.

The SLEP for that is being rewritten right now.
There's some conversation in 
https://github.com/scikit-learn/enhancement_proposals/pull/18 but the 
document doesn't reflect the current consensus.


> Downstream projectwise, I'm happy to bounce my dependencies up 
> whenever necessary. Always nice to support old versions of sklearn, 
> but not at the expense of spaghetti code from my perspective, whatever 
> that's worth.
> Might be a bit more prickly for projects still trying to support Py2.x 
> though?
> On Wed, Dec 4, 2019 at 8:53 PM Joel Nothman <joel.nothman at gmail.com> wrote:
>     We are looking to have n_features_out_ for transformers. This
>     naming makes the difference explicit.
>     I would like to see some guidance on how an estimator
>     implementation (e.g. in scikit-learn-contrib) is advised to
>     maintain compatibility with Scikit-learn pre- and post- SLEP010.
>     That is, we want to encourage developers to take advantage of
>     super()._validate_data(X, y), but we also don't want to force them
>     to set a minimal Scikit-learn >= 0.23 dependency (or do we?).
>     What's the recommended way to implement fit and predict in such
>     an implementation?
>     Is it to
>     (a) not use _validate_data until the minimal dependency is reached?
>     (b) implement a patched BaseEstimator in the library which
>     inherits from Scikit-learn's BaseEstimator and adds _validate_data?
>     (c) something else?
>     Joel
>     _______________________________________________
>     scikit-learn mailing list
>     scikit-learn at python.org <mailto:scikit-learn at python.org>
>     https://mail.python.org/mailman/listinfo/scikit-learn
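
For what it's worth, Joel's option (b) could look roughly like the sketch
below. This is only an illustration under stated assumptions, not an agreed
recommendation: CompatBaseEstimator and MeanRegressor are hypothetical names,
and the fallback _validate_data is a simplified stand-in built on the public
check_array and check_X_y helpers. It is only defined when the installed
scikit-learn BaseEstimator does not already provide _validate_data.

```python
import numpy as np
from sklearn.base import BaseEstimator
from sklearn.utils.validation import check_array, check_X_y


class CompatBaseEstimator(BaseEstimator):
    """Hypothetical shim: adds _validate_data only if scikit-learn lacks it."""

    if not hasattr(BaseEstimator, "_validate_data"):

        def _validate_data(self, X, y=None, reset=True, **check_params):
            # Simplified fallback, not scikit-learn's own implementation.
            if y is None:
                X = check_array(X, **check_params)
            else:
                X, y = check_X_y(X, y, **check_params)
            if reset:
                # Record the input width at fit time.
                self.n_features_in_ = X.shape[1]
            elif X.shape[1] != self.n_features_in_:
                raise ValueError(
                    "X has %d features, but this estimator was fitted "
                    "with %d features." % (X.shape[1], self.n_features_in_)
                )
            return X if y is None else (X, y)


class MeanRegressor(CompatBaseEstimator):
    """Toy downstream estimator that always predicts the training mean."""

    def fit(self, X, y):
        X, y = self._validate_data(X, y)
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, X):
        # reset=False checks the width against what was seen during fit.
        X = self._validate_data(X, reset=False)
        return np.full(X.shape[0], self.mean_)


X_train = np.array([[1.0, 2.0], [3.0, 4.0]])
y_train = np.array([1.0, 3.0])
model = MeanRegressor().fit(X_train, y_train)
print(model.n_features_in_, model.predict(X_train))
```

The downstream fit and predict are written once against _validate_data, and
the version check lives in a single class that can be deleted once the
minimal dependency is bumped.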
