[scikit-learn] recommended feature selection method to train an MLPRegressor

Sebastian Raschka se.raschka at gmail.com
Sun Mar 19 19:32:45 EDT 2017


Hm, that’s tricky. I think the other methods listed on http://scikit-learn.org/stable/modules/feature_selection.html could help regarding a computationally cheap solution, but the problem is that they probably wouldn’t work that well for an MLP due to their linearity assumption. And an exhaustive search over all subsets would be impractical/impossible: for the 50-feature subsets alone, you already have 73353053308199416032348518540326808282134507009732998441913227684085760 combinations :P. A greedy approach like forward or backward selection would be more feasible, but still very expensive in combination with an MLP. On top of that, you also have to consider that neural networks are generally pretty sensitive to hyperparameter settings. So even if you fix the architecture, you probably still want to check that the learning rate etc. is appropriate for each combination of features (by monitoring the cost and validation error during training).
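As a rough sketch of what the greedy route could look like: scikit-learn has since added a SequentialFeatureSelector (version 0.24+) that implements exactly this forward/backward search. The snippet below is a minimal, hedged example on a small synthetic problem, using a cheap Ridge estimator as a stand-in for the MLP (swapping in MLPRegressor is a one-line change, just much slower); the dataset sizes are illustrative, not from the thread.

```python
import math

from sklearn.datasets import make_regression
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import Ridge

# Exhaustive search over all 50-feature subsets of 534 features is hopeless:
n_subsets = math.comb(534, 50)
print(f"number of 50-feature subsets: {n_subsets:.3e}")  # on the order of 1e70

# Greedy forward selection: starting from the empty set, repeatedly add the
# single feature that most improves the cross-validated score.
X, y = make_regression(n_samples=100, n_features=30, n_informative=5,
                       random_state=0)
sfs = SequentialFeatureSelector(Ridge(), n_features_to_select=5,
                                direction="forward", cv=3)
sfs.fit(X, y)
print("selected feature indices:", sfs.get_support(indices=True).tolist())
```

This costs on the order of n_features * n_features_to_select model fits per CV fold (here 30 * 5 * 3), which is why it gets expensive quickly once the estimator is an MLP with its own hyperparameter search.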

PS: I wouldn’t dismiss dropout, imho. Especially because your training set is small, it could even be crucial for reducing overfitting. I mean, it doesn’t remove features from your dataset, but it keeps the network from relying on particular combinations of features always being present during training. Your final network will still process all features, and dropout will effectively cause your network to “use” more of the features in your ~50-feature subset compared to no dropout (because otherwise, it may just learn to rely on a subset of those 50 features).
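(For readers unfamiliar with the mechanism: scikit-learn’s MLPRegressor does not implement dropout, but the idea is simple enough to sketch in a few lines of NumPy. This is a minimal illustration of standard “inverted” dropout; the function and parameter names are mine, not from any library.)

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, p_drop=0.5, training=True):
    """Inverted dropout: during training, zero each unit with probability
    p_drop and scale the survivors by 1/(1 - p_drop), so the expected
    activation matches test time, where all units are kept."""
    if not training or p_drop == 0.0:
        return activations
    keep = rng.random(activations.shape) >= p_drop  # random per-unit mask
    return activations * keep / (1.0 - p_drop)

h = np.ones((4, 10))                               # a batch of hidden activations
h_train = dropout(h, p_drop=0.5, training=True)    # ~half the units zeroed
h_test = dropout(h, training=False)                # identity at test time

# Each forward pass during training sees a different random subset of units,
# so the network cannot learn to depend on any fixed subset of features.
```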

> On Mar 19, 2017, at 6:23 PM, Andreas Mueller <t3kcit at gmail.com> wrote:
> 
> 
> 
> On 03/19/2017 03:47 PM, Thomas Evangelidis wrote:
>> Which of the following methods would you recommend to select good features (<=50) from a set of 534 features in order to train an MLPRegressor? Please take into account that the datasets I use for training are small.
>> 
>> http://scikit-learn.org/stable/modules/feature_selection.html
>> 
>> And please don't tell me to use a neural network that supports dropout or any other algorithm for feature elimination. This is not applicable in my case because I want to know the best 50 features in order to append them to other types of features that I am confident are important.
>> 
> You can always use forward or backward selection as implemented in mlxtend if you're patient. As your dataset is small that might work.
> However, it might be tricky to get the MLP to run consistently - though maybe not...
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
