[scikit-learn] meta-estimator for multiple MLPRegressor

Thomas Evangelidis tevang3 at gmail.com
Tue Jan 10 14:47:23 EST 2017


Stuart,

I didn't see LASSO performing well, especially with the second type of
data. The alpha parameter probably needs adjustment, e.g. with LassoCV.
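
For the record, this is roughly what I mean by tuning alpha with LassoCV (a
minimal sketch; the data here are random placeholders, not my actual
35-molecule dataset):

```python
import numpy as np
from sklearn.linear_model import LassoCV

# Random placeholder data standing in for the real 60-feature dataset.
rng = np.random.RandomState(0)
X = rng.randn(35, 60)
y = rng.randn(35)

# LassoCV fits a path of alpha values and picks the best one by
# internal cross-validation, instead of using a single fixed alpha.
lasso = LassoCV(cv=5, random_state=0).fit(X, y)
print(lasso.alpha_)  # the alpha selected by cross-validation
```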
I don't know if you have read my previous messages in this thread, so I
quote my MLPRegressor settings again:


MLPRegressor(random_state=random_state, max_iter=400, early_stopping=True,
             validation_fraction=0.2, alpha=10, hidden_layer_sizes=(10,))
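
To make the meta-estimator idea from the subject line concrete, here is a
rough sketch of the combination I have in mind: an SVR stacked on top of two
MLPRegressors, one per data type. The data are random placeholders, and the
stacking details (out-of-fold predictions via cross_val_predict) are an
assumption on my part, not my exact pipeline:

```python
import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR

# Random placeholder data for the two feature sets of the same 35 molecules.
rng = np.random.RandomState(0)
X1 = rng.randn(35, 60)    # first data type: 60 features
X2 = rng.randn(35, 456)   # second data type: 456 features
y = rng.randn(35)

def make_mlp():
    return MLPRegressor(random_state=0, max_iter=400, alpha=10,
                        hidden_layer_sizes=(10,))

# Out-of-fold predictions avoid leaking training labels into the SVR.
p1 = cross_val_predict(make_mlp(), X1, y, cv=5)
p2 = cross_val_predict(make_mlp(), X2, y, cv=5)

# The SVR meta-estimator sees only the two base-model predictions.
meta_X = np.column_stack([p1, p2])
svr = SVR().fit(meta_X, y)
```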


So, to sum up, I should select the lowest possible values for the following
parameters:

* max_iter
* hidden_layer_sizes (fewer than 10 neurons?)
* number of features in my training data, i.e. the first type of data,
which consists of 60 features, is preferable to the second, which consists
of 456.

Is this correct?
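
In code, I understand the suggestion below as something like this grid
search (a sketch with random placeholder data; the parameter ranges are my
own guesses):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor

# Random placeholder data standing in for the 35-molecule, 60-feature set.
rng = np.random.RandomState(42)
X = rng.randn(35, 60)
y = rng.randn(35)

# Search jointly over network size and number of iterations.
param_grid = {
    "hidden_layer_sizes": [(2,), (5,), (10,)],
    "max_iter": [100, 200, 400],
}
search = GridSearchCV(
    MLPRegressor(random_state=42, alpha=10),
    param_grid,
    cv=5,
    scoring="neg_mean_squared_error",
)
search.fit(X, y)
print(search.best_params_)
```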




On 10 January 2017 at 19:47, Stuart Reynolds <stuart at stuartreynolds.net>
wrote:

> Thomas,
> Jacob's point is important -- it's not the number of features that
> matters, it's the number of free parameters. As the number of free
> parameters increases, the space of representable functions grows to the
> point where the cost function is minimized by having a single parameter
> explain each variable. This is true of many ML methods.
>
> In the case of decision trees, for example, you can allow each leaf node
> (a free parameter) to hold exactly one training example and see perfect
> training performance. In linear methods, you can perfectly fit the
> training data by adding polynomial features (for feature x_i, add x^2_i,
> x^3_i, x^4_i, ...). Performance on unseen data will be terrible.
> MLPs are no different -- adding more free parameters (more flexibility to
> precisely model the training data) may harm more than help when it comes
> to performance on unseen data, especially when the number of examples is
> small.
>
> Early stopping may help with overfitting, as might dropout.
>
> The likely reason that LASSO and GBR performed well is that they are
> methods that explicitly manage overfitting.
>
> Perform a grid search on:
>  - the number of hidden nodes in your MLP
>  - the number of iterations
>
> For both, you may find that lowering the values improves performance on
> unseen data.
>
>
> On Tue, Jan 10, 2017 at 4:46 AM, Thomas Evangelidis <tevang3 at gmail.com>
> wrote:
>
>> Jacob,
>>
>> The features are not 6000. I train 2 MLPRegressors on two types of
>> data; both refer to the same dataset (35 molecules in total), but each
>> one contains a different type of information. The first type of data
>> consists of 60 features. I tried 100 different random states and
>> measured the average |R| using leave-20%-out cross-validation. Below are
>> the results from the first type of data:
>>
>> RandomForestRegressor: |R|= 0.389018243545 +- 0.252891783658
>> LASSO: |R|= 0.247411754937 +- 0.232325286471
>> GradientBoostingRegressor: |R|= 0.324483769202 +- 0.211778410841
>> MLPRegressor: |R|= 0.540528696597 +- 0.255714448793
>>
>> The second type of data consists of 456 features. Below are the results
>> for this type too:
>>
>> RandomForestRegressor: |R|= 0.361562548904 +- 0.234872385318
>> LASSO: |R|= 3.27752711304e-16 +- 2.60800139195e-16
>> GradientBoostingRegressor: |R|= 0.328087138161 +- 0.229588427086
>> MLPRegressor: |R|= 0.455473342507 +- 0.24579081197
>>
>>
>> In the end I want to combine models created from these two data types
>> using a meta-estimator (that was my original question). The combination
>> with the highest |R| (0.631851796403 +- 0.247911204514) was produced by
>> an SVR that combined the best MLPRegressor from data type 1 and the best
>> MLPRegressor from data type 2:
>>
>> On 10 January 2017 at 01:36, Jacob Schreiber <jmschreiber91 at gmail.com>
>> wrote:
>>
>>> Even with a single hidden layer of 10 neurons you're still trying to
>>> train over 6000 parameters using ~30 samples. Dropout is a common
>>> concept in neural networks, but it doesn't appear to be in sklearn's
>>> implementation of MLPs. Early stopping based on validation performance
>>> isn't an "extra" step for reducing overfitting; it's basically a
>>> required step for neural networks. It seems like you have a validation
>>> sample of ~6 datapoints. I'm still very skeptical that this will give
>>> you reliable results for a complex model. Will this larger dataset be
>>> of exactly the same kind of data? Just taking another, unrelated
>>> dataset and showing that an MLP can learn it doesn't mean it will work
>>> for your specific data. Can you post the actual results from using
>>> LASSO, RandomForestRegressor, GradientBoostingRegressor, and MLP?
>>>
>>> On Mon, Jan 9, 2017 at 4:21 PM, Stuart Reynolds <
>>> stuart at stuartreynolds.net> wrote:
>>>
>>>> If you don't have a large dataset, you can still do leave-one-out
>>>> cross-validation.
>>>>
>>>> On Mon, Jan 9, 2017 at 3:42 PM Thomas Evangelidis <tevang3 at gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>> Jacob & Sebastian,
>>>>>
>>>>> I think the best way to find out whether my modeling approach works
>>>>> is to find a larger dataset, split it into two parts, and use the
>>>>> first as a training/cross-validation set and the second as a test
>>>>> set, as in a real-world scenario.
>>>>>
>>>>> Regarding the MLPRegressor regularization, below is my optimum setup:
>>>>>
>>>>> MLPRegressor(random_state=random_state, max_iter=400,
>>>>>              early_stopping=True, validation_fraction=0.2, alpha=10,
>>>>>              hidden_layer_sizes=(10,))
>>>>>
>>>>>
>>>>> This means only one hidden layer with at most 10 neurons, alpha=10
>>>>> for L2 regularization, and early stopping to terminate training if
>>>>> the validation score stops improving. I think this is quite a simple
>>>>> model. My final predictor is an SVR that combines the 2
>>>>> MLPRegressors, each one trained on a different type of input data.
>>>>>
>>>>> @Sebastian
>>>>> You have mentioned dropout again, but I could not find it in the docs:
>>>>> http://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPRegressor.html#sklearn.neural_network.MLPRegressor
>>>>>
>>>>> Maybe you are referring to another MLPRegressor implementation? A
>>>>> while ago I saw another implementation you had on GitHub. Can you
>>>>> clarify which one you recommend, and why?
>>>>>
>>>>>
>>>>> Thank you both for your hints!
>>>>>
>>>>> best
>>>>> Thomas
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> ======================================================================
>>>>>
>>>>> Thomas Evangelidis
>>>>>
>>>>> Research Specialist
>>>>> CEITEC - Central European Institute of Technology
>>>>> Masaryk University
>>>>> Kamenice 5/A35/1S081,
>>>>> 62500 Brno, Czech Republic
>>>>>
>>>>> email: tevang at pharm.uoa.gr
>>>>>        tevang3 at gmail.com
>>>>>
>>>>> website: https://sites.google.com/site/thomasevangelidishomepage/
>>>>>
>>>>> _______________________________________________
>>>>> scikit-learn mailing list
>>>>> scikit-learn at python.org
>>>>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
>>
>

