meta-estimator for multiple MLPRegressor
Greetings,

I have trained many MLPRegressors using different random_state values and estimated R^2 by cross-validation. Now I want to combine the top 10% of them in order to get more accurate predictions. Is there a meta-estimator that can take a few precomputed MLPRegressors as input and give consensus predictions? Can the BaggingRegressor do this job using MLPRegressors as input?

Thanks in advance for any hint.

Thomas

--
======================================================================
Thomas Evangelidis
Research Specialist
CEITEC - Central European Institute of Technology
Masaryk University
Kamenice 5/A35/1S081, 62500 Brno, Czech Republic
email: tevang@pharm.uoa.gr
       tevang3@gmail.com
website: https://sites.google.com/site/thomasevangelidishomepage/
Hi, Thomas,

the VotingClassifier can combine different models by majority voting amongst their predictions. Unfortunately, it refits the classifiers (after cloning them); I think we implemented it this way to keep it compatible with GridSearch and so forth. However, I have a version of the estimator that you can initialize with "refit=False" to avoid the refitting, if that helps:

http://rasbt.github.io/mlxtend/user_guide/classifier/EnsembleVoteClassifier/#example-5-using-pre-fitted-classifiers

Best,
Sebastian
_______________________________________________ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn
Hi Sebastian,

Thanks, I will try it in another classification problem I have. However, this time I am using regressors, not classifiers.
Hi, Thomas,

sorry, I missed the regression part. That would be a bit trickier; I am not sure what a good strategy for averaging regression outputs would be. However, if you just want to compute the average, you could do something like

np.mean(np.asarray([r.predict(X) for r in list_of_your_mlps]), axis=0)

(note the axis=0, so that you average over the models rather than over all values at once). However, it may be better to use stacking: use the outputs of r.predict(X) as meta-features to train a model on top of them.

Best,
Sebastian
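A minimal sketch of the averaging approach described above, for reference. The dataset and names (`mlps`, hidden-layer sizes, seeds) are illustrative, not from the original post:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=40, n_features=10, noise=0.5, random_state=0)

# Pre-fit a few MLPs that differ only in random_state.
mlps = [MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000,
                     random_state=rs).fit(X, y) for rs in range(3)]

# Stack per-model predictions into shape (n_models, n_samples) and
# average over the models (axis=0), not over everything.
preds = np.asarray([r.predict(X) for r in mlps])
consensus = preds.mean(axis=0)
print(consensus.shape)  # (40,)
```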
You mean to train an SVR to combine the predictions of the top 10% of MLPRegressors, using the same data that were used for training the MLPRegressors? Wouldn't that lead to overfitting?
You could certainly hold out a different data sample, and that might indeed be valuable regularisation, but it's not obvious to me that this is substantially more prone to overfitting than just training a handful of MLPRegressors on the same data and having them vote by other means. There is no problem, in general, with overfitting, as long as your evaluation of an estimator's performance isn't biased towards the training set. We've not talked about evaluation.
Regarding the evaluation, I use leave-20%-out cross-validation. I cannot leave more out because my data sets are very small, between 30 and 40 observations, each one with 600 features. Is there a limit to the number of MLPRegressors I can combine with stacking, considering my small data sets?
If you have such a small number of observations (with a much higher-dimensional feature space), why do you think you can accurately train not just a single MLP, but an ensemble of them, without overfitting dramatically?
Because the observations in the data set don't differ much from one another. To be more specific, the data set consists of a congeneric series of organic molecules, and the observation is their binding strength to a target protein. The idea is to train predictors that can predict the binding strength of new molecules belonging to the same congeneric series; therefore special care is taken to apply the predictors only within their domain of applicability. According to the literature, the same strategy has been followed several times in the past. The novelty of my approach stems from other factors that are irrelevant to this thread.
This is an aside to your original question, but as someone who has dealt with similar data in bioinformatics (gene expression, specifically), I think you should tread -very- carefully when you have such a small sample set and more features than samples. MLPs are already prone to overfitting, and both of those factors would make me inherently suspicious of the results. This sounds like an easy way to trick yourself into thinking you are making good predictions. Perhaps consider LASSO? Back to the original question: it is true that using an SVR in a stacking technique would add more parameters to your model, but likely an insignificant number compared to the MLPs themselves. Alternatively, you might use LASSO over all of the MLPs (not just the top 10%), so that you learn which ones yield useful features for a meta-estimator instead of just selecting the top 10%.
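A sketch of the LASSO-as-meta-estimator idea: feed the predictions of all fitted MLPs to a Lasso, whose L1 penalty can zero out unhelpful base models. All names, sizes, and the alpha value are illustrative; in practice the meta-features should be out-of-fold predictions (as discussed elsewhere in the thread) rather than in-sample ones, to avoid leakage:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=40, n_features=10, noise=0.5, random_state=0)
mlps = [MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000,
                     random_state=rs).fit(X, y) for rs in range(5)]

# Meta-features: one column per base model, shape (n_samples, n_models).
# For an honest meta-estimator, build these from out-of-fold predictions.
Z = np.column_stack([r.predict(X) for r in mlps])

meta = Lasso(alpha=0.1).fit(Z, y)
# Coefficients near zero flag base models the meta-estimator ignores.
print(meta.coef_)
```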
It could, but you don't need to use the same data that you used for training to fit the meta-estimator. As is commonly done in stacking with cross-validation, you can train the MLPs on training folds and pass the predictions from a test fold to the meta-estimator; but then you'd have to retrain your MLPs, and it sounded like you wanted to avoid that.

I am currently on mobile and only browsed through the thread briefly, but I agree with the others that your model(s) may have too much capacity for such a small dataset -- it can be tricky to fit the parameters without overfitting. In any case, if you do the stacking, I'd probably insert a k-fold CV between the MLPs and the meta-estimator. However, I'd definitely also recommend simpler models as an alternative.

Best,
Sebastian
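The fold-based stacking described above could be sketched as follows: base MLPs are refit on training folds, and only their held-out-fold predictions are shown to the meta-estimator. Dataset, seeds, and model settings are illustrative:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR

X, y = make_regression(n_samples=40, n_features=10, noise=0.5, random_state=0)
seeds = [0, 1, 2]  # one base MLP per random_state

Z = np.zeros((len(y), len(seeds)))  # out-of-fold meta-features
for train_idx, test_idx in KFold(n_splits=5).split(X):
    for j, rs in enumerate(seeds):
        mlp = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000,
                           random_state=rs)
        mlp.fit(X[train_idx], y[train_idx])
        # Each row of Z holds predictions from models that never saw it.
        Z[test_idx, j] = mlp.predict(X[test_idx])

meta = SVR().fit(Z, y)  # meta-estimator sees only out-of-fold predictions
```

This is the retraining trade-off mentioned above: the MLPs are refit once per fold, in exchange for meta-features that are not contaminated by the training data.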
Sebastian and Jacob,

Regarding overfitting: Lasso, ridge regression and ElasticNet perform poorly on my data; MLPRegressors are way superior. On another note, the MLPRegressor class has some means to control overfitting, like the alpha parameter for L2 regularization (maybe setting it to a high value?), the number of neurons in the hidden layers (lowering hidden_layer_sizes?), or even early_stopping=True. Wouldn't these be sufficient to stay on the safe side?

Once more I want to highlight something I wrote previously that might have been overlooked: the resulting MLPRegressors will be applied to new datasets that *ARE VERY SIMILAR TO THE TRAINING DATA*. In other words, application of the models will be strictly confined to their domain of applicability. Wouldn't that be sufficient to not worry too much about overfitting?
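The knobs mentioned above, gathered in one place: a deliberately small, strongly regularized MLPRegressor. The specific values are illustrative assumptions, not recommendations from the thread:

```python
from sklearn.datasets import make_regression
from sklearn.neural_network import MLPRegressor

# Mimic the problem size discussed: ~40 observations, 600 features.
X, y = make_regression(n_samples=40, n_features=600, noise=0.5,
                       random_state=0)

mlp = MLPRegressor(
    hidden_layer_sizes=(10,),  # fewer neurons -> fewer parameters
    alpha=1.0,                 # strong L2 penalty
    early_stopping=True,       # hold out part of the training data...
    validation_fraction=0.2,   # ...and stop when its score stops improving
    max_iter=2000,
    random_state=0,
).fit(X, y)
```

Note that even with all three knobs turned up, a single hidden layer of 10 units over 600 inputs still has thousands of weights, which is the capacity concern raised by the other posters.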
Btw, I may have been unclear in the discussion of overfitting. For *training* the meta-estimator in stacking, it's standard to do something like cross_val_predict on your training set to produce its input features.
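A minimal sketch of this point: cross_val_predict clones and refits each base estimator per fold, so every meta-feature is an out-of-sample prediction. The base models and the ridge meta-estimator are illustrative choices:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=40, n_features=10, noise=0.5, random_state=0)
base = [MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000, random_state=rs)
        for rs in range(3)]

# One column of out-of-fold predictions per base model.
Z = np.column_stack([cross_val_predict(m, X, y, cv=5) for m in base])
meta = Ridge().fit(Z, y)
```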
Thomas, it can be difficult to fine tune L1/L2 regularization in the case where n_parameters >>> n_samples ~and~ n_features >> n_samples. If your samples are very similar to the training data, why are simpler models not working well? On Sun, Jan 8, 2017 at 8:08 PM, Joel Nothman <joel.nothman@gmail.com> wrote:
Btw, I may have been unclear in the discussion of overfitting. For *training* the meta-estimator in stacking, it's standard to do something like cross_val_predict on your training set to produce its input features.
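Joel's suggestion can be sketched as follows. This is a minimal illustration on synthetic data (make_regression stand-ins, arbitrary shapes and hyperparameters, not Thomas's actual dataset): out-of-fold predictions from cross_val_predict become the meta-estimator's input features, so the SVR never sees predictions the base MLPs made on their own training points.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_predict
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR

# Synthetic stand-in for the real dataset (hypothetical shapes).
X, y = make_regression(n_samples=100, n_features=20, noise=10.0, random_state=0)

# Base MLPs differing only in random_state.
base_models = [MLPRegressor(hidden_layer_sizes=(10,), max_iter=400,
                            random_state=rs) for rs in (0, 1, 2)]

# Out-of-fold predictions form the meta-estimator's training features.
meta_features = np.column_stack([
    cross_val_predict(m, X, y, cv=5) for m in base_models
])
meta = SVR().fit(meta_features, y)

# At prediction time, each base model is fit on the full training set
# and new predictions are stacked the same way.
for m in base_models:
    m.fit(X, y)
X_new = X[:5]
stacked = np.column_stack([m.predict(X_new) for m in base_models])
print(meta.predict(stacked).shape)  # (5,)
```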
On 8 January 2017 at 22:42, Thomas Evangelidis <tevang3@gmail.com> wrote:
Sebastian and Jacob,
Regarding overfitting: Lasso, Ridge regression, and ElasticNet have poor performance on my data; MLPRegressors are way superior. On another note, the MLPRegressor class has some ways to control overfitting, like the alpha parameter for the L2 regularization (maybe setting it to a high value?), the number of neurons in the hidden layers (lowering hidden_layer_sizes?), or even "early_stopping=True". Wouldn't these be sufficient to be on the safe side?
Once more I want to highlight something I wrote previously but might have been overlooked. The resulting MLPRegressors will be applied to new datasets that *ARE VERY SIMILAR TO THE TRAINING DATA*. In other words the application of the models will be strictly confined to their applicability domain. Wouldn't that be sufficient to not worry about model overfitting too much?
On 8 January 2017 at 11:53, Sebastian Raschka <se.raschka@gmail.com> wrote:
Like to train an SVR to combine the predictions of the top 10% MLPRegressors using the same data that were used for training of the MLPRegressors? Wouldn't that lead to overfitting?
It could, but you don't need to use the same data that you used for training to fit the meta-estimator. As is commonly done in stacking with cross-validation, you can train the MLPs on training folds and pass predictions from a test fold to the meta-estimator; but then you'd have to retrain your MLPs, and it sounded like you wanted to avoid that.
I am currently on mobile and only browsed through the thread briefly, but I agree with others that it sounds like your model(s) may have too much capacity for such a small dataset -- it can be tricky to fit the parameters without overfitting. In any case, if you do the stacking, I'd probably insert a k-fold CV between the MLPs and the meta-estimator. However, I'd definitely also recommend simpler models as an alternative.
Best, Sebastian
On Jan 7, 2017, at 4:36 PM, Thomas Evangelidis <tevang3@gmail.com> wrote:
On 7 January 2017 at 21:20, Sebastian Raschka <se.raschka@gmail.com> wrote:
Hi, Thomas, sorry, I overlooked the regression part … This would be a bit trickier; I am not sure what a good strategy for averaging regression outputs would be. However, if you just want to compute the average, you could do something like np.mean(np.asarray([r.predict(X) for r in list_of_your_mlps]), axis=0)
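A small self-contained sketch of this averaging, on synthetic data (arbitrary shapes and settings): stacking the per-model predictions into an (n_models, n_samples) array and averaging over the model axis gives one consensus prediction per sample.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=80, n_features=15, noise=5.0, random_state=0)
mlps = [MLPRegressor(hidden_layer_sizes=(10,), max_iter=300,
                     random_state=rs).fit(X, y) for rs in range(3)]

# Shape (n_models, n_samples); average over the model axis, not over everything.
preds = np.asarray([m.predict(X) for m in mlps])
consensus = preds.mean(axis=0)
print(consensus.shape)  # (80,)
```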
However, it may be better to use stacking, and use the output of r.predict(X) as meta features to train a model based on these?
Like to train an SVR to combine the predictions of the top 10% MLPRegressors using the same data that were used for training of the MLPRegressors? Wouldn't that lead to overfitting?
Best, Sebastian
Once more I want to highlight something I wrote previously but might have been overlooked. The resulting MLPRegressors will be applied to new datasets that ARE VERY SIMILAR TO THE TRAINING DATA. In other words the application of the models will be strictly confined to their applicability domain. Wouldn't that be sufficient to not worry about model overfitting too much?
If you have a very small dataset and a very large number of features, I’d always be careful with/avoid models that have a high capacity. However, it is really hard to answer this question because we don’t know much about your training and evaluation approach. If you didn’t do much hyperparameter tuning and cross-validation for model selection, and if you set aside a sufficiently large portion as an independent test set that you only looked at once and get a good performance on that, you may be lucky and a complex MLP may generalize well. However, like others said, it’s really hard to get an MLP right (not memorizing training data) if n_samples is small and n_features is large. And for n_features > n_samples, that may be very, very hard.
like controlling the alpha parameter for the L2 regularization (maybe setting it to a high value?) or the number of neurons in the hidden layers (lowering hidden_layer_sizes?) or even "early_stopping=True"
As a rule of thumb, the higher the capacity, the higher the chance of overfitting. So yes, this could help a little bit. You probably also want to try dropout instead of L2 (or in addition to it), which usually has a stronger regularization effect (especially if you have a very large set of redundant features). I can't remember the exact paper, but I read about an approach where the authors set a max constraint on the weights in combination with dropout, e.g. ||w||_2 < constant, which worked even better than dropout alone (the constant becomes another hyperparameter to tune, though). Best, Sebastian
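scikit-learn's MLPRegressor has no dropout or max-norm option, but a max-norm constraint of the kind Sebastian describes can be roughly approximated by clipping the per-unit incoming weight norms between partial_fit calls. This is an illustrative workaround on synthetic data, not the method from the paper he mentions; the max_norm value and epoch count are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=80, n_features=15, noise=5.0, random_state=0)

max_norm = 2.0  # illustrative constraint; another hyperparameter to tune
mlp = MLPRegressor(hidden_layer_sizes=(10,), random_state=0)
for _ in range(50):
    mlp.partial_fit(X, y)  # one pass over the data
    # Rescale each unit's incoming weight vector so its L2 norm <= max_norm.
    for W in mlp.coefs_:
        norms = np.linalg.norm(W, axis=0, keepdims=True)
        W *= np.minimum(1.0, max_norm / np.maximum(norms, 1e-12))

norms = [np.linalg.norm(W, axis=0).max() for W in mlp.coefs_]
print(norms)  # each value is at most max_norm
```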
Jacob & Sebastian,

I think the best way to find out if my modeling approach works is to find a larger dataset, split it into two parts, use the first as a training/cross-validation set and the second as a test set, as in a real case scenario.

Regarding the MLPRegressor regularization, below is my optimum setup:

MLPRegressor(random_state=random_state, max_iter=400, early_stopping=True,
             validation_fraction=0.2, alpha=10, hidden_layer_sizes=(10,))

This means only one hidden layer with a maximum of 10 neurons, alpha=10 for L2 regularization, and early stopping to terminate training if the validation score is not improving. I think this is a quite simple model. My final predictor is an SVR that combines 2 MLPRegressors, each one trained on a different type of input data.

@Sebastian: you have mentioned dropout again, but I could not find it in the docs: http://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPRegressor.html#sklearn.neural_network.MLPRegressor Maybe you are referring to another MLPRegressor implementation? I saw a while ago another implementation you had on GitHub. Can you clarify which one you recommend and why?

Thank you both for your hints!

best
Thomas
If you don't have a large dataset, you can still do leave-one-out cross-validation.
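Stuart's suggestion, sketched on synthetic 35-sample data (a Ridge base model is used here just to keep the 35 fits fast; any regressor works): LeaveOneOut gives one held-out prediction per sample, each from a model trained on the other 34.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import LeaveOneOut, cross_val_predict

X, y = make_regression(n_samples=35, n_features=10, noise=5.0, random_state=0)

# One out-of-sample prediction per data point.
preds = cross_val_predict(Ridge(), X, y, cv=LeaveOneOut())
r = np.corrcoef(y, preds)[0, 1]
print(len(preds), round(r, 2))
```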
Even with a single layer with 10 neurons you're still trying to train over 6000 parameters using ~30 samples. Dropout is a concept common in neural networks, but doesn't appear to be in sklearn's implementation of MLPs. Early stopping based on validation performance isn't an "extra" step for reducing overfitting; it's basically a required step for neural networks. It seems like you have a validation sample of ~6 data points. I'm still very skeptical of that giving you proper results for a complex model.

Will this larger dataset be of exactly the same data? Just taking another unrelated dataset and showing that an MLP can learn it doesn't mean it will work for your specific data. Can you post the actual results from using LASSO, RandomForestRegressor, GradientBoostingRegressor, and MLP?
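The parameter count behind this argument is easy to check. For a fully connected MLP, each layer contributes (inputs × outputs) weights plus one bias per output; the exact total depends on the feature count, so the numbers below (for the 456- and 60-feature data types mentioned later in the thread) are illustrative.

```python
def mlp_param_count(n_features, hidden_layer_sizes, n_outputs=1):
    """Total weights + biases of a fully connected MLP."""
    sizes = [n_features, *hidden_layer_sizes, n_outputs]
    return sum(a * b + b for a, b in zip(sizes, sizes[1:]))

# One hidden layer of 10 neurons:
print(mlp_param_count(456, (10,)))  # 4581 parameters for the 456-feature data
print(mlp_param_count(60, (10,)))   # 621 parameters for the 60-feature data
```

Either way, the number of free parameters dwarfs the ~35 available samples.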
Jacob,

The features are not 6000. I train 2 MLPRegressors from two types of data; both refer to the same dataset (35 molecules in total), but each one contains a different type of information. The first data type consists of 60 features. I tried 100 different random states and measured the average |R| using leave-20%-out cross-validation. Below are the results for the first data type:

RandomForestRegressor: |R| = 0.389018243545 +- 0.252891783658
LASSO: |R| = 0.247411754937 +- 0.232325286471
GradientBoostingRegressor: |R| = 0.324483769202 +- 0.211778410841
MLPRegressor: |R| = 0.540528696597 +- 0.255714448793

The second data type consists of 456 features. Below are the results for these too:

RandomForestRegressor: |R| = 0.361562548904 +- 0.234872385318
LASSO: |R| = 3.27752711304e-16 +- 2.60800139195e-16
GradientBoostingRegressor: |R| = 0.328087138161 +- 0.229588427086
MLPRegressor: |R| = 0.455473342507 +- 0.24579081197

At the end I want to combine models created from these data types using a meta-estimator (that was my original question). The combination with the highest |R| (0.631851796403 +- 0.247911204514) was produced by an SVR that combined the best MLPRegressor from data type 1 and the best MLPRegressor from data type 2.
Thomas,

Jacob's point is important: it's not the number of features that matters, it's the number of free parameters. As the number of free parameters increases, the space of representable functions grows to the point where the cost function is minimized by having a single parameter explain each training example. This is true of many ML methods. In the case of decision trees, for example, you can allow each node (a free parameter) to hold exactly 1 training example and see perfect training performance. In linear methods, you can perfectly fit the training data by adding additional polynomial features (for feature x_i, add x^2_i, x^3_i, x^4_i, ...). Performance on unseen data will be terrible. MLPs are no different: adding more free parameters (more flexibility to precisely model the training data) may harm more than help when it comes to performance on unseen data, especially when the number of examples is small. Early stopping may help with overfitting, as might dropout. The likely reason that LASSO and GBR performed well is that they are methods that explicitly manage overfitting.

Perform a grid search on:
- the number of hidden nodes in your MLP
- the number of iterations

For both, you may find that lowering the values improves performance on unseen data.
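That grid search can be sketched with GridSearchCV on synthetic stand-in data (35 samples, 60 features, matching the shapes in the thread; the candidate values are arbitrary examples):

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=35, n_features=60, noise=5.0, random_state=0)

# Search over capacity (hidden nodes) and training length (iterations).
param_grid = {
    "hidden_layer_sizes": [(2,), (5,), (10,)],
    "max_iter": [50, 100, 200],
}
search = GridSearchCV(MLPRegressor(alpha=10, random_state=0),
                      param_grid, cv=5, scoring="r2")
search.fit(X, y)
print(search.best_params_)
```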
Jacob,
The features are not 6000. I train 2 MLPRegressors from two types of data, both refer to the same dataset (35 molecules in total) but each one contains different type of information. The first data consist of 60 features. I tried 100 different random states and measured the average |R| using the leave-20%-out cross-validation. Below are the results from the first data:
RandomForestRegressor: |R|= 0.389018243545 +- 0.252891783658 LASSO: |R|= 0.247411754937 +- 0.232325286471 GradientBoostingRegressor: |R|= 0.324483769202 +- 0.211778410841 MLPRegressor: |R|= 0.540528696597 +- 0.255714448793
The second type of data consist of 456 features. Below are the results for these too:
RandomForestRegressor: |R|= 0.361562548904 +- 0.234872385318 LASSO: |R|= 3.27752711304e-16 +- 2.60800139195e-16 GradientBoostingRegressor: |R|= 0.328087138161 +- 0.229588427086 MLPRegressor: |R|= 0.455473342507 +- 0.24579081197
At the end I want to combine models created from these data types using a meta-estimator (that was my original question). The combination with the highest |R| (0.631851796403 +- 0.247911204514) was produced by an SVR that combined the best MLPRegressor from data type 1 and the best MLPRegressor from data type2:
On 10 January 2017 at 01:36, Jacob Schreiber <jmschreiber91@gmail.com> wrote:
Even with a single layer with 10 neurons you're still trying to train over 6000 parameters using ~30 samples. Dropout is a concept common in neural networks, but doesn't appear to be in sklearn's implementation of MLPs. Early stopping based on validation performance isn't an "extra" step for reducing overfitting, it's basically a required step for neural networks. It seems like you have a validation sample of ~6 datapoints.. I'm still very skeptical of that giving you proper results for a complex model. Will this larger dataset be of exactly the same data? Just taking another unrelated dataset and showing that a MLP can learn it doesn't mean it will work for your specific data. Can you post the actual results from using LASSO, RandomForestRegressor, GradientBoostingRegressor, and MLP?
On Mon, Jan 9, 2017 at 4:21 PM, Stuart Reynolds < stuart@stuartreynolds.net> wrote:
If you dont have a large dataset, you can still do leave one out cross validation.
On Mon, Jan 9, 2017 at 3:42 PM Thomas Evangelidis <tevang3@gmail.com> wrote:
Jacob & Sebastian,
I think the best way to find out if my modeling approach works is to find a larger dataset, split it into two parts, the first one will be used as training/cross-validation set and the second as a test set, like in a real case scenario.
Regarding the MLPRegressor regularization, below is my optimum setup:
MLPRegressor(random_state=random_state, max_iter=400, early_stopping=True, validation_fraction=0.2, alpha=10, hidden_layer_sizes=(10,))
This means only one hidden layer with maximum 10 neurons, alpha=10 for L2 regularization and early stopping to terminate training if validation score is not improving. I think this is a quite simple model. My final predictor is an SVR that combines 2 MLPRegressors, each one trained with different types of input data.
@Sebastian You have mentioned dropout again but I could not find it in the docs: http://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPRegressor.html#sklearn.neural_network.MLPRegressor
Maybe you are referring to another MLPRegressor implementation? I saw another implementation of yours on GitHub a while ago. Can you clarify which one you recommend and why?
Thank you both for your hints!
best Thomas
--
======================================================================
Thomas Evangelidis
Research Specialist CEITEC - Central European Institute of Technology Masaryk University Kamenice 5/A35/1S081, 62500 Brno, Czech Republic
email: tevang@pharm.uoa.gr
tevang3@gmail.com
website:
https://sites.google.com/site/thomasevangelidishomepage/
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
Stuart,

I didn't see LASSO performing well, especially with the second type of data. The alpha parameter probably needs adjustment with LassoCV. I don't know if you have read my previous messages in this thread, so I quote my MLPRegressor setting again:

MLPRegressor(random_state=random_state, max_iter=400, early_stopping=True, validation_fraction=0.2, alpha=10, hidden_layer_sizes=(10,))

So to sum up, I must select the lowest possible value for the following parameters:
* max_iter
* hidden_layer_sizes (lower than 10?)
* the number of features in my training data, i.e. the first type of data, which consists of 60 features, is preferable to the second, which consists of 456.

Is this correct?

On 10 January 2017 at 19:47, Stuart Reynolds <stuart@stuartreynolds.net> wrote:
Thomas, Jacob's point is important -- it's not the number of features that matters, it's the number of free parameters. As the number of free parameters increases, the space of representable functions grows to the point where the cost function is minimized by having a single parameter explain each training example. This is true of many ML methods.
In the case of decision trees, for example, you can allow each node (a free parameter) to hold exactly one training example and see perfect training performance. In linear methods, you can perfectly fit the training data by adding polynomial features (for feature x_i, add x^2_i, x^3_i, x^4_i, ...). Performance on unseen data will be terrible. MLPs are no different -- adding more free parameters (more flexibility to precisely model the training data) may harm more than help when it comes to performance on unseen data, especially when the number of examples is small.
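The polynomial-features point can be demonstrated in a few lines. This is my own illustration on synthetic data: a high-degree polynomial with roughly as many parameters as training points fits the training set almost perfectly, while a low-degree one generalizes far better.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.RandomState(0)
X = rng.uniform(-1, 1, (15, 1))                      # 15 noisy training points
y = np.sin(3 * X).ravel() + rng.normal(0, 0.1, 15)
X_test = rng.uniform(-1, 1, (200, 1))                # unseen data
y_test = np.sin(3 * X_test).ravel()

scores = {}
for degree in (3, 14):  # degree 14 => ~15 parameters for 15 samples
    model = make_pipeline(PolynomialFeatures(degree),
                          LinearRegression()).fit(X, y)
    scores[degree] = (model.score(X, y), model.score(X_test, y_test))
print(scores)  # (train R^2, test R^2) per degree
```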
Early stopping may help overfitting, as might dropout.
The likely reason that LASSO and GBR performed well is that they are methods that explicitly manage overfitting.
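As a concrete illustration of that overfitting control (my own sketch on synthetic data, not from the thread): LassoCV, mentioned earlier for alpha adjustment, selects the L1 penalty by internal cross-validation instead of using a fixed value.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

# Synthetic stand-in: 35 samples x 60 features, as in the first data type.
X, y = make_regression(n_samples=35, n_features=60, noise=5.0, random_state=0)

model = LassoCV(cv=5, random_state=0).fit(X, y)
print(model.alpha_)              # penalty strength chosen by CV
print((model.coef_ != 0).sum())  # number of features the fit kept
```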
Perform a grid search on:
- the number of hidden nodes in your MLP
- the number of iterations

For both, you may find that lowering the values improves performance on unseen data.
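The suggested grid search can be sketched with GridSearchCV; the parameter ranges and synthetic data below are my own assumptions for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=35, n_features=60, noise=10.0, random_state=0)

grid = GridSearchCV(
    MLPRegressor(random_state=0, alpha=10),
    param_grid={
        "hidden_layer_sizes": [(2,), (5,), (10,)],  # hidden-node counts
        "max_iter": [100, 200, 400],                # iteration budgets
    },
    cv=5, scoring="r2",
)
grid.fit(X, y)
print(grid.best_params_)
```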
On Tue, Jan 10, 2017 at 4:46 AM, Thomas Evangelidis <tevang3@gmail.com> wrote:
Jacob,
The features are not 6000. I train 2 MLPRegressors on two types of data; both refer to the same dataset (35 molecules in total), but each contains a different type of information. The first type of data consists of 60 features. I tried 100 different random states and measured the average |R| using leave-20%-out cross-validation. Below are the results for the first type of data:
RandomForestRegressor: |R|= 0.389018243545 +- 0.252891783658
LASSO: |R|= 0.247411754937 +- 0.232325286471
GradientBoostingRegressor: |R|= 0.324483769202 +- 0.211778410841
MLPRegressor: |R|= 0.540528696597 +- 0.255714448793
The second type of data consists of 456 features. Below are the results for those as well:
RandomForestRegressor: |R|= 0.361562548904 +- 0.234872385318
LASSO: |R|= 3.27752711304e-16 +- 2.60800139195e-16
GradientBoostingRegressor: |R|= 0.328087138161 +- 0.229588427086
MLPRegressor: |R|= 0.455473342507 +- 0.24579081197
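The leave-20%-out protocol behind these tables amounts to repeated 80/20 splits with |R| averaged over the repetitions. A sketch on synthetic data (Ridge stands in for the slower models; names and sizes are my own illustration):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import ShuffleSplit

X, y = make_regression(n_samples=35, n_features=60, noise=10.0, random_state=0)

rs = []
# 100 random 80/20 splits, mirroring the 100 random states above.
for tr, te in ShuffleSplit(n_splits=100, test_size=0.2,
                           random_state=0).split(X):
    pred = Ridge().fit(X[tr], y[tr]).predict(X[te])
    rs.append(abs(np.corrcoef(y[te], pred)[0, 1]))  # |Pearson R| per split

print(np.mean(rs), "+-", np.std(rs))
```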
Hi, Thomas,

I was just reading through a recent preprint (Protein-Ligand Scoring with Convolutional Neural Networks, https://arxiv.org/abs/1612.02751), and I thought it may be related to your task and maybe interesting or even useful for your work. Also check out references 13, 21, 22, and 24, where they talk about alternative (more classic) representations of protein-ligand complexes or interactions as inputs to either random forests or multi-layer perceptrons.

Best,
Sebastian
Hi Thomas,

An example of such a "dummy" meta-regressor can be seen in NNScore, which is a protein-ligand scoring function (one of Sebastian's suggestions). A meta-class is implemented in the Open Drug Discovery Toolkit [here: https://github.com/oddt/oddt/blob/master/oddt/scoring/__init__.py#L200], along with the also-suggested RF-Score and a few other methods you might find useful. What NNScore actually does is train 1000 MLPRegressors and pick the 20 that score best on the PDBbind test set. The ensemble prediction is the mean prediction of those best models.

----
Pozdrawiam, | Best regards,
Maciek Wójcikowski
maciek@wojcikowski.pl

2017-01-11 21:16 GMT+01:00 Sebastian Raschka <se.raschka@gmail.com>:
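The NNScore-style recipe described here can be sketched in a few lines. This is not the NNScore or ODDT code; it is my own illustration with synthetic data and deliberately small counts (10 models, top 3) in place of 1000 and 20, and cross-validation scores in place of a PDBbind test set.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=35, n_features=20, noise=10.0, random_state=0)

models, scores = [], []
for seed in range(10):  # one model per random state
    m = MLPRegressor(random_state=seed, max_iter=400, alpha=10,
                     hidden_layer_sizes=(10,))
    scores.append(cross_val_score(m, X, y, cv=5, scoring="r2").mean())
    models.append(m.fit(X, y))

top = np.argsort(scores)[-3:]  # indices of the best-scoring models
# Consensus prediction: mean of the selected models' predictions.
consensus = np.mean([models[i].predict(X) for i in top], axis=0)
print(consensus.shape)
```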
participants (6)
- Jacob Schreiber
- Joel Nothman
- Maciek Wójcikowski
- Sebastian Raschka
- Stuart Reynolds
- Thomas Evangelidis