[scikit-learn] no positive predictions by neural_network.MLPClassifier

Sebastian Raschka se.raschka at gmail.com
Thu Dec 8 10:21:42 EST 2016


> Besides that, my goal is not to make one MLPClassifier using a specific training set, but rather to write a program that can take various training sets as input each time and train a neural network that will classify a given test set. Therefore, unless I misunderstood your points, working with 3 arbitrary random_state values on my current training set in order to find one value that yields good predictions won't solve my problem.

Unfortunately, there's no silver bullet for default hyperparameter values that work across all training sets. Here, the random state may also be considered a hyperparameter: since you don't have a convex cost function in an MLP, training may or may not get stuck in different local minima depending on the random weight initialization.
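To illustrate, here's a minimal sketch of what I mean by comparing a few seeds (X_train and y_train are placeholder names for whatever training set your program receives):

from sklearn.neural_network import MLPClassifier

# try a few different weight initializations on the same data
for seed in (0, 1, 2):
    mlp = MLPClassifier(random_state=seed)
    mlp.fit(X_train, y_train)
    # compare the final cost across the initializations
    print(seed, mlp.loss_)

Your program could then simply keep the model with the lowest final cost, so no manual seed picking would be involved.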

>  I cannot find any parameter to control the epoch


You can control the maximum number of iterations via the max_iter parameter. I don't know, though, whether one iteration equals one epoch (one pass over the training set) for minibatch training in this particular implementation.
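For example (just a sketch; X_train and y_train are placeholders), you can cap training and afterwards check how many iterations the solver actually ran via the n_iter_ attribute:

mlp = MLPClassifier(max_iter=500)  # upper bound on training iterations
mlp.fit(X_train, y_train)
print(mlp.n_iter_)  # iterations actually performed before stopping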

>  or measure the cost in sklearn.neural_network.MLPClassifier


The cost of the last iteration is available via the loss_ attribute:

mlp = MLPClassifier(…)
# after training:
mlp.loss_
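
Also, if you want the cost over the whole run rather than just the final value, the stochastic solvers ('sgd' and 'adam') record it per iteration in the loss_curve_ attribute:

mlp = MLPClassifier(solver='adam')
mlp.fit(X_train, y_train)  # X_train, y_train: placeholder names
print(mlp.loss_curve_)  # one loss value per iteration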


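As a side note on the normalization you describe below: scikit-learn's StandardScaler implements exactly that pattern, i.e., learning the mean/stdev on the training set and reusing them on the test set. A minimal sketch:

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler().fit(X_train)  # statistics come from the training set only
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)   # same training-set mean/stdev applied here
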
> On Dec 8, 2016, at 9:55 AM, Thomas Evangelidis <tevang3 at gmail.com> wrote:
> 
> Hello Sebastian,
> 
> I normalized my training set and used the same mean and stdev values to normalize my test set, instead of calculating the mean and stdev from the test set itself. I did that because my training set size is finite and each feature value is a descriptor characteristic of the 3D shape of the observation. The test set would certainly have different mean and stdev values from the training set, and if I had used those to normalize it, I believe I would have distorted the original descriptor values. Anyway, after this normalization I no longer get 0 positive predictions from the MLPClassifier.
> 
> I still don't understand your second suggestion. I cannot find any parameter to control the epoch or measure the cost in sklearn.neural_network.MLPClassifier. Do you suggest using your own classes from GitHub instead?
> Besides that, my goal is not to make one MLPClassifier using a specific training set, but rather to write a program that can take various training sets as input each time and train a neural network that will classify a given test set. Therefore, unless I misunderstood your points, working with 3 arbitrary random_state values on my current training set in order to find one value that yields good predictions won't solve my problem.
> 
> best
> Thomas
> 
> 
> 
> On 8 December 2016 at 01:19, Sebastian Raschka <se.raschka at gmail.com> wrote:
> Hi, Thomas,
> we had a related thread on the email list some time ago; let me post it for reference further below. Regarding your question, I think you may want to make sure that you standardized the features (which generally makes the learning less sensitive to the learning rate and random weight initialization). However, even then, I would try at least 1-3 different random seeds and look at the cost vs. time: what can happen is that you land in different minima depending on the weight initialization, as demonstrated in the example below (in MLPs you have the problem of a complex, non-convex cost surface).
> 
> Best,
> Sebastian
> 
>> The default is 100 units in the hidden layer, but theoretically it should work with 2 hidden logistic units (I think that's the typical textbook/class example). I think what happens is that it gets stuck in a local minimum depending on the random weight initialization. E.g., the following works just fine:
>> 
>> from sklearn.neural_network import MLPClassifier
>> X = [[0, 0], [0, 1], [1, 0], [1, 1]]
>> y = [0, 1, 1, 0]
>> clf = MLPClassifier(solver='lbfgs', 
>>                     activation='logistic', 
>>                     alpha=0.0, 
>>                     hidden_layer_sizes=(2,),
>>                     learning_rate_init=0.1,
>>                     max_iter=1000,
>>                     random_state=20)
>> clf.fit(X, y)  
>> res = clf.predict([[0, 0], [0, 1], [1, 0], [1, 1]])
>> print(res)
>> print(clf.loss_)
>> 
>> 
>> but changing the random seed to 1 leads to:
>> 
>> [0 1 1 1]
>> 0.34660921283
>> 
>> For comparison, I used a more vanilla MLP (1 hidden layer with 2 units and logistic activation as well; https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch12/ch12.ipynb), essentially resulting in the same problem:
>> [two plot attachments, not preserved in the archive]
>
>> On Dec 7, 2016, at 6:45 PM, Thomas Evangelidis <tevang3 at gmail.com> wrote:
>> 
>> I tried sklearn.neural_network.MLPClassifier with the default parameters, using the input data I quoted in my previous post about the Nu-Support Vector Classifier. The predictions are great, but the problem is that sometimes when I rerun the MLPClassifier it predicts no positive observations (class 1). I have noticed that this can be controlled by the random_state parameter; e.g., MLPClassifier(random_state=0) always gives no positive predictions. My question is: how can I choose the right random_state value in a real blind test case?
>> 
>> thanks in advance
>> Thomas
> 
> 
> 
> 
> -- 
> ======================================================================
> Thomas Evangelidis
> Research Specialist
> CEITEC - Central European Institute of Technology
> Masaryk University
> Kamenice 5/A35/1S081, 
> 62500 Brno, Czech Republic 
> 
> email: tevang at pharm.uoa.gr
>          	tevang3 at gmail.com
> 
> website: https://sites.google.com/site/thomasevangelidishomepage/
> 
> 
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn


