[scikit-learn] Scikit learn GridSearchCV fit method ValueError Found array with 0 sample
Michał Nowotka
mmmnow at gmail.com
Fri Jul 8 11:22:05 EDT 2016
Hi,
Sorry for cross-posting
(http://stackoverflow.com/questions/38263933/scikit-learn-gridsearchcv-fit-method-valueerror-found-array-with-0-sample)
but I don't know where it is better to get help with my problem.
I'm working on a VM with a Jupyter notebook server installed.
From time to time I add new notebooks and re-evaluate old ones to see
if they still work.
This notebook stopped working because of changes in the scikit-learn
API that made some parameters obsolete:
https://github.com/chembl/mychembl/blob/master/ipython_notebooks/10_myChEMBL_machine_learning.ipynb
I've created a corrected version of the notebook here:
https://gist.github.com/anonymous/676c55cc501ffa48fecfcc1e1252d433
But I'm stuck in cell 36 on this code:
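from sklearn import cross_validation
from sklearn.cross_validation import KFold, StratifiedKFold
from sklearn.grid_search import GridSearchCV

# Hold out 95% of the data for testing; only 5% is left for training.
X_traina, X_testa, y_traina, y_testa = cross_validation.train_test_split(
    x, y, test_size=0.95, random_state=23)
params = {'min_samples_split': [8], 'max_depth': [20],
          'min_samples_leaf': [1], 'n_estimators': [200]}
# Plain 10-fold splitter; defined here but not passed to GridSearchCV below.
cv = KFold(n=len(X_traina), n_folds=10, shuffle=True)
# Stratified 5-fold splitter built from the training labels.
cv_stratified = StratifiedKFold(y_traina, n_folds=5)
# custom_forest is the estimator defined earlier in the notebook.
gs = GridSearchCV(custom_forest, params, cv=cv_stratified, verbose=1, refit=True)
gs.fit(X_traina, y_traina)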
This gives me:
ValueError: Found array with 0 sample(s) (shape=(0, 491)) while a
minimum of 1 is required.
I don't understand this, because when I print the shapes of the arrays:
print (X_traina.shape, X_testa.shape, y_traina.shape, y_testa.shape)
I'm getting:
((78, 491), (1489, 491), (78,), (1489,))
Interestingly, if I change the test_size parameter to 0.88 (as in
the corrected example notebook), it works, and this is the highest value
for which it works. For this value, the shapes are:
((188, 491), (1379, 491), (188,), (1379,))
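For reference, both sets of shapes are consistent with a total of 1567 samples (78 + 1489 = 188 + 1379). Below is a minimal sketch of that arithmetic, assuming the test count is rounded up and the remainder goes to training, which matches the numbers printed above. `split_sizes` is a hypothetical helper written for illustration, not part of scikit-learn:

```python
import math

def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a fractional test_size.

    Assumes the test count is rounded up, which reproduces the
    shapes printed above.
    """
    n_test = int(math.ceil(test_size * n_samples))
    return n_samples - n_test, n_test

n_total = 78 + 1489  # 1567 samples in x
for frac in (0.95, 0.88):
    n_train, n_test = split_sizes(n_total, frac)
    print("test_size=%.2f -> train=%d, test=%d" % (frac, n_train, n_test))
```

With 5 folds, 78 training rows leave only about 15 or 16 rows per validation fold, compared with about 37 or 38 for 188 rows.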
So the question is - what should I change in my code to make it work
for test_size set to 0.95 as well?
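One thing that may be worth checking (an assumption on my part, not something confirmed here) is how many samples of each class remain in y_traina after such a small split, since a class with fewer members than folds cannot be represented in every stratified fold. A minimal sketch of that check follows. `class_counts_report` is a hypothetical helper, and the labels in the example are made up for illustration:

```python
from collections import Counter

def class_counts_report(labels, n_folds):
    """Count samples per class and flag classes rarer than n_folds."""
    counts = Counter(labels)
    # A class with fewer members than folds cannot appear in every fold.
    scarce = sorted(c for c, n in counts.items() if n < n_folds)
    return dict(counts), scarce

# Made-up labels for illustration only; substitute list(y_traina).
example_labels = [0] * 75 + [1] * 3
counts, scarce = class_counts_report(example_labels, n_folds=5)
print("per-class counts: %s" % counts)
print("classes with fewer than 5 samples: %s" % scarce)
```

If the list of scarce classes is empty for the real y_traina, class imbalance is probably not the cause and the problem lies elsewhere, for example inside custom_forest.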
Kind regards,
Michal Nowotka