On Mon, Feb 27, 2017 at 10:13:04PM +0000, Ludovico Coletta wrote:
The data is stored in a numpy array (shape: 68, 24). We are using scikit-learn 0.18.1.
I saw that I wrote something wrong in my previous email. Your solution is indeed correct if we let scikit-learn decide how to manage the inner loop. This is what we did at the beginning. By doing so, we noticed that the classifier's performance decreased (in comparison to a non-optimised classifier).
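
For reference, the setup described above (scikit-learn managing the inner loop, compared against a non-optimised classifier) might look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the data are random noise with the same shape (68, 24), the labels are balanced and binary, the classifier is a linear SVC, and the grid over C and the fold counts are arbitrary. It is not the code from the thread.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Placeholder data with the shape mentioned in the thread (68 samples, 24 features).
# Both the features and the labels are synthetic assumptions.
rng = np.random.RandomState(0)
X = rng.randn(68, 24)
y = np.tile([0, 1], 34)

# Inner loop: used by GridSearchCV to pick the hyperparameter.
inner_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
# Outer loop: used to estimate the performance of the whole procedure.
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

# Hypothetical grid; the original parameter grid is not given in the thread.
param_grid = {"C": [0.01, 0.1, 1, 10]}
tuned = GridSearchCV(SVC(kernel="linear"), param_grid, cv=inner_cv)

# Nested cross-validation: the grid search is refit inside each outer fold.
nested_scores = cross_val_score(tuned, X, y, cv=outer_cv)

# Baseline: the same classifier with default parameters, no tuning.
default_scores = cross_val_score(SVC(kernel="linear"), X, y, cv=outer_cv)

print("tuned:   %.3f +/- %.3f" % (nested_scores.mean(), nested_scores.std()))
print("default: %.3f +/- %.3f" % (default_scores.mean(), default_scores.std()))
```

With so few samples, the spread across outer folds is usually large relative to any difference between the two means, so it is worth looking at the standard deviations before concluding that tuning hurts.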
With 68 samples, it is not that surprising that model selection with cross-validation is not able to select a good model. We found the same problem in brain imaging data [1], and it's an intrinsic problem due to small sample sizes: cross-validation is just not very accurate in these settings.

Gaël

[1] https://arxiv.org/abs/1606.05201