[scikit-learn] Support Vector Machines: Sensitive to Single Datapoints?

Gael Varoquaux gael.varoquaux at normalesup.org
Tue Dec 19 16:35:26 EST 2017


With so few data points, there is a huge uncertainty in the estimation of
the prediction accuracy with cross-validation. This isn't a problem of
the method; it is a basic limitation of the small amount of data. I've
written a paper on this problem in the specific context of neuroimaging:
https://www.sciencedirect.com/science/article/pii/S1053811917305311
(preprint: https://hal.inria.fr/hal-01545002/).

I expect that what you are seeing is sampling noise: the result has
confidence intervals larger than 10%.

Gaël


On Tue, Dec 19, 2017 at 04:27:53PM -0500, Taylor, Johnmark wrote:
> Hello,

> I am a researcher in fMRI and am using SVMs to analyze brain data. I am doing
> decoding between two classes, each of which has 24 exemplars. I am
> comparing two different methods of cross-validation for my data: in one, I am
> training on 23 exemplars from each class, and testing on the remaining exemplar
> from each class, and in the other, I am training on 22 exemplars from each
> class, and testing on the remaining two from each class (in case it matters,
> the data is structured into different neuroimaging "runs", with each "run"
> containing several "blocks"; the first cross-validation method is leaving out
> one block at a time, the second is leaving out one run at a time). 

> Now, I would've thought that these two CV methods would be very similar, since
> the vast majority of the training data is the same; the only difference is in
> adding two additional points. However, they are yielding very different
> results: training on 23 per class is yielding 60% decoding accuracy (averaged
> across several subjects, and statistically significantly greater than chance),
> training on 22 per class is yielding chance (50%) decoding. Leaving aside the
> particulars of fMRI in this case: is it unusual for single points (amounting to
> less than 5% of the data) to have such a big influence on SVM decoding? I am
> using a cost parameter of C=1. I must say it is counterintuitive to me that
> just a couple points out of two dozen could make such a big difference.

> Thank you very much, and cheers,

> JohnMark



-- 
    Gael Varoquaux
    Senior Researcher, INRIA Parietal
    NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
    Phone:  ++ 33-1-69-08-79-68
    http://gael-varoquaux.info            http://twitter.com/GaelVaroquaux

