[scikit-learn] Decoding Differences Between SKL SVM and Matlab Libsvm Even When Parameters the Same

David Nicholson nicholdav at gmail.com
Wed Jun 22 14:23:26 EDT 2016


Did you try using the Python API to libsvm directly instead of going
through SKL? I'm guessing you have it on your computer since you have the
Matlab API. That would at least let you test whether the difference comes
from the fake data or from SKL.
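
Whichever route you take, it helps to compare the two sets of predictions
directly rather than only the accuracies. Below is a minimal sketch, using
only the standard library, of how that comparison might look. The function
names are made up for illustration, and the bounds assume binary
classification with both classifiers scored on the same test set.

```python
def agreement_rate(preds_a, preds_b):
    """Fraction of test items on which two classifiers give the same label."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    matches = sum(a == b for a, b in zip(preds_a, preds_b))
    return matches / len(preds_a)


def agreement_bounds(acc_a, acc_b):
    """Lowest and highest agreement possible for two binary classifiers
    with the given accuracies on the same test set.

    In the binary case the classifiers agree exactly when both are right
    or both are wrong, so agreement = 1 - acc_a - acc_b + 2 * both_right.
    """
    both_right_min = max(0.0, acc_a + acc_b - 1.0)
    both_right_max = min(acc_a, acc_b)
    base = 1.0 - acc_a - acc_b
    return base + 2 * both_right_min, base + 2 * both_right_max


if __name__ == "__main__":
    # Accuracies reported for the two simulations in the original message.
    print(agreement_bounds(0.40, 0.44))
    print(agreement_bounds(0.68, 0.67))
```

With accuracies of 40% and 44%, agreement can only fall between roughly 16%
and 96%, so the reported 18% is close to the lowest value that is possible
at all: the two classifiers were almost never right on the same item.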
Also, are you loading the fake data from a .mat file into Python (e.g. with
the SciPy 'loadmat' function), or are you generating it from a script? Maybe
some floating-point difference between Python and Matlab is giving you the
different results. This could happen if you generate the data separately
with one script in Python and another in Matlab, for example, along the
same lines as the random seed generators giving different results.
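
Here is a rough sketch of what I mean by loading the same data on the Python
side. It assumes SciPy and scikit-learn are installed. The file name and the
synthetic data are stand-ins (in practice the .mat file would be saved from
Matlab), and the variable names are borrowed from the libsvmtrain call
quoted below. The random_state argument only fixes the seed that SKL uses
for its probability estimates; it says nothing about what Matlab does.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat
from sklearn import svm

# Stand-in for data exported from Matlab (hypothetical file and contents).
rng = np.random.RandomState(0)
vectors = rng.randn(60, 5)
labels = (vectors[:, 0] + 0.5 * rng.randn(60) > 0).astype(float)

path = os.path.join(tempfile.mkdtemp(), "fake_data.mat")
savemat(path, {"svm_training_vectors": vectors,
               "svm_training_labels": labels})

# Load the identical numbers back instead of regenerating them in Python.
mat = loadmat(path)
X = mat["svm_training_vectors"]
y = mat["svm_training_labels"].ravel()  # loadmat returns 2-D arrays

# The double-precision values survive the round trip unchanged.
data_identical = np.array_equal(X, vectors)


def fit_probs(seed):
    """Fit the SVC from the original message with a fixed seed."""
    clf = svm.SVC(kernel="linear", C=1, probability=True, random_state=seed)
    clf.fit(X, y)
    return clf.predict_proba(X)


# Two fits with the same seed should give matching probability estimates.
same_seed_match = np.allclose(fit_probs(0), fit_probs(0))
```

If both implementations read the same .mat file and the results still
differ, the data generation can be ruled out as the cause.
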
On Jun 22, 2016 1:27 PM, "Michael Bommarito" <michael at bommaritollc.com>
wrote:

> Did you fix the random seeds across implementations as well?  Differences
> in seeds or generators might explain this.
>
> Thanks,
> Michael J. Bommarito II, CEO
> Bommarito Consulting, LLC
> *Web:* http://www.bommaritollc.com
> *Mobile:* +1 (646) 450-3387
>
> On Wed, Jun 22, 2016 at 1:15 PM, Taylor, Johnmark <
> johnmarktaylor at g.harvard.edu> wrote:
>
>> Hello,
>>
>> I am moving much of my neuroimaging coding over to Python from Matlab and
>> so I am switching from using libsvm in Matlab to using Scikit-learn SVM in
>> Python. Just to make sure I am not changing anything substantive about my
>> analyses, I am experimenting with the two implementations and trying to see
>> whether I can get them to yield identical results.
>>
>> In Python I am using:
>>
>> clf = svm.SVC(kernel='linear',C=1,probability=True)
>>
>> In Matlab (libsvm) I am using:
>>
>> clf = libsvmtrain(svm_training_labels,svm_training_vectors,['-t 0 -b 1 -c 1'])
>>
>> When I run the SVM in these two different ways on simulated data, I
>> get subtly different results, even though I have fixed all of the
>> parameters of the SVMs to be the same using input arguments (linear
>> classifier, C=1, use probability estimates), and even though all the other
>> default parameters seem to be the same across these functions (tolerance =
>> .001, both using shrinking heuristics by default).
>>
>> To give more details regarding the simulations:
>>
>> One simulation I ran was designed to be absurdly difficult--it yielded
>> 40% accuracy for Matlab libsvm, and 44% accuracy for scikit-learn svm
>> (binary classification, chance = 50%). In this simulation, the two SVMs
>> agreed in their predictions only 18% of the time (in other words, they were
>> both not only guessing below chance, but they nearly always gave opposite
>> guesses compared to each other).
>>
>> The other simulation was easier, yielding 68% accuracy for Matlab libsvm,
>> and 67% accuracy for scikit-learn SVM. In this simulation, the two SVMs
>> agreed in their predictions 97% of the time. So even though they often got
>> it wrong, they tended to make the same wrong guesses.
>>
>> Any idea of what could possibly be leading to differences in the results?
>> My understanding is that SKL uses libsvm under the hood, so it's been
>> confusing why the decoders are behaving differently. Both analyses are
>> being run on the same computer (Linux OS).
>>
>> Thank you very much,
>>
>> JohnMark Taylor
>>
>> PhD Student, Harvard Vision Sciences Lab
>>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn