[scikit-learn] OneClassSvm | Different results on different runs

Thu Aug 3 07:54:37 EDT 2017

@albertcthomas isn't there some randomness in SMO which could influence the
result if the tolerance parameter is too large?
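
One way to check this empirically is to fit the estimator twice on identical data at several tolerances and compare the predictions. The sketch below is only an illustration under my own assumptions: the data is synthetic, the `fit_predict` helper is hypothetical, and the `gamma`, `nu` and `tol` values are arbitrary rather than taken from Abhishek's setup.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Synthetic data, for illustration only (not the original poster's data).
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 2))


def fit_predict(X, tol):
    """Fit a fresh OneClassSVM and return its predictions on X.

    Hypothetical helper: no random_state is passed, so any run-to-run
    difference would have to come from the solver itself.
    """
    model = OneClassSVM(kernel="rbf", gamma=0.1, nu=0.1, tol=tol)
    return model.fit(X).predict(X)


# Compare two independent fits at a loose, a default-like and a tight tolerance.
for tol in (1e-1, 1e-3, 1e-6):
    first = fit_predict(X, tol)
    second = fit_predict(X, tol)
    print("tol=%g identical predictions: %s" % (tol, np.array_equal(first, second)))
```

If the two fits agree at every tolerance, the variation in accuracy is more likely to come from elsewhere in the pipeline, for example a train/test split that is not seeded.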

On Aug 3, 2017 1:28 PM, "Albert Thomas" <albertthomas88 at gmail.com> wrote:

> Hi Abhishek,
>
> Could you provide a small code snippet? I don't think the random_state
> parameter should influence the result of the OneClassSVM as there is no
> probability estimation for this estimator.
>
> Albert
>
> On Thu, Aug 3, 2017 at 12:41 PM Jaques Grobler <jaquesgrobler at gmail.com>
> wrote:
>
>> Hi,
>>
>> The random_state parameter seeds the pseudo-random number generator that
>> is used when shuffling your data for probability estimation. From the docs:
>>
>> The seed of the pseudo random number generator to use when shuffling the
>> data for probability estimation.
>> A seed can be provided to control the shuffling for reproducible behavior.
>>
>> Also, from the SVM docs
>> <http://scikit-learn.org/stable/modules/svm.html#svm-outlier-detection>
>>
>> "The underlying LinearSVC
>> <http://scikit-learn.org/stable/modules/generated/sklearn.svm.LinearSVC.html#sklearn.svm.LinearSVC>
>> implementation uses a random number generator to select features when
>> fitting the model. It is thus not uncommon to have slightly different
>> results for the same input data. If that happens, try with a smaller *tol*
>> parameter."
>>
>>
>> Hope that helps
>>
>> 2017-08-03 12:15 GMT+02:00 Abhishek Raj via scikit-learn <
>> scikit-learn at python.org>:
>>
>>> Hi,
>>>
>>> I am using a one-class SVM to develop an anomaly detection model. I
>>> observed that different training runs on the same data set produce
>>> different accuracies. One run gives an accuracy as high as 98% and another
>>> run on the same data brings it down to 93%. Googling a little, I found
>>> that this happens because of the random_state
>>> <http://scikit-learn.org/stable/modules/generated/sklearn.utils.check_random_state.html> parameter,
>>> but I am not clear on the details.
>>>
>>> Can anyone explain exactly how the parameter affects my training
>>> and how I can find the value that gives the model with the best accuracy?
>>>
>>> Thanks,
>>> Abhishek
>>>
>>> _______________________________________________
>>> scikit-learn mailing list
>>> scikit-learn at python.org
>>> https://mail.python.org/mailman/listinfo/scikit-learn
>>>
>>>
>