[scikit-learn] RFE with logistic regression

Tue Jul 24 17:44:58 EDT 2018

Univariate screening is somewhat hackish too, but much more stable -- 
and cheap.
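
As a rough illustration of what I mean by univariate screening, here is a minimal sketch. It assumes a binary classification problem and uses the ANOVA F-score (`f_classif`) through `SelectKBest`, which is only one possible scoring choice. The synthetic data from `make_classification`, the helper name `screen` and `k=5` are placeholders for illustration, not taken from your setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data (assumption: binary classification).
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5,
    n_redundant=5, random_state=0,
)
X = StandardScaler().fit_transform(X)


def screen(X, y, k=5):
    """Return the indices of the k features with the highest ANOVA F-score."""
    selector = SelectKBest(score_func=f_classif, k=k).fit(X, y)
    return selector.get_support(indices=True)


# Each feature is scored on its own, with no iterative solver involved,
# so repeated runs on the same data select the same features.
first = screen(X, y)
second = screen(X, y)
print("selected features:", first)
print("identical across runs:", np.array_equal(first, second))
```

Because each feature is scored independently, there is no solver or random shuffling to make the selection vary between runs. The price is that redundant features are not detected.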
Best,

Bertrand

On 24/07/2018 23:33, Benoît Presles wrote:
> So you think that I cannot get reproducible and consistent results
> with this method?
> If you would avoid RFE, which method would you suggest to find the
> best features?
>
> Ben
>
>
> On 24/07/2018 at 21:34, Gael Varoquaux wrote:
>> On Tue, Jul 24, 2018 at 08:43:27PM +0200, Benoît Presles wrote:
>>> 3. With C=1, it seems that I have the same results at each run for all
>>> solvers (liblinear, sag and saga), however the ranking is not the same
>>> between the solvers.
>> Your problem is probably ill-conditioned, hence the specific weights on
>> the features are not stable. There isn't a good answer to ordering the
>> features: they are degenerate.
>>
>> In general, I would avoid RFE: it is a hack and can easily lead to
>> these problems.
>>
>> Gaël
>>
>>> Thanks for your help,
>>> Ben
>>
>>> PS1: I checked and n_iter_ seems to be always lower than max_iter.
>>> PS2: my data is scaled, I am using "StandardScaler".
>>
>>
>>> On 24/07/2018 at 20:33, Andreas Mueller wrote:
>>
>>>> On 07/24/2018 02:07 PM, Benoît Presles wrote:
>>>>> I did the same tests as before adding fit_intercept=False and:
>>>>> 1. I have got the same problem as before, i.e. when I execute the
>>>>> RFE multiple times I don't get the same ranking each time.
>>>>> 2. When I change the solver to 'sag'
>>>>> (classifier_RFE=LogisticRegression(C=1e9, verbose=1, max_iter=10000,
>>>>> fit_intercept=False, solver='sag')), it seems that I get the same
>>>>> ranking at each run. This is not the case with the 'saga' solver.
>>>>> The ranking is not the same between the solvers.
>>>>> 3. With C=1, it seems that I have the same results at each run for
>>>>> all solvers (liblinear, sag and saga), however the ranking is not
>>>>> the same between the solvers.
>>
>>>>> How can I get reproducible and consistent results?
>>>> Did you scale your data? If not, saga and sag will basically fail.
>>>> _______________________________________________
>>>> scikit-learn mailing list
>>>> scikit-learn at python.org
>>>> https://mail.python.org/mailman/listinfo/scikit-learn