[scikit-learn] logistic regression results are not stable between solvers
Benoît Presles
benoit.presles at u-bourgogne.fr
Wed Jan 8 14:45:59 EST 2020
Dear sklearn users,
I still have some issues concerning logistic regression.
I compared sklearn with three different solvers (lbfgs, saga,
liblinear) and statsmodels on the same simulated data.
When everything goes well, I get the same results from lbfgs, saga,
liblinear and statsmodels. When everything goes wrong, all the results
differ.
In fact, when everything goes wrong, statsmodels gives me a convergence
warning (Warning: Maximum number of iterations has been exceeded.
Current function value: inf Iterations: 20000) plus an error
(numpy.linalg.LinAlgError: Singular matrix).
Why doesn't sklearn tell me anything? How can I know that I have
convergence issues with sklearn?
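Is something like the following the right way to check? (A minimal
sketch on toy data; I understand lbfgs raises a ConvergenceWarning when
it stops at max_iter, but saga apparently may not, so comparing n_iter_
to max_iter seems the more robust check.)

import warnings
import numpy as np
from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=20,
                           n_informative=10, random_state=2)

# Check 1: escalate ConvergenceWarning to an error so it cannot be missed.
clf = LogisticRegression(solver='lbfgs', max_iter=5)
with warnings.catch_warnings():
    warnings.simplefilter("error", category=ConvergenceWarning)
    try:
        clf.fit(X, y)
    except ConvergenceWarning:
        print("lbfgs stopped before converging")

# Check 2: a solver that used its whole iteration budget most likely
# never reached the requested tolerance, warning or not.
clf = LogisticRegression(solver='saga', max_iter=100, tol=1e-6)
clf.fit(X, y)
if np.any(clf.n_iter_ >= clf.max_iter):
    print("saga used all of max_iter; treat coef_ with care")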
Thanks for your help,
Best regards,
Ben
--------------------------------------------
Here is the code I used to generate the synthetic data and fit the models:
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
import statsmodels.api as sm
#
RANDOM_SEED = 2
#
# Simulated binary problem: 200 samples, 20 features, 10 informative.
X_sim, y_sim = make_classification(n_samples=200,
                                   n_features=20,
                                   n_informative=10,
                                   n_redundant=0,
                                   n_repeated=0,
                                   n_classes=2,
                                   n_clusters_per_class=1,
                                   random_state=RANDOM_SEED,
                                   shuffle=False)
#
sss = StratifiedShuffleSplit(n_splits=10, test_size=0.2,
                             random_state=RANDOM_SEED)
for train_index_split, test_index_split in sss.split(X_sim, y_sim):
    X_split_train = X_sim[train_index_split]
    X_split_test = X_sim[test_index_split]
    y_split_train = y_sim[train_index_split]
    y_split_test = y_sim[test_index_split]
    # Standardize using statistics from the training fold only.
    ss = StandardScaler()
    X_split_train = ss.fit_transform(X_split_train)
    X_split_test = ss.transform(X_split_test)
    #
    # Unpenalized fit with lbfgs.
    classifier_lbfgs = LogisticRegression(fit_intercept=True,
                                          max_iter=20000000, verbose=0,
                                          random_state=RANDOM_SEED, C=1e9,
                                          solver='lbfgs', penalty='none',
                                          tol=1e-6)
    classifier_lbfgs.fit(X_split_train, y_split_train)
    print('classifier lbfgs iter:', classifier_lbfgs.n_iter_)
    print(classifier_lbfgs.intercept_)
    print(classifier_lbfgs.coef_)
    #
    # Same model with saga.
    classifier_saga = LogisticRegression(fit_intercept=True,
                                         max_iter=20000000, verbose=0,
                                         random_state=RANDOM_SEED, C=1e9,
                                         solver='saga', penalty='none',
                                         tol=1e-6)
    classifier_saga.fit(X_split_train, y_split_train)
    print('classifier saga iter:', classifier_saga.n_iter_)
    print(classifier_saga.intercept_)
    print(classifier_saga.coef_)
    #
    # liblinear does not support penalty='none'; l2 with a huge C
    # approximates an unpenalized fit.
    classifier_liblinear = LogisticRegression(fit_intercept=True,
                                              max_iter=20000000, verbose=0,
                                              random_state=RANDOM_SEED,
                                              C=1e9, solver='liblinear',
                                              penalty='l2', tol=1e-6)
    classifier_liblinear.fit(X_split_train, y_split_train)
    print('classifier liblinear iter:', classifier_liblinear.n_iter_)
    print(classifier_liblinear.intercept_)
    print(classifier_liblinear.coef_)
    # statsmodels (unpenalized MLE)
    logit = sm.Logit(y_split_train, sm.tools.add_constant(X_split_train))
    logit_res = logit.fit(maxiter=20000)
    print("Coef statsmodels")
    print(logit_res.params)
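After the fits, I also compare the coefficients directly (a sketch; the
atol value is arbitrary, and logit_res.params[0] is the intercept
because add_constant prepends the constant column):

import numpy as np
print("lbfgs vs statsmodels agree:",
      np.allclose(classifier_lbfgs.coef_.ravel(),
                  logit_res.params[1:], atol=1e-3))
print("lbfgs vs saga agree:",
      np.allclose(classifier_lbfgs.coef_,
                  classifier_saga.coef_, atol=1e-3))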
On 11/10/2019 15:42, Andreas Mueller wrote:
>
>
> On 10/10/19 1:14 PM, Benoît Presles wrote:
>>
>> Thanks for your answers.
>>
>> On my real data, I do not have that many samples. I have a bit more
>> than 200 samples in total, and I would also like to get some results
>> with unpenalized logistic regression.
>> What do you suggest? Should I switch to the lbfgs solver?
> Yes.
>> Can I be sure that with this solver I will not have any convergence
>> issues and will always get the right result? Indeed, I did not get any
>> convergence warning with saga, so I thought everything was fine. I
>> noticed the issues only when I decided to test several solvers. Without
>> comparing the results across solvers, how can I be sure that the
>> optimisation goes well? Shouldn't scikit-learn warn the user somehow if
>> it does not?
> We should attempt to warn in the SAGA solver if it doesn't converge.
> That it doesn't raise a convergence warning should probably be
> considered a bug.
> It uses the maximum weight change as a stopping criterion right now.
> We could probably compute the dual objective once at the end to see
> whether we converged, right? Or is that not possible with SAGA? If
> not, we might want to caution that no convergence warning will be raised.
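> A cheap external check, independent of the solver internals: with
> penalty='none' the objective is just the training log-loss, so one
> can compare the final objective across solvers and see which fit got
> closer to the optimum (a sketch reusing the fitted estimators from
> the snippet above):
>
> from sklearn.metrics import log_loss
> for name, clf in [('lbfgs', classifier_lbfgs),
>                   ('saga', classifier_saga)]:
>     print(name, 'train log-loss:',
>           log_loss(y_split_train, clf.predict_proba(X_split_train)))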
>
>>
>> Lastly, I was using saga because I also wanted to do some feature
>> selection by using the l1 penalty, which is not supported by lbfgs...
> You can use liblinear then.
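> For example (a sketch; the C value is arbitrary and controls how many
> coefficients are driven to zero):
>
> from sklearn.linear_model import LogisticRegression
> clf = LogisticRegression(penalty='l1', solver='liblinear', C=0.1)
> clf.fit(X_split_train, y_split_train)
> selected = clf.coef_.ravel() != 0  # features kept by the l1 penalty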
>
>
>>
>> Best regards,
>> Ben
>>
>>
>> On 09/10/2019 at 23:39, Guillaume Lemaître wrote:
>>> Oops, I did not see Roman's answer. Sorry about that. It comes
>>> back to the same conclusion :)
>>>
>>> On Wed, 9 Oct 2019 at 23:37, Guillaume Lemaître
>>> <g.lemaitre58 at gmail.com> wrote:
>>>
>>> Hmm, actually increasing to 10000 samples solves the convergence
>>> issue. SAGA is most probably not designed to work with such a
>>> small sample size.
>>>
>>> On Wed, 9 Oct 2019 at 23:36, Guillaume Lemaître
>>> <g.lemaitre58 at gmail.com> wrote:
>>>
>>> I slightly changed the benchmark so that it uses a pipeline, and
>>> plotted the coefficients:
>>>
>>> https://gist.github.com/glemaitre/8fcc24bdfc7dc38ca0c09c56e26b9386
>>>
>>> I see only one of the 10 splits where SAGA is not converging;
>>> otherwise the coefficients look very close (I don't attach the
>>> figure here, but they can be plotted using the snippet).
>>> So apart from this second split, the other differences seem
>>> to be numerical instability.
>>>
>>> Where I do have some concern is the convergence rate of SAGA,
>>> but I have no intuition as to whether this is normal or not.
>>>
>>> On Wed, 9 Oct 2019 at 23:22, Roman Yurchak
>>> <rth.yurchak at gmail.com> wrote:
>>>
>>> Ben,
>>>
>>> I can confirm your results with penalty='none' and C=1e9. In both
>>> cases, you are running a mostly unpenalized logistic regression.
>>> Usually that's less numerically stable than with a small
>>> regularization, depending on the data collinearity.
>>>
>>> Running that same code with
>>> - a larger penalty (smaller C values)
>>> - or a larger number of samples
>>> yields the same coefficients for me (up to some tolerance).
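>>> For instance (an illustrative value, not a tuned one):
>>>
>>> clf = LogisticRegression(C=1.0, penalty='l2', solver='saga',
>>>                          max_iter=10000, tol=1e-6)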
>>>
>>> You can also see that SAGA's convergence is poor from the fact
>>> that it needs 196000 epochs/iterations to converge.
>>>
>>> Actually, I have often seen convergence issues with SAG on small
>>> datasets (in unit tests); I'm not fully sure why.
>>>
>>> --
>>> Roman
>>>
>>> On 09/10/2019 22:10, serafim loukas wrote:
>>> > The predictions across solvers are exactly the same when I run
>>> > the code. I am using version 0.21.3. What is yours?
>>> >
>>> >
>>> > In [13]: import sklearn
>>> >
>>> > In [14]: sklearn.__version__
>>> > Out[14]: '0.21.3'
>>> >
>>> >
>>> > Serafeim
>>> >
>>> >
>>> >
>>> >> On 9 Oct 2019, at 21:44, Benoît Presles
>>> >> <benoit.presles at u-bourgogne.fr> wrote:
>>> >>
>>> >> (y_pred_lbfgs==y_pred_saga).all() == False
>>> >
>>> >
>>>
>>> --
>>> Guillaume Lemaitre
>>> Scikit-learn @ Inria Foundation
>>> https://glemaitre.github.io/
>>>
>>
>
>