[scikit-learn] Difference in normalization between Lasso and LogisticRegression + L1

Stuart Reynolds stuart at stuartreynolds.net
Wed May 29 17:29:39 EDT 2019


I looked into this a while ago. There were differences in which algorithms
regularize the intercept and which do not (I believe liblinear does,
lbfgs does not).
All of the algorithms disagreed with logistic regression in scipy.
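
To make the intercept point concrete, here is a minimal sketch, not a definitive test. It assumes a toy dataset (pure-noise features, 90% positive labels) and a very small C, both chosen only for illustration. With an unpenalized intercept the fit should recover roughly the base-rate log-odds, log(9); with a penalized intercept it should be pulled toward zero. The variable names are made up for this example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))        # features carry no signal (assumption for the demo)
y = np.array([1] * 180 + [0] * 20)   # 90% positives: base-rate log-odds is log(9)

C = 1e-4  # very strong regularization, so the coefficients are driven to ~0

# lbfgs: the intercept is left out of the penalty.
lbfgs = LogisticRegression(solver="lbfgs", C=C, tol=1e-8, max_iter=1000).fit(X, y)

# liblinear: the intercept is the weight of an appended constant feature
# (value = intercept_scaling), and that weight is penalized like any other.
liblinear = LogisticRegression(solver="liblinear", C=C, tol=1e-8).fit(X, y)

# A larger intercept_scaling weakens the penalty on the intercept.
liblinear_scaled = LogisticRegression(
    solver="liblinear", C=C, tol=1e-8, intercept_scaling=100.0
).fit(X, y)

print("log(9)                        :", np.log(9))
print("lbfgs intercept               :", lbfgs.intercept_[0])
print("liblinear intercept           :", liblinear.intercept_[0])
print("liblinear, intercept_scaling=100:", liblinear_scaled.intercept_[0])
```

If the intercept penalty is the only concern, raising intercept_scaling (as in the last fit) is one way to bring liblinear closer to the other solvers without changing C.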

- Stuart

On Wed, May 29, 2019 at 10:50 AM Andreas Mueller <t3kcit at gmail.com> wrote:

> That is not ideal indeed.
> I think we just went with what liblinear did, and when saga was introduced
> we kept that behavior.
> It should probably be scaled as in Lasso, I would imagine?
>
>
> On 5/29/19 1:42 PM, Michael Eickenberg wrote:
>
> Hi Jesse,
>
> I think there was an effort to compare normalization methods on the data
> attachment term between Lasso and Ridge regression back in 2012/13, but
> this might not have been finished or extended to Logistic Regression.
>
> If it is not documented well, it could definitely benefit from a
> documentation update.
>
> As for changing it to a more consistent state, that would require adding a
> keyword argument pertaining to this functionality and, after discussion,
> possibly changing the default value after some deprecation cycles (though
> this seems like a dangerous one to change at all imho).
>
> Michael
>
>
> On Wed, May 29, 2019 at 10:38 AM Jesse Livezey <jesse.livezey at gmail.com>
> wrote:
>
>> Hi everyone,
>>
>> I noticed recently that in the Lasso implementation (and docs), the MSE
>> term is normalized by the number of samples (a factor of
>> 1 / (2 * n_samples)):
>>
>> https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html
>>
>> but for LogisticRegression + L1, the logloss does not seem to be
>> normalized by the number of samples. One consequence is that the strength
>> of the regularization depends on the number of samples explicitly. For
>> instance, in Lasso, if you tile a dataset N times, you will learn the same
>> coef, but in LogisticRegression, you will learn a different coef.
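
The tiling behaviour described above can be checked with a short sketch. It is only an illustration: the synthetic datasets, alpha=1.0, C=0.1, the number of tiles, and the liblinear solver are all assumptions made for the example. It fits each model on the original and on the tiled data, then refits LogisticRegression with C divided by the number of tiles, which should undo the dependence on sample count if the log-loss is a plain sum over samples.

```python
import numpy as np
from sklearn.datasets import make_classification, make_regression
from sklearn.linear_model import Lasso, LogisticRegression

N_TILES = 5  # how many times each dataset is repeated (arbitrary choice)


def tile(X, y, n):
    """Stack n copies of (X, y) on top of each other."""
    return np.tile(X, (n, 1)), np.tile(y, n)


def max_abs_diff(a, b):
    """Largest absolute difference between two coefficient arrays."""
    return float(np.max(np.abs(np.ravel(a) - np.ravel(b))))


# Lasso: the squared error is divided by 2 * n_samples, so tiling the data
# leaves the objective (and hence the solution) unchanged.
Xr, yr = make_regression(n_samples=100, n_features=10, n_informative=4,
                         noise=5.0, random_state=0)
Xr_t, yr_t = tile(Xr, yr, N_TILES)
lasso = Lasso(alpha=1.0, tol=1e-10, max_iter=100_000).fit(Xr, yr)
lasso_tiled = Lasso(alpha=1.0, tol=1e-10, max_iter=100_000).fit(Xr_t, yr_t)
lasso_gap = max_abs_diff(lasso.coef_, lasso_tiled.coef_)

# LogisticRegression + L1: the log-loss is summed over samples and weighted by C,
# so tiling the data acts like multiplying C by N_TILES.
Xc, yc = make_classification(n_samples=100, n_features=10, n_informative=4,
                             random_state=0)
Xc_t, yc_t = tile(Xc, yc, N_TILES)


def fit_logreg(X, y, C):
    return LogisticRegression(penalty="l1", solver="liblinear", C=C,
                              tol=1e-8, max_iter=10_000, random_state=0).fit(X, y)


base = fit_logreg(Xc, yc, C=0.1)
tiled_same_C = fit_logreg(Xc_t, yc_t, C=0.1)
tiled_rescaled_C = fit_logreg(Xc_t, yc_t, C=0.1 / N_TILES)

logreg_gap_same_C = max_abs_diff(base.coef_, tiled_same_C.coef_)
logreg_gap_rescaled_C = max_abs_diff(base.coef_, tiled_rescaled_C.coef_)

print("Lasso, tiled vs. original          :", lasso_gap)
print("LogReg, tiled vs. original (same C):", logreg_gap_same_C)
print("LogReg, tiled vs. original (C / N) :", logreg_gap_rescaled_C)
```

Under these assumptions, choosing C proportional to 1 / n_samples is one way to get Lasso-like behaviour from LogisticRegression without any change to the library.
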
>>
>> Is this the intended behavior of LogisticRegression? I was surprised by
>> this. Either way, it would be helpful to document this more clearly in the
>> Logistic Regression docs (I can make a PR.)
>>
>> https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
>>
>> Jesse
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
>