<div dir="ltr"><div>Hi everyone,</div><div><br></div><div>I recently noticed that in the Lasso implementation (and docs), the MSE term is normalized by the number of samples:</div><div><a href="https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html">https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html</a></div><div><br></div><div>For LogisticRegression with an L1 penalty, however, the log loss does not seem to be normalized by the number of samples. One consequence is that, for a fixed C, the effective regularization strength depends explicitly on the number of samples. For instance, if you tile a dataset N times, Lasso learns the same coefficients, but LogisticRegression learns different ones.</div><div><br></div><div>Is this the intended behavior of LogisticRegression? It surprised me. Either way, it would be helpful to document this more clearly in the LogisticRegression docs (I can open a PR).<br></div><div><a href="https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html">https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html</a></div><div><br></div><div>Jesse<br></div></div>
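<div dir="ltr"><div><br></div><div>P.S. A minimal sketch of the tiling experiment described above, in case it helps to reproduce it. The synthetic data, the values of alpha, C, and N, the tolerances, and the helper names fit_lasso and fit_logreg are all arbitrary choices for illustration; the liblinear solver is assumed because it supports the L1 penalty.</div></div>

```python
import numpy as np
from sklearn.linear_model import Lasso, LogisticRegression

rng = np.random.default_rng(0)

# Synthetic binary classification data (arbitrary, for illustration only).
X = rng.normal(size=(200, 5))
w_true = np.array([2.0, -1.0, 0.0, 0.0, 0.5])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ w_true))))

# Tile the dataset N times: same rows, repeated.
N = 5
X_tiled = np.tile(X, (N, 1))
y_tiled = np.tile(y, N)


def fit_lasso(X, y, alpha=0.05):
    """Hypothetical helper: fit Lasso with a tight tolerance, return coef_."""
    model = Lasso(alpha=alpha, tol=1e-10, max_iter=100_000)
    return model.fit(X, y.astype(float)).coef_


def fit_logreg(X, y, C=0.1):
    """Hypothetical helper: fit L1-penalized LogisticRegression, return coef_."""
    model = LogisticRegression(
        penalty="l1", C=C, solver="liblinear",
        tol=1e-8, max_iter=10_000, random_state=0,
    )
    return model.fit(X, y).coef_.ravel()


# Largest coefficient change between the original and the tiled fit.
lasso_gap = np.max(np.abs(fit_lasso(X, y) - fit_lasso(X_tiled, y_tiled)))
logreg_gap = np.max(np.abs(fit_logreg(X, y) - fit_logreg(X_tiled, y_tiled)))

# Dividing C by N on the tiled data should undo the effect of tiling.
logreg_rescaled_gap = np.max(
    np.abs(fit_logreg(X, y, C=0.1) - fit_logreg(X_tiled, y_tiled, C=0.1 / N))
)

print(f"Lasso, original vs tiled:               {lasso_gap:.2e}")
print(f"LogisticRegression, original vs tiled:  {logreg_gap:.2e}")
print(f"LogisticRegression, tiled with C / N:   {logreg_rescaled_gap:.2e}")
```

<div dir="ltr"><div>If the log loss is summed rather than averaged, tiling N times multiplies the loss term by N, so fitting the tiled data with C / N gives the same objective as the original fit, and the two should agree up to solver tolerance.</div></div>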