[scikit-learn] biased predictions in logistic regression
Rachel Melamed
melamed at uchicago.edu
Thu Dec 15 14:03:22 EST 2016
Thanks for the reply. The covariates ("X") are all dummy/categorical variables. So I guess no, nothing is normalized.
On Dec 15, 2016, at 1:54 PM, Alexey Dral <aadral at gmail.com> wrote:
Hi Rachel,
Do you have your data normalized?
2016-12-15 20:21 GMT+03:00 Rachel Melamed <melamed at uchicago.edu>:
Hi all,
Does anyone have any suggestions for this problem:
http://stackoverflow.com/questions/41125342/sklearn-logistic-regression-gives-biased-results
I am running around 1000 similar logistic regressions, with the same covariates but slightly different data and response variables. All of my response variables have sparse successes (usually p(success) < .05).
I noticed that with the regularized regression, the results are consistently biased toward predicting more "successes" than are observed in the training data. When I relax the regularization, this bias goes away. The bias observed is unacceptable for my use case, but the more-regularized model does seem a bit better.
Below, I plot the results for the 1000 different regressions for 2 different values of C: [results for the different regressions for 2 different values of C] <https://i.stack.imgur.com/1cbrC.png>
I looked at the parameter estimates for one of these regressions. In the plot below, each point is one parameter. It seems like the intercept (the point on the bottom left) is too high for the C=1 model. [parameter estimates for one regression] <https://i.stack.imgur.com/NTFOY.png>
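For anyone who wants to reproduce the effect, here is a minimal sketch on synthetic data. Everything in it is an assumption made for illustration rather than taken from the real data set: the sample size, the number of dummy covariates, the true coefficients, the choice of solver="liblinear", and the use of C=1e5 as the "relaxed" setting. It compares the mean predicted probability with the observed success rate on the training data. One possible explanation, not confirmed in this thread, is that the liblinear solver includes the intercept in the L2 penalty, which would pull a strongly negative intercept toward zero and inflate the predicted success rate.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the real data (all values here are assumptions):
# binary dummy covariates and a rare outcome, roughly p(success) < .05.
rng = np.random.RandomState(0)
n_samples, n_features = 5000, 20
X = rng.binomial(1, 0.2, size=(n_samples, n_features)).astype(float)
true_coef = rng.normal(0.0, 0.5, size=n_features)
logits = -3.5 + X @ true_coef
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logits)))


def calibration_gap(C):
    """Mean predicted P(success) minus the observed success rate (training data)."""
    model = LogisticRegression(C=C, solver="liblinear", max_iter=1000)
    model.fit(X, y)
    return model.predict_proba(X)[:, 1].mean() - y.mean()


gap_c1 = calibration_gap(1.0)       # regularized fit, as in the C=1 model
gap_relaxed = calibration_gap(1e5)  # nearly unregularized fit

print("observed success rate: %.4f" % y.mean())
print("gap with C=1:          %.6f" % gap_c1)
print("gap with C=1e5:        %.6f" % gap_relaxed)
```

A positive gap means the model predicts more successes than were observed. If the penalized intercept is indeed the cause, options worth trying would be a larger intercept_scaling or a solver that leaves the intercept unpenalized, but that would need to be checked against the real data.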
_______________________________________________
scikit-learn mailing list
scikit-learn at python.org
https://mail.python.org/mailman/listinfo/scikit-learn
--
Yours sincerely,
Alexey A. Dral