[scikit-learn] biased predictions in logistic regression
Alexey Dral
aadral at gmail.com
Thu Dec 15 14:16:15 EST 2016
Could you try normalizing the dataset after dummy-encoding the features and
see whether the behavior is still reproducible?
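
To make the comparison concrete, here is a minimal sketch of the check I have in mind. The data is synthetic (make_data is a hypothetical helper, not your dataset), and I am assuming the liblinear solver, which penalizes the intercept together with the other coefficients; your actual setup may differ. It fits the model with C=1 and C=1e6, with and without standardizing the dummy columns, and compares the mean predicted probability on the training data with the observed success rate.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler


def make_data(n=20000, n_features=20, seed=0):
    """Synthetic dummy covariates with a rare positive outcome (hypothetical data)."""
    rng = np.random.RandomState(seed)
    # Binary (dummy) columns, each active in roughly 10% of rows.
    X = (rng.rand(n, n_features) < 0.1).astype(float)
    coef = rng.normal(scale=0.5, size=n_features)
    # A strongly negative intercept keeps p(success) small.
    logits = -4.0 + X @ coef
    y = (rng.rand(n) < 1.0 / (1.0 + np.exp(-logits))).astype(int)
    return X, y


def predicted_vs_observed(X, y, C, scale):
    """Return (mean predicted probability, observed success rate) on the training data."""
    if scale:
        # Standardize each dummy column to zero mean and unit variance.
        X = StandardScaler().fit_transform(X)
    model = LogisticRegression(C=C, solver="liblinear")
    model.fit(X, y)
    return model.predict_proba(X)[:, 1].mean(), y.mean()


X, y = make_data()
results = {}
for C in (1.0, 1e6):
    for scale in (False, True):
        pred, obs = predicted_vs_observed(X, y, C, scale)
        results[(C, scale)] = (pred, obs)
        print(f"C={C:g} scaled={scale}: predicted={pred:.4f} observed={obs:.4f}")
```

If the gap between predicted and observed rates shrinks once the columns are standardized, scaling is part of the story. If it does not, it may be worth trying a larger intercept_scaling, or a solver such as lbfgs that leaves the intercept unpenalized.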
2016-12-15 22:03 GMT+03:00 Rachel Melamed <melamed at uchicago.edu>:
> Thanks for the reply. The covariates ("X") are all dummy/categorical
> variables. So I guess no, nothing is normalized.
>
> On Dec 15, 2016, at 1:54 PM, Alexey Dral <aadral at gmail.com> wrote:
>
> Hi Rachel,
>
> Do you have your data normalized?
>
> 2016-12-15 20:21 GMT+03:00 Rachel Melamed <melamed at uchicago.edu>:
>
>> Hi all,
>> Does anyone have any suggestions for this problem:
>> http://stackoverflow.com/questions/41125342/sklearn-logistic-regression-gives-biased-results
>>
>> I am running around 1000 similar logistic regressions, with the same
>> covariates but slightly different data and response variables. All of my
>> response variables have sparse successes (p(success) < .05 usually).
>>
>> I noticed that with the regularized regression, the results are
>> consistently biased to predict more "successes" than is observed in the
>> training data. When I relax the regularization, this bias goes away. The
>> bias observed is unacceptable for my use case, but the more-regularized
>> model does seem a bit better.
>>
>> Below, I plot the results for the 1000 different regressions for 2
>> different values of C: [image: results for the different regressions for
>> 2 different values of C] <https://i.stack.imgur.com/1cbrC.png>
>>
>> I looked at the parameter estimates for one of these regressions. Below,
>> each point is one parameter. It seems like the intercept (the point on the
>> bottom left) is too high for the C=1 model. [image: enter image
>> description here] <https://i.stack.imgur.com/NTFOY.png>
>>
>>
>>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
>>
>
>
> --
> Yours sincerely,
> Alexey A. Dral
--
Yours sincerely,
Alexey A. Dral