[scikit-learn] Question about Python's L2-Regularized Logistic Regression

Sebastian Raschka se.raschka at gmail.com
Thu Sep 29 18:20:39 EDT 2016


Hi, Kristen,
there shouldn’t be any internal feature selection going on behind the scenes. You may want to compare the weight coefficients of your regularized and unregularized models: if they are exactly the same, that would indicate that something funny is going on. Otherwise, it could be that the strongly regularized and the nearly unregularized models are simply equally good (or equally bad) on that dataset. By the way, what value do you get for the ROC AUC?

You can access the weight coefficients via the “coef_” attribute after fitting. I.e.,

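from sklearn.linear_model import LogisticRegression

lr = LogisticRegression(...)  # your own settings, e.g. penalty, C, solver
lr.fit(X_train, y_train)
lr.coef_  # fitted weights; shape (1, n_features) for a binary problem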
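To make the comparison concrete, here is a minimal sketch. It does not use Kristen's data: the synthetic dataset from make_classification (n=100, p=2000), the train/test split, and the two C values are assumptions chosen only for illustration. It fits a strongly regularized and a nearly unregularized model, then compares their coefficients and test ROC AUC.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the data in the thread (assumption): n=100, p=2000.
X, y = make_classification(n_samples=100, n_features=2000,
                           n_informative=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)


def fit_and_score(C):
    """Fit an L2-penalized model with liblinear; return (weights, test AUC)."""
    lr = LogisticRegression(penalty="l2", C=C, solver="liblinear")
    lr.fit(X_train, y_train)
    auc = roc_auc_score(y_test, lr.predict_proba(X_test)[:, 1])
    return lr.coef_.ravel(), auc


# Small C = strong regularization; large C = almost no regularization.
w_reg, auc_reg = fit_and_score(C=0.01)
w_unreg, auc_unreg = fit_and_score(C=1e4)

print("identical coefficients:", np.allclose(w_reg, w_unreg))
print("L2 norm of weights (C=0.01, C=1e4):",
      np.linalg.norm(w_reg), np.linalg.norm(w_unreg))
print("test ROC AUC (C=0.01, C=1e4):", auc_reg, auc_unreg)
```

If the two coefficient vectors differ (typically with a much larger norm for high C) while the AUC values stay close, the regularization is being applied and the two models simply happen to generalize similarly on that dataset.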

> Should I be coding my predictors as +1/-1? 

0 and 1 should be just fine and are the expected default.

Best,
Sebastian

> On Sep 29, 2016, at 6:09 PM, Kristen M. Altenburger <kaltenb at stanford.edu> wrote:
> 
> Hi All,
> 
> I am trying to understand Python’s code [function ‘_fit_liblinear’ in https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/svm/base.py] for fitting an L2-regularized logistic regression with the ‘liblinear’ solver. More specifically, in my [approximately class-balanced] dataset the # of predictors [p=2000] >> # of observations [n=100]. I am therefore confused about why, when I increase C [and thus decrease the regularization strength] in fitting the logistic regression model to my training data, I still obtain high AUC results when the model is applied to my testing data. Is Python internally doing feature selection when fitting this model for high C values? Or why do the almost unregularized model [high C values] and the regularized model [C selected by cross-validation] both give similar AUC and accuracy results on the testing data? Should I be coding my predictors as +1/-1?
> 
> Any pointers/explanations would be much appreciated!
> 
> Thanks,
> Kristen


