[scikit-learn] LogisticRegression coef_ greater than n_features?

pisymbol pisymbol at gmail.com
Tue Jan 8 00:02:17 EST 2019

On Mon, Jan 7, 2019 at 11:50 PM pisymbol <pisymbol at gmail.com> wrote:

> According to the doc (0.20.2) the coef_ variable is supposed to have shape
> (1, n_features) for binary classification. Well, I created a Pipeline and
> performed a GridSearchCV to create a LogisticRegression model that does
> fairly well. However, when I went to rank feature importance I noticed that
> coef_ for my best_estimator_ has 24 entries while my training data has
> 22 features.
> What am I missing? How could coef_ have more entries than n_features?
Just a follow-up: I am using a OneHotEncoder to encode two categoricals as
part of my pipeline (I am also using an imputer and a standard scaler, but I
don't see how those could add features).

Could my pipeline actually add two more features during fitting?
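Yes, that would explain it: OneHotEncoder replaces each categorical column with one column per category, so the fitted model sees more features than the raw training data has. A minimal sketch (with made-up toy columns, not your actual data) showing how two categoricals can inflate coef_ beyond the original column count:

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

rng = np.random.default_rng(0)
n = 120

# 5 raw input columns: 3 numeric + 2 categorical (3 and 2 categories).
X = pd.DataFrame({
    "f1": rng.normal(size=n),
    "f2": rng.normal(size=n),
    "f3": rng.normal(size=n),
    "color": ["red", "green", "blue"] * (n // 3),
    "size": ["small", "large"] * (n // 2),
})
y = rng.integers(0, 2, size=n)

pre = ColumnTransformer([
    ("num", StandardScaler(), ["f1", "f2", "f3"]),
    ("cat", OneHotEncoder(), ["color", "size"]),
])
clf = Pipeline([("pre", pre), ("lr", LogisticRegression())]).fit(X, y)

# 5 input columns, but the encoder expands them to 3 + 3 + 2 = 8 features,
# so coef_ has shape (1, 8) for this binary problem.
print(X.shape[1], clf.named_steps["lr"].coef_.shape)
```

In recent scikit-learn versions you can call `get_feature_names_out()` on the fitted preprocessor to see exactly which coefficient corresponds to which expanded column; that makes the feature-importance ranking unambiguous.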

