<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr">On Mon, Jan 7, 2019 at 11:50 PM pisymbol &lt;<a href="mailto:pisymbol@gmail.com">pisymbol@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>According to the docs (0.20.2), the coef_ attribute is supposed to have shape (1, n_features) for binary classification. I created a Pipeline and ran a GridSearchCV to fit a LogisticRegression model that does fairly well. However, when I went to rank feature importance, I noticed that coef_ for my best_estimator_ has 24 entries while my training data has 22 features.</div><div><br></div><div>What am I missing? How could coef_ have more entries than n_features?<br></div><br></div></blockquote><div><br></div><div>Just a follow-up: I am using a OneHotEncoder to encode two categoricals as part of my pipeline. (I am also using an imputer and a standard scaler, but I don't see how those could add features.)</div><div><br></div><div>Could my pipeline actually add two more features during fitting?<br></div><div><br></div><div>-aps<br></div></div></div>
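<div>One way to test the hypothesis is to compare the number of input columns with the width of coef_ on a toy pipeline. The following is only a minimal sketch on made-up data, not my actual pipeline: the column layout (2 numeric columns plus 2 two-level categoricals), the step names "pre" and "clf", and the use of ColumnTransformer are all assumptions for illustration.</div>

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

rng = np.random.RandomState(0)
n_samples = 40
idx = np.arange(n_samples)

# Hypothetical data: 2 numeric columns and 2 categorical columns,
# each categorical taking exactly 2 distinct values (4 input columns).
numeric = rng.normal(size=(n_samples, 2))
cat_a = idx % 2
cat_b = (idx // 2) % 2
X = np.column_stack([numeric, cat_a, cat_b])
y = (idx % 3 == 0).astype(int)  # arbitrary binary target

# Scale the numeric columns; one-hot encode the categorical ones.
pre = ColumnTransformer([
    ("num", StandardScaler(), [0, 1]),
    ("cat", OneHotEncoder(), [2, 3]),
])
pipe = Pipeline([("pre", pre), ("clf", LogisticRegression())])
pipe.fit(X, y)

n_in = X.shape[1]                                # columns before the pipeline
n_out = pipe.named_steps["clf"].coef_.shape[1]   # columns the classifier saw

# Each 2-level categorical becomes 2 indicator columns: 2 + 2 + 2 = 6.
print(n_in, n_out)
```

<div>In this toy case the classifier ends up with two more coefficients than there are input columns, because each one-hot encoded categorical is replaced by one column per category. With a GridSearchCV, the same check would presumably go through best_estimator_.named_steps.</div>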