<html>

<head>

<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">

</head>

<body>

<div class="" style="word-wrap:break-word">Dear Scikit-learn community,

<div class=""><br class="">

</div>


<div class="">I have been reading the example at <a href="https://scikit-learn.org/stable/auto_examples/ensemble/plot_forest_importances.html#feature-importance-based-on-mean-decrease-in-impurity" class="">https://scikit-learn.org/stable/auto_examples/ensemble/plot_forest_importances.html#feature-importance-based-on-mean-decrease-in-impurity</a> about

 the permutation importance that can be computed after fitting a tree-based model (e.g. RandomForestClassifier).</div>

<div class=""><br class="">

</div>

<div class="">However, I have noticed a discrepancy that I would like to mention. If a one-hot-encoding step is used before model fitting, the `.<span class="x_n">feature_importances_</span>` attribute includes an importance for every level of the transformed

 categorical features (e.g. for gender, we get two importances, one for males and one for females).</div>

<div class=""><br class="">

</div>

<div class="">When I apply the `<span class="x_n"><a href="https://scikit-learn.org/stable/modules/generated/sklearn.inspection.permutation_importance.html#sklearn.inspection.permutation_importance" title="sklearn.inspection.permutation_importance" class="x_sphx-glr-backref-type-py-function x_sphx-glr-backref-module-sklearn-inspection">permutation_importance</a></span>`

 function, though, the outputs correspond to the non-transformed data, i.e. one importance per original feature. To illustrate this, I include a toy example in .py format. </div>

<div class=""><br class="">

</div>

<div class="">Best,</div>

<div class="">Makis</div>


</div>


</body>

</html>