<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
</head>
<body>
<div class="" style="word-wrap:break-word">Dear Scikit-learn community,
<div class=""><br class="">
</div>
<div class="">I have been reading the example at <a href="https://scikit-learn.org/stable/auto_examples/ensemble/plot_forest_importances.html#feature-importance-based-on-mean-decrease-in-impurity" class="">https://scikit-learn.org/stable/auto_examples/ensemble/plot_forest_importances.html#feature-importance-based-on-mean-decrease-in-impurity</a> about
the feature importances, including the permutation importance, that can be assessed after fitting a tree-based model (e.g. RandomForestClassifier).</div>
<div class=""><br class="">
</div>
<div class="">However, I have noticed a discrepancy that I would like to mention. If a one-hot-encoding step is used before model fitting, the `<span class="x_n">feature_importances_</span>` attribute includes an importance for every level of the transformed
categorical features (e.g. for gender, we get 2 importances, one for males and one for females).</div>
<div class=""><br class="">
</div>
<div class="">When I apply the `<span class="x_n"><a href="https://scikit-learn.org/stable/modules/generated/sklearn.inspection.permutation_importance.html#sklearn.inspection.permutation_importance" title="sklearn.inspection.permutation_importance" class="x_sphx-glr-backref-type-py-function x_sphx-glr-backref-module-sklearn-inspection">permutation_importance</a></span>`
function, though, the output contains one importance per column of the original, non-transformed data (e.g. a single value for gender). To illustrate this, I include a toy example in .py format.</div>
<div class=""><br class="">
</div>
<div class="">Best,</div>
<div class="">Makis</div>
</div>
</body>
</html>