[scikit-learn] Query about use of standard deviation on tree feature_importances_ in demo plot_forest_importances.html
ian at ianozsvald.com
Sat Jun 24 05:17:51 EDT 2017
Good. I'd suggested a box plot or use of the IQR (on a bar chart) on the
yellowbrick list. I was assuming that if the distribution of feature
importances contained many '0's, that might indeed be worth highlighting
as a diagnostic. Cheers, Ian.
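
A minimal sketch of the quartile and percentile idea, assuming a random forest fitted on synthetic data. The dataset, the parameter values and the variable names (per_tree, yerr_iqr, zero_fraction) are illustrative assumptions, not taken from the plot_forest_importances example itself:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data; sizes and seeds are arbitrary choices for illustration.
X, y = make_classification(n_samples=500, n_features=10, n_informative=3,
                           n_redundant=0, random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# One row per tree, one column per feature.
per_tree = np.array([tree.feature_importances_ for tree in forest.estimators_])

# Robust summaries of the per-tree spread, instead of mean +/- std.
median = np.median(per_tree, axis=0)
q25, q75 = np.percentile(per_tree, [25, 75], axis=0)
p5, p95 = np.percentile(per_tree, [5, 95], axis=0)

# Asymmetric error bars (lower, upper) around the median, shape (2, n_features).
yerr_iqr = np.vstack([median - q25, q75 - median])

# Fraction of trees that give each feature an importance of exactly 0.
zero_fraction = (per_tree == 0).mean(axis=0)

for i in np.argsort(median)[::-1]:
    print(f"feature {i}: median={median[i]:.3f} "
          f"IQR=[{q25[i]:.3f}, {q75[i]:.3f}] "
          f"5-95%=[{p5[i]:.3f}, {p95[i]:.3f}] "
          f"zeros={zero_fraction[i]:.0%}")
```

The (2, n_features) array can be passed as yerr to a matplotlib bar chart, and per_tree can be handed directly to a box plot. Unlike mean +/- standard deviation, these bars cannot extend below zero.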
On 23 June 2017 at 18:51, Olivier Grisel <olivier.grisel at ensta.org> wrote:
> +1 for changing this example to have error bars represent the 5th and
> 95th percentiles or the 25th and 75th percentiles (quartiles).
> Or even bootstrapped confidence intervals of the mean feature
> importance for each variable. This might be a bit too verbose for an
> example though.
>> Perhaps more importantly - is a visual
>> indication of the spread of feature importances in an ensemble
>> actually a useful thing to plot? Does it serve a diagnostic value?
> Yes. Otherwise people might be over-confident in the stability of
> those feature importances.
Ian Ozsvald (Data Scientist, PyDataLondon co-chair)
ian at IanOzsvald.com