[scikit-learn] Gradient Boosting: Feature Importances do not sum to 1

Raphael C drraph at gmail.com
Wed Aug 31 02:28:29 EDT 2016


Can you provide a reproducible example?
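Something self-contained along these lines would be ideal. Here is a
sketch of the kind of script I mean (the dataset and the parameter
values are only guesses at the conditions you describe, so substitute
your own):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier

    # Small training set, deep trees, many estimators: the regime
    # where the importance sum reportedly drifts below 1.
    X, y = make_classification(n_samples=100, n_features=10, random_state=0)
    clf = GradientBoostingClassifier(n_estimators=1000, max_depth=8,
                                     random_state=0)
    clf.fit(X, y)
    print(clf.feature_importances_.sum())  # should print 1.0
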
Raphael

On Wednesday, August 31, 2016, Douglas Chan <douglas.chan at ieee.org> wrote:

> Hello everyone,
>
> I’ve noticed conditions under which the feature importance values do not
> add up to 1 in ensemble tree methods, like gradient boosted trees or
> AdaBoost trees.  I wonder if there’s a bug in the code.
>
> This error occurs when the ensemble has a large number of estimators.  The
> exact point at which it appears varies with the other parameters: for
> example, the error shows up sooner with a smaller number of training
> samples, or when the trees are deeper.
>
> When this error appears, the predicted value seems to have converged, but
> it’s unclear whether the error is what keeps the predicted value from
> changing as more estimators are added.  In fact, the feature importance
> sum drops lower and lower as estimators are added thereafter.
>
> I wonder if we’re hitting some floating point calculation error.
>
> Looking forward to hearing your thoughts on this.
>
> Thank you!
> -Doug
>
>
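If you want to test the floating-point theory yourself, one quick check
(again only a sketch; clf is a fitted GradientBoostingClassifier as in
the script above) is whether each individual tree’s importances still
sum to 1.  If the per-tree sums are all essentially 1 but the ensemble
average is not, plain round-off is an unlikely culprit; trees that end
up with no splits at all (and therefore report all-zero importances)
would be a more likely suspect:

    import numpy as np

    # estimators_ is an array of regression trees (one per boosting
    # stage and class); inspect each tree's own importance sum.
    per_tree = np.array([t.feature_importances_.sum()
                         for t in clf.estimators_.ravel()])
    print("ensemble sum:  ", clf.feature_importances_.sum())
    print("trees near 1:  ", np.sum(np.isclose(per_tree, 1.0)))
    print("all-zero trees:", np.sum(per_tree == 0.0))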

