[scikit-learn] Gradient Boosting: Feature Importances do not sum to 1
Raphael C
drraph at gmail.com
Wed Aug 31 02:28:29 EDT 2016
Can you provide a reproducible example?
Raphael
On Wednesday, August 31, 2016, Douglas Chan <douglas.chan at ieee.org> wrote:
> Hello everyone,
>
> I’ve noticed conditions under which the feature importance values do not
> add up to 1 in ensemble tree methods, like Gradient Boosting Trees or
> AdaBoost Trees, and I wonder if there’s a bug in the code.
>
> This error occurs when the ensemble has a large number of estimators. The
> exact threshold varies: for example, the error shows up sooner with fewer
> training samples, or when the trees are deep.
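>
> Here’s a minimal sketch of the kind of setup I mean (synthetic data via
> make_regression; the parameter values are only a guess at where the drift
> starts, so they may need adjusting):
>
>     from sklearn.datasets import make_regression
>     from sklearn.ensemble import GradientBoostingRegressor
>
>     # Few samples and deep trees, per the conditions described above.
>     X, y = make_regression(n_samples=50, n_features=10, random_state=0)
>     model = GradientBoostingRegressor(n_estimators=5000, max_depth=8,
>                                       random_state=0)
>     model.fit(X, y)
>
>     # The importances should be normalized to sum to 1.
>     print(model.feature_importances_.sum())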
>
> When this error appears, the predicted value seems to have converged, but
> it’s unclear whether the error is what keeps the predicted value from
> changing as estimators are added. In fact, the feature importance sum
> drops lower and lower with more estimators thereafter.
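>
> And a sketch of how one might watch the sum drift as the ensemble grows,
> reusing X and y from above and growing one model in place with warm_start
> (the checkpoint values are again only illustrative):
>
>     model = GradientBoostingRegressor(n_estimators=100, max_depth=8,
>                                       warm_start=True, random_state=0)
>     for n in (100, 500, 1000, 2000, 5000):
>         model.n_estimators = n
>         model.fit(X, y)  # fits only the additional trees
>         print(n, model.feature_importances_.sum())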
>
> I wonder if we’re hitting a floating-point calculation error.
>
> Looking forward to hearing your thoughts on this.
>
> Thank you!
> -Doug
>
>