[scikit-learn] A custom loss function for GradientBoostingRegressor
Zygmunt Zając
zajac.zygmunt at gmail.com
Mon Mar 20 13:45:43 EDT 2017
Hello,
I would like to add a custom loss function for gradient boosting
regression. The function is similar to least squares, except that for
each example it is acceptable to either undershoot or overshoot the
target; in that case the loss is zero. An additional binary indicator
called "under" tells us, for each example, whether undershooting or
overshooting is the acceptable direction. For example:
    y   under   p   loss
    5     1     4     0
    5     0     4     1
    5     1     6     1
    5     0     6     0
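To make the definition concrete, those four rows can be reproduced with a
few lines of plain numpy (the array names below are just for illustration):

import numpy as np

y     = np.array([5.0, 5.0, 5.0, 5.0])
under = np.array([1, 0, 1, 0])
p     = np.array([4.0, 4.0, 6.0, 6.0])

squares = (y - p) ** 2.0
# zero the loss where the miss is in the allowed direction
squares[(p > y) & (under == 0)] = 0.0   # overshooting was allowed
squares[(p < y) & (under == 1)] = 0.0   # undershooting was allowed
print(squares)   # gives [ 0.  1.  1.  0.], matching the "loss" column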
Below is my attempt at an implementation. I have three questions:
1. Is it correct?
2. How would you pass "under" to the loss function?
3. Loss functions other than LeastSquaresError() seem to re-estimate the
leaf values in update_terminal_regions(). Is this necessary in this case,
and if so, how would I do it?
import numpy as np

# Methods of a custom loss class, modelled on LeastSquaresError.
def __call__(self, y, pred, sample_weight=None):
    if sample_weight is None:
        pred = pred.ravel()
        squares = (y - pred) ** 2.0
        # the custom part: zero the loss where the miss is in the
        # allowed direction ("under" is whatever question 2 resolves to)
        overshoot_ok = (pred > y) & (under == 0)
        undershoot_ok = (pred < y) & (under == 1)
        squares[overshoot_ok] = 0.0
        squares[undershoot_ok] = 0.0
        return np.mean(squares)
    else:
        (...)

def negative_gradient(self, y, pred, **kargs):
    pred = pred.ravel()
    diffs = y - pred
    # zero the gradient wherever the loss above is zero
    overshoot_ok = (pred > y) & (under == 0)
    undershoot_ok = (pred < y) & (under == 1)
    diffs[overshoot_ok] = 0.0
    diffs[undershoot_ok] = 0.0
    return diffs
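Regarding question 2, the only idea I have so far is to store "under" on
the loss object itself and use self.under inside the methods. A rough
sketch (ignoring sample_weight, and assuming LeastSquaresError from
sklearn.ensemble.gradient_boosting can simply be subclassed like this,
which is part of what I am asking):

import numpy as np
from sklearn.ensemble.gradient_boosting import LeastSquaresError

class AsymmetricSquaredError(LeastSquaresError):
    """Squared error that is zero when the miss is in the allowed direction."""

    def __init__(self, under):
        # 1 == n_classes, as required for regression losses
        super(AsymmetricSquaredError, self).__init__(1)
        self.under = np.asarray(under)

    def __call__(self, y, pred, sample_weight=None):
        pred = pred.ravel()
        squares = (y - pred) ** 2.0
        squares[(pred > y) & (self.under == 0)] = 0.0
        squares[(pred < y) & (self.under == 1)] = 0.0
        return np.mean(squares)

    def negative_gradient(self, y, pred, **kargs):
        pred = pred.ravel()
        diffs = y - pred
        diffs[(pred > y) & (self.under == 0)] = 0.0
        diffs[(pred < y) & (self.under == 1)] = 0.0
        return diffs

I still don't know how to make GradientBoostingRegressor use such an
object, since the loss parameter only accepts the built-in names.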