[scikit-learn] custom loss function in RandomForestRegressor
Thomas Evangelidis
tevang3 at gmail.com
Thu Mar 1 08:27:14 EST 2018
Hi again,
I am currently revisiting this problem after familiarizing myself with
Cython and Scikit-Learn's code and I have a very important query:
Looking at the class MSE(RegressionCriterion), the node impurity is defined
as the variance of the target values Y in that node. The predictions are
not involved anywhere in the computation. This contradicts my notion of a
"loss function", which quantifies the discrepancy between predicted and
target values. Am I looking at the wrong class, or is what I want to do
simply not feasible with Random Forests? For example, I would like to
modify the RandomForestRegressor code to optimize Pearson's R between
predicted and target values.
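
To make the point concrete, here is a minimal sketch in plain Python of the
quantity in question. It is not scikit-learn's Cython implementation, and
node_impurity and best_split are hypothetical names chosen for illustration.
It assumes a regression tree predicts the mean of the targets in each node;
under that assumption the variance of Y in a node is the same number as the
mean squared error between Y and the node's constant prediction, which may
be why no separate prediction appears in the criterion.

```python
# Sketch only: plain Python, not scikit-learn's Cython code.
# node_impurity and best_split are hypothetical helper names.

def node_impurity(y):
    """Variance of the targets in a node.

    With the node's constant prediction taken to be mean(y), this is
    also the mean squared error between the targets and that prediction.
    """
    mean = sum(y) / len(y)
    return sum((v - mean) ** 2 for v in y) / len(y)


def best_split(x, y):
    """Return (threshold, weighted child impurity) for a single feature."""
    pairs = sorted(zip(x, y))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal feature values
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        # Child impurities weighted by the fraction of samples in each child.
        score = (len(left) * node_impurity(left)
                 + len(right) * node_impurity(right)) / n
        if score < best[1]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (threshold, score)
    return best


# Toy data (made up for illustration): two clearly separated groups.
x = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
y = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]

threshold, score = best_split(x, y)
print(threshold, score)
```

In this sketch the split is chosen from the targets alone, one node at a
time. A measure such as Pearson's R is computed over all predictions
jointly, so it does not obviously decompose into such per-node terms.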
I thank you in advance for any clarification.
Thomas
>
>> On 02/15/2018 01:28 PM, Guillaume Lemaitre wrote:
>>
>> Yes, you are right: the .pxd files are the headers and the .pyx files
>> the definitions. You need to write a class like MSE. Criterion is an
>> abstract class or base class (I don't have it in front of me).
>>
>> @Andy: if I recall the PR, we made the classes public to enable such
>> custom criteria. However, it is not documented since we were not
>> officially supporting it. So this is a hidden feature. We could always
>> discuss making this feature more visible and documenting it.
>>
>>
>>
>
--
======================================================================
Dr Thomas Evangelidis
Post-doctoral Researcher
CEITEC - Central European Institute of Technology
Masaryk University
Kamenice 5/A35/2S049,
62500 Brno, Czech Republic
email: tevang at pharm.uoa.gr
tevang3 at gmail.com
website: https://sites.google.com/site/thomasevangelidishomepage/