[scikit-learn] Why is cross_val_predict discouraged?
Boris Hollas
hollas at informatik.htw-dresden.de
Wed Apr 3 06:28:22 EDT 2019
I use
sum((cross_val_predict(model, X, y) - y)**2) / len(y) (*)
to evaluate the performance of a model. This is consistent with Murphy,
Machine Learning, section 6.5.3, and Hastie et al., The Elements of
Statistical Learning, eq. 7.48. However, according to the documentation
of cross_val_predict, "it is not appropriate to pass these predictions
into an evaluation metric". While it is obvious that cross_val_predict
is different from cross_val_score, I don't see what is wrong with (*).
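
To make the comparison concrete, here is a minimal sketch of (*) next to the per-fold average that cross_val_score reports. The synthetic data, the LinearRegression model, and the 5-fold KFold split are assumptions chosen only for illustration; they are not from my actual setup. The sample count is deliberately not divisible by 5, so the folds have unequal sizes and the two numbers can differ slightly.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_predict, cross_val_score

# Illustrative data and model (assumptions, not from the original post).
X, y = make_regression(n_samples=103, n_features=5, noise=10.0, random_state=0)
model = LinearRegression()
cv = KFold(n_splits=5)  # same deterministic splits for both calls

# (*): pool all out-of-fold predictions, then compute a single MSE.
pred = cross_val_predict(model, X, y, cv=cv)
pooled_mse = np.sum((pred - y) ** 2) / len(y)

# cross_val_score: one MSE per fold, then an unweighted mean over folds.
fold_mse = -cross_val_score(model, X, y, cv=cv,
                            scoring="neg_mean_squared_error")
mean_fold_mse = fold_mse.mean()

# Weighting each fold's MSE by its test-set size recovers the pooled value.
fold_sizes = np.array([len(test) for _, test in cv.split(X)])
weighted_fold_mse = np.sum(fold_sizes * fold_mse) / len(y)

print(pooled_mse, mean_fold_mse, weighted_fold_mse)
```

For squared error, the pooled value is just a fold-size-weighted mean of the per-fold values, so the two approaches coincide when all folds have the same size. For metrics that are not averages over samples, this equivalence need not hold.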
Also, the explanation that "cross_val_predict
<https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.cross_val_predict.html#sklearn.model_selection.cross_val_predict>
simply returns the labels (or probabilities)" is unclear, if not wrong.
As I understand it, this function returns estimates, not labels or
probabilities.
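
A short sketch of what the function returns in each case may help pin down the wording. The datasets and estimators below are assumptions picked for illustration. With a regressor the output is an array of continuous estimates; with a classifier it is class labels by default, or class probabilities when method="predict_proba" is passed.

```python
import numpy as np
from sklearn.datasets import load_iris, make_regression
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import cross_val_predict

# Regressor: out-of-fold continuous estimates, one per sample.
Xr, yr = make_regression(n_samples=100, n_features=4, noise=5.0, random_state=0)
est = cross_val_predict(LinearRegression(), Xr, yr, cv=5)

# Classifier: out-of-fold class labels by default ...
Xc, yc = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000)
labels = cross_val_predict(clf, Xc, yc, cv=5)

# ... or out-of-fold class probabilities on request.
proba = cross_val_predict(clf, Xc, yc, cv=5, method="predict_proba")

print(est.dtype, labels[:5], proba.shape)
```

So "labels (or probabilities)" seems to describe only the classification case; for regression the returned values are estimates of y.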
Regards, Boris