[scikit-learn] cross validation scores seem off for PLSRegression
Paul Anton Letnes
pa at letnes.com
Tue Feb 14 06:27:11 EST 2017
@ is a Python operator (available since Python 3.5) meaning "matrix multiplication".
<https://www.python.org/dev/peps/pep-0465/>
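As a small illustration (not taken from the session below; the arrays a and b are made up for the example), @ on NumPy arrays should give the same result as np.dot:

```python
import numpy as np

# Two small example matrices (hypothetical data, chosen only for illustration).
a = np.arange(6.0).reshape(2, 3)
b = np.arange(12.0).reshape(3, 4)

# PEP 465 matrix multiplication; requires Python 3.5 or later.
product = a @ b

# For 2-D arrays this is the same operation as np.dot.
same = np.allclose(product, np.dot(a, b))
print(product.shape, same)
```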
I deliberately set y to the model's own prediction so that the PLS model should be able to reproduce the values exactly and give a sensible score.
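On the question of range: the score documentation linked in the quoted reply defines R^2 as 1 - u/v, with u the residual sum of squares and v the total sum of squares, so nothing bounds it below by 0. Here is a minimal sketch of that definition for a single output. The helper r2 is a hypothetical name written for this example, not a scikit-learn function, and it does not reproduce how scikit-learn combines several output columns.

```python
import numpy as np

def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0]
perfect = r2(y_true, [1.0, 2.0, 3.0])    # exact predictions
mean_only = r2(y_true, [2.0, 2.0, 2.0])  # always predict the mean
reversed_ = r2(y_true, [3.0, 2.0, 1.0])  # worse than predicting the mean
print(perfect, mean_only, reversed_)
```

A model that predicts worse than the mean of the held-out values gets a negative score. With only 10 samples and 17 features, poor predictions on the held-out folds are at least a plausible explanation for the numbers in the session below.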
Paul
On 14 February 2017 at 12:08:11 +01:00, Fabian Böhnlein <fabian.boehnlein at gmail.com> wrote:
> Hi Paul,
>
> I'm not sure what the @ syntax does in IPython, but it seems you're setting y to the coefficients of the model instead of y_hat = pls.predict(x).
>
> Also see in the documentation why R^2 can be negative: <http://scikit-learn.org/stable/modules/generated/sklearn.cross_decomposition.PLSRegression.html#sklearn.cross_decomposition.PLSRegression.score>
>
> Best,
> Fabian
>
>
> On Tue, 14 Feb 2017 at 11:57 Paul Anton Letnes <pa at letnes.com> wrote:
>
> > Hi!
> >
> > Versions:
> > sklearn 0.18.1
> > numpy 1.11.3
> > Anaconda python 3.5 on ubuntu 16.04
> >
> > What range is the cross_val_score supposed to be in? I was under the impression from the documentation, although I cannot find it stated explicitly anywhere, that it should be a number in the range [0, 1]. However, it appears that one can get large negative values; see the ipython session below.
> >
> > Cheers
> > Paul
> >
> > In [2]: import numpy as np
> >
> > In [3]: y = np.random.random((10, 3))
> >
> > In [4]: x = np.random.random((10, 17))
> >
> > In [5]: from sklearn.cross_decomposition import PLSRegression
> >
> > In [6]: pls = PLSRegression(n_components=3)
> >
> > In [7]: from sklearn.cross_validation import cross_val_score
> >
> > In [8]: from sklearn.model_selection import cross_val_score
> >
> > In [9]: cross_val_score(pls, x, y)
> > Out[9]: array([-32.52217837, -4.17228083, -5.88632365])
> >
> >
> > PS:
> > This happens even if I cheat by setting y to the predicted value, and cross validate on that.
> >
> > In [29]: y = x @ pls.coef_
> >
> > In [30]: cross_val_score(pls, x, y)
> > /home/paul/anaconda3/envs/wp3-paper/lib/python3.5/site-packages/sklearn/cross_decomposition/pls_.py:293: UserWarning: Y residual constant at iteration 5
> > warnings.warn('Y residual constant at iteration %s' % k)
> > /home/paul/anaconda3/envs/wp3-paper/lib/python3.5/site-packages/sklearn/cross_decomposition/pls_.py:293: UserWarning: Y residual constant at iteration 6
> > warnings.warn('Y residual constant at iteration %s' % k)
> > /home/paul/anaconda3/envs/wp3-paper/lib/python3.5/site-packages/sklearn/cross_decomposition/pls_.py:293: UserWarning: Y residual constant at iteration 6
> > warnings.warn('Y residual constant at iteration %s' % k)
> > Out[30]: array([-35.01267353, -4.94806383, -5.9619526 ])
> >
> > In [34]: np.max(np.abs(y - x @ pls.coef_))
> > Out[34]: 0.0
> >
> >
> > _______________________________________________
> > scikit-learn mailing list
> > <scikit-learn at python.org>
> >
> > <https://mail.python.org/mailman/listinfo/scikit-learn>
> >