[scikit-learn] LASSO: Predicted values show negative correlation with observed values on random data
Martin Watzenboeck
martin.watzenboeck at gmail.com
Tue Apr 2 14:57:51 EDT 2019
Hello,
I tried to apply LASSO regression in combination with LeaveOneOut CV on my
data, and observed a significant negative correlation between predicted and
observed response values. I tried to replicate the problem using random
data (please see code below).
Does anyone have an idea what I am doing wrong? I would very much like to use
LASSO regression on my data. Thanks a lot!
Cheers,
Martin
#Lasso example
from sklearn.linear_model import Lasso
from sklearn.model_selection import LeaveOneOut
from scipy.stats import pearsonr
import numpy as np
n_samples = 500
n_features = 30
#create random features
rng = np.random.RandomState(seed=42)
X = rng.randn(n_samples * n_features).reshape(n_samples, n_features)
#Create Ys
Y = rng.randn(n_samples)
#instantiate regressor and cv object
cv = LeaveOneOut()
reg = Lasso(random_state = 42)
#create arrays to save predicted (and observed) Y values
pred = np.array([])
obs = np.array([])
#run cross validation
for train, test in cv.split(X, Y):
    #fit regressor
    reg.fit(X[train], Y[train])
    #append predicted and observed values to the arrays
    pred = np.r_[pred, reg.predict(X[test])]
    obs = np.r_[obs, Y[test]]
#test correlation
pearsonr(pred, obs)