numpy/scipy: correlation
sturlamolden
sturlamolden at yahoo.no
Sun Nov 12 16:32:05 CET 2006
Robert Kern wrote:
> The difference between the two models is that the first places no restrictions
> on the distribution of x. The second does; both the x and y marginal
> distributions need to be normal. Under the first model, the correlation
> coefficient has no meaning.
That is not correct. The correlation coefficient is meaningful in both
models, but must be interpreted differently. However, in both cases a
correlation coefficient of 1 or -1 indicates an exact linear
relationship between x and y.
Under the first model ("linear regression"), the squared correlation
coefficient is the "explained variance", i.e. the the proportion of y's
variance accounted for by the model y = m*x + o.
Under the second model ("multivariate normal distribution"), the
correlation coefficient is the covariance of y and x divided by the
product of the standard deviations, cov(x,y)/(std(x)*std(y)).
More information about the Python-list
mailing list