[SciPy-User] Strange behaviour from corrcoef when calculating correlation-matrix in SciPy/NumPy.
eat
e.antero.tammi at gmail.com
Thu Mar 3 15:18:31 EST 2011
Hi,
On Thu, Mar 3, 2011 at 11:44 AM, Pauli Virtanen <pav at iki.fi> wrote:
> Hi,
>
> Wed, 02 Mar 2011 14:36:18 -0500, josef.pktd wrote:
> [clip]
> >> The Matlab convention
> >>
> >> corrcoef(x, y) == corrcoef(c_[x.ravel(), y.ravel()])
> >
> > I don't remember matlab exactly, but I don't think there is a ravel, and
> > I think R also does
> >
> > cov(x, y) = np.dot((x-x.mean()).T, y-y.mean())
> >
> > and normalized for corrcoef.
>
> There's a ravel, according to their docs:
>
> http://www.mathworks.com/help/techdoc/ref/cov.html
>
> """cov(X,Y), where X and Y are matrices with the same number of elements,
> is equivalent to cov([X(:) Y(:)])."""
>
> X(:) is the matlab notation for raveling.
>
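A side note on the raveling order, as a small sketch: X(:) stacks columns (column-major), whereas numpy's ravel() is row-major by default. Assuming the data layout used in the example below (variables in columns on the Matlab side, in rows on the numpy side), the two visit the elements in the same order. The names X_m and X_n are only illustrative.

```python
import numpy as np

# Matlab-side layout: 4x2, one variable per column; X(:) stacks the columns.
X_m = np.array([[1, 2], [2, 1], [7, 1], [3, 2]])
col_major = X_m.ravel(order="F")   # numpy analogue of X(:)

# numpy-side layout: the transpose, one variable per row.
# The default row-major ravel() then yields the same element order.
X_n = X_m.T
row_major = X_n.ravel()

same_order = np.array_equal(col_major, row_major)
print(col_major, row_major, same_order)
```

If the numpy array had the same 4x2 shape as the Matlab one, ravel(order="F") would be needed to reproduce X(:).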
FWIW, please note the following Matlab/Octave behavior:
> X= [1 2 7 3; 2 1 1 2]'
X =
1 2
2 1
7 1
3 2
> Y= [4 2 7 1; 9 1 7 3]'
Y =
4 9
2 1
7 7
1 3
> corrcoef([X(:) Y(:)]) %(1
ans =
1.00000 0.26328
0.26328 1.00000
> corrcoef([X Y]) %(2
ans =
1.00000 -0.54882 0.69462 0.13884
-0.54882 1.00000 -0.43644 0.31623
0.69462 -0.43644 1.00000 0.69007
0.13884 0.31623 0.69007 1.00000
> corrcoef(X, Y) %(3
ans =
0.69462 0.13884
-0.43644 0.31623
and then the equivalent in numpy:
In []: X= array([[1, 2, 7, 3], [2, 1, 1, 2]])
In []: X
Out[]:
array([[1, 2, 7, 3],
[2, 1, 1, 2]])
In []: Y= array([[4, 2, 7, 1], [9, 1, 7, 3]])
In []: Y
Out[]:
array([[4, 2, 7, 1],
[9, 1, 7, 3]])
In []: corrcoef(X.ravel(), Y.ravel()) #(1
Out[]:
array([[ 1. , 0.26328398],
[ 0.26328398, 1. ]])
In []: corrcoef(X, Y) #(2
Out[]:
array([[ 1. , -0.5488213 , 0.69462323, 0.13884203],
[-0.5488213 , 1. , -0.43643578, 0.31622777],
[ 0.69462323, -0.43643578, 1. , 0.69006556],
[ 0.13884203, 0.31622777, 0.69006556, 1. ]])
> corrcoef(X, Y) %(3
In []: corrcoef(?) #(3
Out[]:
array([[ 0.69462,  0.13884],
       [-0.43644,  0.31623]])
So perhaps there is no really simple and straightforward translation of
corrcoef from Matlab to numpy? Just as an example: how would you implement
case %(3 properly with numpy?
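One possible way to get case %(3, offered as a minimal sketch rather than a definitive answer: since numpy's corrcoef(X, Y) stacks the rows of X and Y as variables, the cross-correlations are the upper-right block of the matrix in case #(2. The same numbers can also be computed directly from the centered data, along the lines of the np.dot formula quoted above. The function name cross_corrcoef is a hypothetical helper, not part of numpy, and the sketch assumes variables are in rows, as in the numpy example.

```python
import numpy as np

def cross_corrcoef(X, Y):
    """Correlate each row of X with each row of Y (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Center each variable (row) on its own mean.
    Xc = X - X.mean(axis=1, keepdims=True)
    Yc = Y - Y.mean(axis=1, keepdims=True)
    # Unnormalized cross-covariances between rows of X and rows of Y.
    cov = np.dot(Xc, Yc.T)
    # Product of the row norms; dividing by it gives correlations.
    norm = np.sqrt(np.outer((Xc ** 2).sum(axis=1), (Yc ** 2).sum(axis=1)))
    return cov / norm

X = np.array([[1, 2, 7, 3], [2, 1, 1, 2]])
Y = np.array([[4, 2, 7, 1], [9, 1, 7, 3]])

n = X.shape[0]
block = np.corrcoef(X, Y)[:n, n:]   # upper-right block of the 4x4 matrix
direct = cross_corrcoef(X, Y)
print(block)
print(direct)
```

Slicing is the shorter of the two, but it computes the full 4x4 matrix first; the direct version only computes the block that is wanted.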
Regards,
eat
>
> --
> Pauli Virtanen