Bias in numpy.random.multivariate_normal?
Could someone confirm I've got this right? In a multivariate normal (MVN) the covariance is E(X.X') - E(X).E(X'). I sample many times from an MVN with a given mean and covariance, then calculate the covariance of the sample. This sample covariance should show no bias towards being larger or smaller than the distribution's covariance, but I see that some entries are always larger and others always smaller. See this code:

from numpy.random import multivariate_normal
from numpy.linalg import inv
from numpy import outer

mu = [10., 10.]
precision = [[.6, .3], [.4, .3]]
sigma = inv(precision)
sample_size = 10000
num_tests = 100

for test in xrange(num_tests):
    samples = multivariate_normal(mu, sigma, [sample_size])
    sample_mean = samples.sum(axis=0) / sample_size
    #print (sample_mean - mu) > 0.
    sample_covariance = sum(
        outer(sample - sample_mean, sample - sample_mean)
        for sample in samples
    ) / sample_size
    print (sample_covariance - sigma) > 0

'True' is printed where the entry is larger, 'False' otherwise.

Did I get this wrong, is this acceptable behaviour for a random variate generator, or is it a bug?

Thanks, John.
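The experiment above can be restated more compactly with current NumPy; a sketch in which the RNG seed and the use of np.cov are our additions, while mu and precision are copied verbatim from the post:

```python
import numpy as np

# Compact restatement of the experiment above; the seed and np.cov
# are additions, mu/precision are verbatim from the post.
rng = np.random.default_rng(0)
mu = np.array([10.0, 10.0])
precision = np.array([[0.6, 0.3], [0.4, 0.3]])
sigma = np.linalg.inv(precision)
samples = rng.multivariate_normal(mu, sigma, size=10000)
# np.cov divides by n-1 by default (the unbiased estimator)
sample_cov = np.cov(samples, rowvar=False)
print(sample_cov - sigma)
```

Note that recent NumPy versions may emit a warning here about the covariance argument, which foreshadows the resolution found later in the thread.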
IIRC, you computed the biased variance, so it is normal to see bias. Try dividing by (sample_size - 1); that should give the unbiased estimator.

Matthieu

2007/11/10, John Reid <j.reid@mail.cryst.bbk.ac.uk>:
_______________________________________________
SciPy-user mailing list
SciPy-user@scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-user
--
French PhD student
Website: http://miles.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher
Matthieu Brucher wrote:
IIRC, you computed the biased variance, so it is normal to see bias. Try dividing by (sample_size - 1); that should give the unbiased estimator.
That wouldn't change the sign of the answer. In any case, I get the same results when dividing by sample_size - 1.
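John's point, that switching to the (sample_size - 1) divisor only rescales the estimate, can be checked directly with np.cov's ddof parameter; a minimal sketch (the data and seed here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=(n, 3))
biased = np.cov(x, rowvar=False, ddof=0)    # divide by n
unbiased = np.cov(x, rowvar=False, ddof=1)  # divide by n-1 (the default)
# The two estimators differ only by the scalar factor n/(n-1), a
# negligible rescaling at sample_size = 10000, so it cannot explain a
# systematic pattern of entries above/below the true covariance.
assert np.allclose(unbiased, biased * n / (n - 1))
```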
The covariance matrix provided in sigma is not symmetric. Forcing symmetry by changing the definition of sigma to

sigma = inv(precision)
sigma[1,0] = sigma[0,1]

yields behaviour that I suspect is in line with your expectations. Perhaps a check for symmetry in the cov argument would be a desirable addition to multivariate_normal?

Bryan

On Nov 10, 2007 12:55 PM, John Reid <j.reid@mail.cryst.bbk.ac.uk> wrote:
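Bryan's fix can be verified numerically; a sketch (the seed is our choice) showing that once sigma is symmetrized, the sample covariance lands close to sigma, with only sampling noise rather than a fixed sign pattern:

```python
import numpy as np

rng = np.random.default_rng(0)
precision = np.array([[0.6, 0.3], [0.4, 0.3]])
sigma = np.linalg.inv(precision)
sigma[1, 0] = sigma[0, 1]            # force symmetry, as suggested
samples = rng.multivariate_normal([10.0, 10.0], sigma, size=10000)
sample_cov = np.cov(samples, rowvar=False)   # unbiased estimator
print(np.abs(sample_cov - sigma).max())      # small sampling noise only
```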
Bryan Nollett wrote:
The covariance matrix provided in sigma is not symmetric.
Good point. I needed some test covariance matrices and completely forgot the symmetry requirement. There's no way to draw from the Wishart or inverse-Wishart distributions, is there? Their pdfs would be handy too.

Thanks, John.
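In the meantime, Wishart variates can be drawn by hand: if x_1, ..., x_n are i.i.d. N(0, Sigma) vectors, then the sum of the outer products x_i x_i' has a Wishart(n, Sigma) distribution. A sketch (the function name and its parameters are ours, not from any library):

```python
import numpy as np

def wishart_rvs(df, scale, rng=None):
    """One draw from Wishart(df, scale), built as a sum of outer
    products of df i.i.d. zero-mean multivariate normal vectors."""
    rng = np.random.default_rng(rng)
    p = scale.shape[0]
    x = rng.multivariate_normal(np.zeros(p), scale, size=df)  # (df, p)
    return x.T @ x

scale = np.array([[2.0, 0.3], [0.3, 1.0]])
w = wishart_rvs(df=5, scale=scale, rng=0)
# Sanity check over many draws: E[W] = df * scale
mean_w = np.mean([wishart_rvs(5, scale, rng=i) for i in range(2000)], axis=0)
```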
Did you try the rvs method of those classes?

Matthieu

2007/11/11, John Reid <j.reid@mail.cryst.bbk.ac.uk>:
John Reid wrote:
Matthieu Brucher wrote:
Did you try the rvs method of those classes ?
Of which classes? I couldn't see anything in scipy.stats or scipy.stats.distributions to do with the Wishart distribution.
I don't think there is. This would be handy, indeed. So handy that I have just created a ticket for it :) http://scipy.org/scipy/scipy/ticket/536 cheers, David
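A note for later readers: SciPy has since gained these distributions. From roughly version 0.16 on, scipy.stats.wishart and scipy.stats.invwishart provide rvs and logpdf methods; a sketch assuming a recent SciPy:

```python
import numpy as np
from scipy.stats import wishart, invwishart  # SciPy >= 0.16

scale = np.array([[2.0, 0.3], [0.3, 1.0]])
w = wishart(df=5, scale=scale)          # frozen distribution
draws = w.rvs(size=3, random_state=0)   # array of shape (3, 2, 2)
logp = w.logpdf(draws[0])               # log-density of one draw
iw = invwishart(df=5, scale=scale)
iw_draw = iw.rvs(random_state=0)        # a single (2, 2) draw
```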
participants (4)
- Bryan Nollett
- David Cournapeau
- John Reid
- Matthieu Brucher