On Thu, Jul 1, 2010 at 8:40 AM, Bruce Southey <bsouthey@gmail.com> wrote:

On 06/29/2010 11:38 PM, David Goldsmith wrote:

On Tue, Jun 29, 2010 at 8:16 PM, Bruce Southey <bsouthey@gmail.com> wrote:

On Tue, Jun 29, 2010 at 3:56 PM, <josef.pktd@gmail.com> wrote:

On Tue, Jun 29, 2010 at 6:37 PM, David Goldsmith <d.l.goldsmith@gmail.com> wrote:

...concerns the behavior of numpy.random.multivariate_normal; if

On Tue, Jun 29, 2010 at 6:03 PM, David Goldsmith <d.l.goldsmith@gmail.com> wrote: that's

of interest to you, I urge you to take a look at the comments (esp. mine :-) ); otherwise, please ignore the noise. Thanks!

You should add the link to the ticket, so it's faster for everyone to check what you are talking about.

Josef

Ooops! Yes I should; here it is:

http://projects.scipy.org/numpy/ticket/1223 Sorry, and thanks, Josef.

DG

_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion

As I recall, there is no requirement for the variance/covariance of the normal distribution to be positive definite.

No, not positive definite, positive *semi*-definite: yes, the variance may be zero (the cov may have zero-valued eigenvalues), but the claim (and I actually am "neutral" about it, in that I wanted to reference the claim in the docstring and was told that doing so was unnecessary, the implication being that this is a "well-known" fact), is that, in essence (in 1-D) the variance can't be negative, which seems clear enough. I don't see you disputing that, and so I'm uncertain as to how you feel about the proposal to "weakly" enforce symmetry and positive *semi*-definiteness. (Now, if you dispute that even requiring positive *semi*-definiteness is desirable, you'll have to debate that w/ some of the others, because I'm taking their word for it that indefiniteness is "unphysical.")

DG

From http://en.wikipedia.org/wiki/Multivariate_normal_distribution "The covariance matrix is allowed to be singular (in which case the corresponding distribution has no density)."

So you must be able to draw random numbers from such a distribution. Obviously what those numbers really mean is another matter (I presume the dependent variables should be a linear function of the independent variables) but the user *must* know since they entered it. Since the function works the docstring Notes comment must be wrong.

Imposing any restriction means that this is no longer a multivariate normal random number generator. If anything, you can only raise a warning about possible non-positive definiteness but even that will vary depending how it is measured and on the precision being used.

Bruce _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion

-- Mathematician: noun, someone who disavows certainty when their uncertainty set is non-empty, even if that set has measure zero.

Hope: noun, that delusive spirit which escaped Pandora's jar and, with her lies, prevents mankind from committing a general suicide. (As interpreted by Robert Graves)

_______________________________________________ NumPy-Discussion mailing listNumPy-Discussion@scipy.orghttp://mail.scipy.org/mailman/listinfo/numpy-discussion

As you (and the theory) say, a variance should not be negative - yeah right :-) In practice that is not exactly true because estimation procedures like equating observed with expected sum of squares do lead to negative estimates. However, that is really a failure of the model, data and algorithm.

I think the issue is really how numpy should handle input when that input is theoretically invalid.

I think the svd version could be used if a check is added for the decomposition. That is, if cov = u*d*v, then dot(u,v) ~= identity. The Cholesky decomposition will be faster than the svd for large arrays, but that might not matter much for the common case. <snip> Chuck