On Thu, Jul 1, 2010 at 9:11 AM, Charles R Harris <charlesr.harris@gmail.com> wrote:

On Thu, Jul 1, 2010 at 8:40 AM, Bruce Southey <bsouthey@gmail.com> wrote:
On 06/29/2010 11:38 PM, David Goldsmith wrote:
On Tue, Jun 29, 2010 at 8:16 PM, Bruce Southey <bsouthey@gmail.com> wrote:
On Tue, Jun 29, 2010 at 6:03 PM, David Goldsmith
<d.l.goldsmith@gmail.com> wrote:
> On Tue, Jun 29, 2010 at 3:56 PM, <josef.pktd@gmail.com> wrote:
>>
>> On Tue, Jun 29, 2010 at 6:37 PM, David Goldsmith
>> <d.l.goldsmith@gmail.com> wrote:
>> > ...concerns the behavior of numpy.random.multivariate_normal; if that's
>> > of
>> > interest to you, I urge you to take a look at the comments (esp. mine
>> > :-) );
>> > otherwise, please ignore the noise.  Thanks!
>>
>> You should add the link to the ticket, so it's faster for everyone to
>> check what you are talking about.
>>
>> Josef
>
> Ooops!  Yes I should; here it is:
>
> http://projects.scipy.org/numpy/ticket/1223
> Sorry, and thanks, Josef.
>
> DG
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
As I recall, there is no requirement for the variance/covariance of
the normal distribution to be positive definite.

No, not positive definite, positive *semi*-definite: yes, the variance may be zero (the cov may have zero-valued eigenvalues), but the claim (and I actually am "neutral" about it, in that I wanted to reference the claim in the docstring and was told that doing so was unnecessary, the implication being that this is a "well-known" fact), is that, in essence (in 1-D) the variance can't be negative, which seems clear enough.  I don't see you disputing that, and so I'm uncertain as to how you feel about the proposal to "weakly" enforce symmetry and positive *semi*-definiteness.  (Now, if you dispute that even requiring positive *semi*-definiteness is desirable, you'll have to debate that w/ some of the others, because I'm taking their word for it that indefiniteness is "unphysical.")

DG

>From http://en.wikipedia.org/wiki/Multivariate_normal_distribution
"The covariance matrix is allowed to be singular (in which case the
corresponding distribution has no density)."

So you must be able to draw random numbers from such a distribution.
Obviously what those numbers really mean is another matter (I presume
the dependent variables should be a linear function of the independent
variables) but the user *must* know since they entered it. Since the
function works the docstring Notes comment must be wrong.

Imposing any restriction means that this is no longer a multivariate
normal random number generator. If anything, you can only raise a
warning about possible non-positive definiteness but even that will
vary depending how it is measured and on the precision being used.


Bruce
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion



--
Mathematician: noun, someone who disavows certainty when their uncertainty set is non-empty, even if that set has measure zero.

Hope: noun, that delusive spirit which escaped Pandora's jar and, with her lies, prevents mankind from committing a general suicide.  (As interpreted by Robert Graves)
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
As you (and the theory) say, a variance should not be negative - yeah right :-) In practice that is not exactly true because estimation procedures like equating observed with expected sum of squares do lead to negative estimates. However, that is really a failure of the model, data and algorithm.

I think the issue is really how numpy should handle input when that input is theoretically invalid.


I think the svd version could be used if a check is added for the decomposition. That is, if cov = u*d*v, then dot(u,v) ~= identity. The Cholesky decomposition will be faster than the svd for large arrays, but that might not matter much for the common case.

<snip>

Chuck

Well, I'm not sure if what we have so far implies that consensus will possibly be impossible to reach, so I'll just rest on my laurels (i.e., my proposed compromise solution); just let me know if the docstring needs to be changed (and how).

DG