...concerns the behavior of numpy.random.multivariate_normal; if that's of interest to you, I urge you to take a look at the comments (esp. mine :) ); otherwise, please ignore the noise. Thanks! DG
On Tue, Jun 29, 2010 at 6:37 PM, David Goldsmith <d.l.goldsmith@gmail.com> wrote:
> ...concerns the behavior of numpy.random.multivariate_normal; if that's of interest to you, I urge you to take a look at the comments (esp. mine :) ); otherwise, please ignore the noise. Thanks!

You should add the link to the ticket, so it's faster for everyone to check what you are talking about.

Josef
_______________________________________________ NumPyDiscussion mailing list NumPyDiscussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpydiscussion
On Tue, Jun 29, 2010 at 3:56 PM, <josef.pktd@gmail.com> wrote:
> You should add the link to the ticket, so it's faster for everyone to check what you are talking about.
Ooops! Yes I should; here it is:

http://projects.scipy.org/numpy/ticket/1223

Sorry, and thanks, Josef.

DG
On Tue, Jun 29, 2010 at 6:03 PM, David Goldsmith <d.l.goldsmith@gmail.com> wrote:
> Ooops! Yes I should; here it is:
> http://projects.scipy.org/numpy/ticket/1223
As I recall, there is no requirement for the variance/covariance of the normal distribution to be positive definite.
From http://en.wikipedia.org/wiki/Multivariate_normal_distribution "The covariance matrix is allowed to be singular (in which case the corresponding distribution has no density)."
So you must be able to draw random numbers from such a distribution. Obviously, what those numbers really mean is another matter (I presume the dependent variables should be a linear function of the independent variables), but the user *must* know, since they entered it. Since the function works, the docstring Notes comment must be wrong. Imposing any restriction means that this is no longer a multivariate normal random number generator. If anything, you can only raise a warning about possible non-positive definiteness, but even that will vary depending on how it is measured and on the precision being used.

Bruce
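To make the singular case concrete, here is a minimal sketch, assuming a reasonably recent NumPy and an arbitrarily chosen rank-1 covariance (the matrix, seed and sample size are illustrative and do not come from the ticket). It draws from numpy.random.multivariate_normal and shows that the samples fall on a line, i.e. one variable is a linear function of the other.

```python
import numpy as np

# Rank-1 (singular) covariance: the second variable is an exact copy of the first.
mean = np.zeros(2)
cov = np.array([[1.0, 1.0],
                [1.0, 1.0]])

rng = np.random.RandomState(0)  # seed chosen arbitrarily, for reproducibility only
samples = rng.multivariate_normal(mean, cov, size=1000)

# All of the probability mass sits on the line x0 == x1, so the two columns
# agree up to floating-point error.
max_gap = np.abs(samples[:, 0] - samples[:, 1]).max()
print(samples.shape, max_gap)
```

The distribution has no density in the plane, but sampling from it is still well defined.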
On Tue, Jun 29, 2010 at 8:16 PM, Bruce Southey <bsouthey@gmail.com> wrote:
> As I recall, there is no requirement for the variance/covariance of the normal distribution to be positive definite.
No, not positive definite, positive *semi*-definite: yes, the variance may be zero (the cov may have zero-valued eigenvalues), but there is a claim about which I am actually "neutral" (I wanted to reference it in the docstring and was told that doing so was unnecessary, the implication being that this is a "well-known" fact): in essence (in 1D), the variance can't be negative, which seems clear enough. I don't see you disputing that, and so I'm uncertain as to how you feel about the proposal to "weakly" enforce symmetry and positive *semi*-definiteness. (Now, if you dispute that even requiring positive *semi*-definiteness is desirable, you'll have to debate that w/ some of the others, because I'm taking their word for it that indefiniteness is "unphysical.")

DG
Mathematician: noun, someone who disavows certainty when their uncertainty set is nonempty, even if that set has measure zero.
Hope: noun, that delusive spirit which escaped Pandora's jar and, with her lies, prevents mankind from committing a general suicide. (As interpreted by Robert Graves)
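One way the "weak" enforcement could look is sketched below. The helper name `check_cov` and the tolerance are hypothetical, chosen only for illustration; this is not the code proposed in the ticket. It warns instead of raising, and it uses a tolerance because the outcome of a definiteness test depends on how it is measured and on the precision in use.

```python
import warnings
import numpy as np

def check_cov(cov, tol=1e-8):
    """Warn (rather than raise) if cov is not symmetric positive semi-definite.

    Hypothetical helper for illustration; `tol` is an arbitrary choice.
    """
    cov = np.asarray(cov, dtype=float)
    ok = True
    if not np.allclose(cov, cov.T, atol=tol):
        warnings.warn("covariance is not symmetric", RuntimeWarning)
        ok = False
    # eigvalsh reads only one triangle, so symmetrize explicitly first.
    eigvals = np.linalg.eigvalsh((cov + cov.T) / 2)
    # Allow slightly negative eigenvalues that are just round-off.
    if eigvals.min() < -tol * max(1.0, np.abs(eigvals).max()):
        warnings.warn("covariance is not positive semi-definite", RuntimeWarning)
        ok = False
    return ok

# A singular but valid covariance passes without a warning.
print(check_cov([[1.0, 1.0], [1.0, 1.0]]))
```

Whether a failed check should warn or raise is exactly the policy question in this thread; the sketch takes the warning route only to match the "weak" wording.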
On 06/29/2010 11:38 PM, David Goldsmith wrote:
> ... in essence (in 1D) the variance can't be negative, which seems clear enough. I don't see you disputing that, and so I'm uncertain as to how you feel about the proposal to "weakly" enforce symmetry and positive *semi*-definiteness.
As you (and the theory) say, a variance should not be negative - yeah right :) In practice that is not exactly true, because estimation procedures like equating observed with expected sum of squares do lead to negative estimates. However, that is really a failure of the model, data and algorithm.

I think the issue is really how numpy should handle input when that input is theoretically invalid. I (and apparently the bug submitter) do not know what to expect if the input is not positive definite. If the svd approach is correct for such cases and numpy 'trusts' the user, as is usually the case, then there is no issue. If the svd approach is incorrect for such cases, then that is obviously a bug. If numpy cannot trust the user, then numpy has to check that the input variances are greater than or equal to zero and that the cov argument is symmetric, and raise either a warning or an error if they are not.

Replacing the SVD with Cholesky would also address these issues, as both of these are checked by numpy's cholesky function. However, cholesky() does not support positive semi-definite covariance/variance input (although such a decomposition is possible: http://en.wikipedia.org/wiki/Cholesky_decomposition#Proof_for_positive_semi...). Also, as Robert said in the thread, 'Cholesky decomposition gave an error "too soon" in my estimation'.

Bruce
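A small sketch of the Cholesky behaviour under discussion, using arbitrary 2x2 matrices chosen for illustration: np.linalg.cholesky factors a positive definite matrix and raises LinAlgError for an indefinite one. As far as I know it reads only one triangle of its input and does not itself verify symmetry, so the symmetry part of the claim above should be treated with caution. The helper `cholesky_ok` is hypothetical.

```python
import numpy as np

pos_def = np.array([[2.0, 1.0],
                    [1.0, 2.0]])      # eigenvalues 1 and 3
indefinite = np.array([[1.0, 2.0],
                       [2.0, 1.0]])   # eigenvalues -1 and 3

def cholesky_ok(a):
    """Return True if np.linalg.cholesky accepts `a`, False if it raises."""
    try:
        np.linalg.cholesky(a)
    except np.linalg.LinAlgError:
        return False
    return True

# Sampling via a Cholesky factor: x = mean + L z with z ~ N(0, I).
L = np.linalg.cholesky(pos_def)
rng = np.random.RandomState(0)        # arbitrary seed
z = rng.standard_normal((2, 5))
draws = (L @ z).T                     # five zero-mean draws with covariance pos_def

print(cholesky_ok(pos_def), cholesky_ok(indefinite))
```

Singular but valid covariances are the awkward middle ground: they are legitimate input for the distribution, yet a plain Cholesky factorization will typically reject them.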
On Thu, Jul 1, 2010 at 8:40 AM, Bruce Southey <bsouthey@gmail.com> wrote:
> I think the issue is really how numpy should handle input when that input is theoretically invalid.
I think the svd version could be used if a check is added for the decomposition. That is, if cov = u*d*v, then dot(u,v) ~= identity. The Cholesky decomposition will be faster than the svd for large arrays, but that might not matter much for the common case.

<snip>

Chuck
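A sketch of what an SVD-based check might look like, assuming NumPy's convention that np.linalg.svd returns u, s, vh with cov = u @ diag(s) @ vh. Note that this is a variant, not the exact test suggested above: instead of comparing dot(u, v) with the identity, it rebuilds the matrix from the right singular vectors alone, because singular vectors are not unique when singular values are zero or repeated. The name `svd_looks_psd` and the tolerance are hypothetical.

```python
import numpy as np

def svd_looks_psd(cov, tol=1e-8):
    """Heuristic: for a symmetric PSD matrix, V diag(s) V^T rebuilds cov.

    Hypothetical helper for illustration; `tol` is an arbitrary choice.
    """
    cov = np.asarray(cov, dtype=float)
    # np.linalg.svd returns u, s, vh with cov == u @ diag(s) @ vh.
    u, s, vh = np.linalg.svd(cov)
    # Scale the columns of V by s, then multiply by V^T.
    rebuilt = (vh.T * s) @ vh
    return bool(np.allclose(rebuilt, cov, rtol=tol, atol=tol))

print(svd_looks_psd([[1.0, 1.0], [1.0, 1.0]]))  # singular, still a valid covariance
print(svd_looks_psd([[1.0, 2.0], [2.0, 1.0]]))  # has a negative eigenvalue
```

For a symmetric matrix with a negative eigenvalue, the singular value is the absolute value of that eigenvalue, so the rebuilt matrix differs from the input and the check fails.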
On Thu, Jul 1, 2010 at 9:11 AM, Charles R Harris <charlesr.harris@gmail.com> wrote:
> I think the svd version could be used if a check is added for the decomposition. That is, if cov = u*d*v, then dot(u,v) ~= identity. The Cholesky decomposition will be faster than the svd for large arrays, but that might not matter much for the common case.
Well, I'm not sure whether what we have so far means that consensus may be impossible to reach, so I'll just rest on my laurels (i.e., my proposed compromise solution); just let me know if the docstring needs to be changed (and how).

DG
participants (4)
Bruce Southey
Charles R Harris
David Goldsmith
josef.pktd@gmail.com