I noticed that if I generate complex rv i.i.d. with var=1, that numpy says:
var (<real part>) -> (close to 1.0)
var (<imag part>) -> (close to 1.0)
but
var (complex array) -> (close to complex 0)
Is that not a strange definition?
Neal Becker wrote:
I noticed that if I generate complex rv i.i.d. with var=1, that numpy says:
var (<real part>) -> (close to 1.0) var (<imag part>) -> (close to 1.0)
but
var (complex array) -> (close to complex 0)
Is that not a strange definition?
There is some discussion on this in the tracker.
http://projects.scipy.org/scipy/numpy/ticket/638
The current state of affairs is that the implementation of var() just naively applies the standard formula for real numbers.
mean((x - mean(x)) ** 2)
I think this is pretty obviously wrong prima facie. AFAIK, no one considers this a valid definition of variance for complex RVs or in fact a useful value. I think we should change this. Unfortunately, there is no single alternative but several.
1. Punt. Complex numbers are inherently multidimensional, and a single scale parameter doesn't really describe most distributions of complex numbers. Instead, you need a real covariance matrix which you can get with cov([z.real, z.imag]). This estimates the covariance matrix of a 2-D Gaussian distribution over RR^2 (interpreted as CC).
2. Take a slightly less naive formula for the variance which seems to show up in some texts:
mean(absolute(z - mean(z)) ** 2)
This estimates the single parameter of a circular Gaussian over RR^2 (interpreted as CC). It is also the trace of the covariance matrix above.
3. Take the variances of the real and imaginary components independently. This is equivalent to taking the diagonal of the covariance matrix above. This wouldn't be the definition of "*the* complex variance" that anyone else uses, but rather another form of punting. "There isn't a single complex variance to give you, but in the spirit of broadcasting, we'll compute the marginal variances of each dimension independently."
Personally, I like 1 a lot. I'm hesitant to support 2 until I've seen an actual application of that definition. The references I have been given in the ticket comments are all early parts of books where the authors are laying out definitions without applications. Personally, it feels to me like the authors are just sticking in the absolute()'s ex post facto just so they can extend the definition they already have to complex numbers.
I'm also not a fan of the expectation-centric treatments of random variables. IMO, the variance of an arbitrary RV isn't an especially important quantity. It's a parameter of a Gaussian distribution, and in this case, I see no reason to favor circular Gaussians in CC over general ones.
But if someone shows me an actual application of the definition, I can amend my view.
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco
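The quantities discussed in the message above can be compared side by side. The following is a minimal sketch rather than NumPy's actual var() implementation: the seed, the sample size, and the names `naive`, `cov`, `abs_var`, and `marginal` are assumptions made for illustration, and `ddof=0` is passed to `np.cov` only so that it matches the mean-based formulas exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
# Real and imaginary parts drawn independently, each with variance 1.
z = rng.standard_normal(n) + 1j * rng.standard_normal(n)

d = z - z.mean()

# The real-number formula applied unchanged to complex data
# (what the thread describes as the current behaviour).
naive = np.mean(d ** 2)

# Option 1: 2x2 real covariance matrix of (Re z, Im z).
cov = np.cov([z.real, z.imag], ddof=0)

# Option 2: mean squared modulus of the deviations.
abs_var = np.mean(np.abs(d) ** 2)

# Option 3: marginal variances, i.e. the diagonal of the matrix above.
marginal = np.array([z.real.var(), z.imag.var()])

print("naive formula :", naive)
print("covariance    :", cov, sep="\n")
print("option 2      :", abs_var, "(trace of cov:", np.trace(cov), ")")
print("option 3      :", marginal)
```

With data like this the naive value sits near complex zero, while option 2 equals the trace of the covariance matrix and option 3 equals its diagonal.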
Robert Kern wrote:
There is some discussion on this in the tracker.
http://projects.scipy.org/scipy/numpy/ticket/638
The current state of affairs is that the implementation of var() just naively applies the standard formula for real numbers.
mean((x - mean(x)) ** 2)
I think this is pretty obviously wrong prima facie. AFAIK, no one considers this a valid definition of variance for complex RVs or in fact a useful value. I think we should change this. Unfortunately, there is no single alternative but several.
1. Punt. Complex numbers are inherently multidimensional, and a single scale parameter doesn't really describe most distributions of complex numbers. Instead, you need a real covariance matrix which you can get with cov([z.real, z.imag]). This estimates the covariance matrix of a 2-D Gaussian distribution over RR^2 (interpreted as CC).
2. Take a slightly less naive formula for the variance which seems to show up in some texts:
mean(absolute(z - mean(z)) ** 2)
This estimates the single parameter of a circular Gaussian over RR^2 (interpreted as CC). It is also the trace of the covariance matrix above.
3. Take the variances of the real and imaginary components independently. This is equivalent to taking the diagonal of the covariance matrix above. This wouldn't be the definition of "*the* complex variance" that anyone else uses, but rather another form of punting. "There isn't a single complex variance to give you, but in the spirit of broadcasting, we'll compute the marginal variances of each dimension independently."
Personally, I like 1 a lot. I'm hesitant to support 2 until I've seen an actual application of that definition. The references I have been given in the ticket comments are all early parts of books where the authors are laying out definitions without applications. Personally, it feels to me like the authors are just sticking in the absolute()'s ex post facto just so they can extend the definition they already have to complex numbers. I'm also not a fan of the expectation-centric treatments of random variables. IMO, the variance of an arbitrary RV isn't an especially important quantity. It's a parameter of a Gaussian distribution, and in this case, I see no reason to favor circular Gaussians in CC over general ones.
But if someone shows me an actual application of the definition, I can amend my view.
2 is what I expected. Suppose I have a complex signal x, with additive Gaussian noise (i.i.d., real and imag are independent).
y = x + n
Consider an estimate \hat{x} = y.
What is the mean-squared-error E[(y - x)^2]?
Definition 2 is consistent with that, and gets my vote.
Neal Becker wrote:
2 is what I expected. Suppose I have a complex signal x, with additive Gaussian noise (i.i.d, real and imag are independent). y = x + n
Not only do the real and imag marginal distributions have to be independent, but also of the same scale, i.e. Re(n) ~ Gaussian(0, sigma) and Im(n) ~ Gaussian(0, sigma) for the same sigma.
Consider an estimate \hat{x} = y.
What is the mean-squared-error E[(y - x)^2] ?
Definition 2 is consistent with that, and gets my vote.
Ah, you have to be careful. What you wrote is what is implemented. Definition 2 is consistent with this, instead:
E[|y - x|^2]
But like I said, I see no particular reason to favor circular Gaussians over the general form for the implementation of numpy.var().
--
Robert Kern
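The distinction between the two expectations can be checked numerically. This is a small sketch under stated assumptions: the unit-modulus signal `x`, the noise scale `sigma`, the seed, and the sample size are all made up for illustration, and the noise is circular (independent real and imaginary parts with the same scale), as described in the reply above.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 200_000
sigma = 0.5  # assumed common scale of Re(n) and Im(n)

# An arbitrary complex signal; any x would do, since only y - x matters.
x = np.exp(2j * np.pi * rng.random(m))
noise = sigma * (rng.standard_normal(m) + 1j * rng.standard_normal(m))
y = x + noise

err = y - x
mse_plain = np.mean(err ** 2)        # estimate of E[(y - x)^2]: complex, near 0
mse_abs = np.mean(np.abs(err) ** 2)  # estimate of E[|y - x|^2]: real, near 2*sigma**2

print(mse_plain, mse_abs)
```

Only the second quantity behaves like a mean squared error: it is real, non-negative, and grows with the noise power.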
On Jan 8, 2008 6:54 PM, Robert Kern <robert.kern@gmail.com> wrote:
There is some discussion on this in the tracker.
http://projects.scipy.org/scipy/numpy/ticket/638
The current state of affairs is that the implementation of var() just naively applies the standard formula for real numbers.
mean((x - mean(x)) ** 2)
I think this is pretty obviously wrong prima facie. AFAIK, no one considers this a valid definition of variance for complex RVs or in fact a useful value. I think we should change this. Unfortunately, there is no single alternative but several.
1. Punt. Complex numbers are inherently multidimensional, and a single scale parameter doesn't really describe most distributions of complex numbers. Instead, you need a real covariance matrix which you can get with cov([z.real, z.imag]). This estimates the covariance matrix of a 2-D Gaussian distribution over RR^2 (interpreted as CC).
2. Take a slightly less naive formula for the variance which seems to show up in some texts:
mean(absolute(z - mean(z)) ** 2)
This estimates the single parameter of a circular Gaussian over RR^2 (interpreted as CC). It is also the trace of the covariance matrix above.
3. Take the variances of the real and imaginary components independently. This is equivalent to taking the diagonal of the covariance matrix above. This wouldn't be the definition of "*the* complex variance" that anyone else uses, but rather another form of punting. "There isn't a single complex variance to give you, but in the spirit of broadcasting, we'll compute the marginal variances of each dimension independently."
Personally, I like 1 a lot. I'm hesitant to support 2 until I've seen an actual application of that definition. The references I have been given in the ticket comments are all early parts of books where the authors are laying out definitions without applications. Personally, it feels to me like the authors are just sticking in the absolute()'s ex post facto just so they can extend the definition they already have to complex numbers. I'm also not a fan of the expectation-centric treatments of random variables. IMO, the variance of an arbitrary RV isn't an especially important quantity. It's a parameter of a Gaussian distribution, and in this case, I see no reason to favor circular Gaussians in CC over general ones.
But if someone shows me an actual application of the definition, I can amend my view.
Suppose you have a set of z_i and want to choose z to minimize the average square error $ \sum_i |z_i - z|^2 $. The solution is that $z=\mean{z_i}$ and the resulting average error is given by 2). Note that I didn't mention Gaussians anywhere. No distribution is needed to justify the argument, just the idea of minimizing the squared distance. Leaving out the ^2 would yield another metric, or one could ask for a minimax solution. It is a question of the distance function, not probability.
Anyway, that is one justification for the approach in 2), and it is one that makes a lot of applied math simple. Whether or not a least squares fit is useful is a different question.
Chuck
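The least-squares argument above can be illustrated without reference to any distribution. In this sketch the data, the seed, and the helper name `avg_sq_error` are hypothetical; the random numbers are only a convenient way to get some points in the plane. It checks that moving the candidate z away from the mean by delta raises the average squared distance by exactly |delta|^2, so the mean is the minimizer and the minimum is the quantity in 2).

```python
import numpy as np

rng = np.random.default_rng(2)
# Any finite set of complex points would do here.
z = rng.standard_normal(1000) + 1j * rng.standard_normal(1000)

def avg_sq_error(c):
    """Average squared distance from the points z_i to the candidate c."""
    return np.mean(np.abs(z - c) ** 2)

best = z.mean()
at_mean = avg_sq_error(best)  # same value as mean(absolute(z - mean(z)) ** 2)

# Shift the candidate by a few arbitrary complex offsets.
deltas = 0.1 * (rng.standard_normal(50) + 1j * rng.standard_normal(50))
perturbed = np.array([avg_sq_error(best + d) for d in deltas])

# The cross term vanishes because the deviations from the mean sum to zero.
print(np.allclose(perturbed - at_mean, np.abs(deltas) ** 2))
```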
Charles R Harris wrote:
Suppose you have a set of z_i and want to choose z to minimize the average square error $ \sum_i |z_i - z|^2 $. The solution is that $z=\mean{z_i}$ and the resulting average error is given by 2). Note that I didn't mention Gaussians anywhere. No distribution is needed to justify the argument, just the idea of minimizing the squared distance. Leaving out the ^2 would yield another metric, or one could ask for a minimax solution. It is a question of the distance function, not probability. Anyway, that is one justification for the approach in 2), and it is one that makes a lot of applied math simple. Whether or not a least squares fit is useful is a different question.
If you're not doing probability, then what are you using var() for? I can accept that the quantity is meaningful for your problem, but I'm not convinced it's a variance.
--
Robert Kern
On Jan 8, 2008 7:48 PM, Robert Kern <robert.kern@gmail.com> wrote:
If you're not doing probability, then what are you using var() for? I can accept that the quantity is meaningful for your problem, but I'm not convinced it's a variance.
Lots of fits don't involve probability distributions. For instance, one might want to fit a polynomial to a mathematical curve. This sort of distinction between probability and distance goes back to Gauss himself, although not in his original work on least squares. Whether or not variance implies probability is a semantic question.
I think if we are going to compute a single number, 2) is as good as anything even if it doesn't capture the shape of the scatter plot. A 2D covariance wouldn't necessarily capture the shape either.
Chuck
Charles R Harris wrote:
On Jan 8, 2008 7:48 PM, Robert Kern <robert.kern@gmail.com> wrote:
If you're not doing probability, then what are you using var() for? I can accept that the quantity is meaningful for your problem, but I'm not convinced it's a variance.
Lots of fits don't involve probability distributions. For instance, one might want to fit a polynomial to a mathematical curve. This sort of distinction between probability and distance goes back to Gauss himself, although not in his original work on least squares. Whether or not variance implies probability is a semantic question.
Well, the problem in front of us is entirely semantics: What does the string "var(z)" mean? Are we going to choose a mechanistic definition: "var(z) is implemented in such and such a way and interpretations are left open"? In that case, why are we using the string "var(z)" rather than something else? We're also still left with the question as to which such and such implementation to use.
Alternatively, we can look at what people call "variances" and try to implement the calculation of such. In that case, the term "variance" tends to crop up (and in my experience *only* crop up) in statistics and probability (S&P). Certain implementations of the calculations of such quantities have cognates elsewhere, but those cognates are not themselves called variances.
My question to you is, is "the resulting average error" a variance? I.e., do people call it a variance outside of S&P? There are any number of computations that are useful but are not variances, and I don't think we should make "var(z)" implement them.
In S&P, the single quantity "variance" is well defined for real RVs, even if you step away from Gaussians. It's the second central moment of the PDF of the RV. When you move up to CC (or RR^2), the definition of "moment" changes. It's no longer a real number or even a scalar; the second central moment is a covariance matrix. If we're going to call something "the variance", that's it. The circularly symmetric forms are special cases. Although option #2 is a useful quantity to calculate in some circumstances, I think it's bogus to give it a special status.
I think if we are going to compute a single number, 2) is as good as anything even if it doesn't capture the shape of the scatter plot. A 2D covariance wouldn't necessarily capture the shape either.
True, but it is clear exactly what it is. The function is named "cov()", and it computes covariances. It's not called "shape_of_2D_pdf()". Whether or not one ought to compute a covariance is not "cov()"'s problem.
--
Robert Kern
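To make the contrast between the covariance matrix and the single number of option #2 concrete, here is a hedged sketch with two synthetic data sets. The scales (2.0 and 0.5 for the elongated case, sqrt(2.125) for the circular one), the seed, and the helper names `second_moment` and `abs_variance` are illustrative assumptions chosen so that both sets have the same total variance.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000

# Circular case: equal, independent scales on both axes (variance 2.125 each).
s = np.sqrt(2.125)
circ = s * rng.standard_normal(n) + 1j * s * rng.standard_normal(n)

# Elongated case: variance 4.0 along the real axis, 0.25 along the imaginary axis.
elong = 2.0 * rng.standard_normal(n) + 1j * 0.5 * rng.standard_normal(n)

def second_moment(z):
    """2x2 covariance matrix of (Re z, Im z), normalised by N."""
    return np.cov([z.real, z.imag], ddof=0)

def abs_variance(z):
    """Option #2: mean squared modulus of the deviations from the mean."""
    return np.mean(np.abs(z - z.mean()) ** 2)

for name, data in (("circular", circ), ("elongated", elong)):
    print(name, abs_variance(data))
    print(second_moment(data))
```

Both data sets give roughly 4.25 for option #2, while the covariance matrices differ clearly on the diagonal.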
Robert Kern wrote:
2. Take a slightly less naive formula for the variance which seems to show up in some texts:
mean(absolute(z - mean(z)) ** 2)
This estimates the single parameter of a circular Gaussian over RR^2 (interpreted as CC). It is also the trace of the covariance matrix above.
I tend to favor this interpretation because it is used quite heavily in signal processing applications where "circular" Gaussian random variables show up quite a bit --- so much so, in fact, that most EE folks would expect this as the output and you would have to explain to them why there may be other choices that make sense.
So, #2 is kind of a nod to the signal-processing community (especially the communication section).
But, there is also merit to me in #3 (although it may be harder to explain why the variance returns a complex number --- if that is what you meant).
-Travis O.
Travis E. Oliphant wrote:
Robert Kern wrote:
2. Take a slightly less naive formula for the variance which seems to show up in some texts:
mean(absolute(z - mean(z)) ** 2)
This estimates the single parameter of a circular Gaussian over RR^2 (interpreted as CC). It is also the trace of the covariance matrix above.
I tend to favor this interpretation because it is used quite heavily in signal processing applications where "circular" Gaussian random variables show up quite a bit --- so much so, in fact, that most EE folks would expect this as the output and you would have to explain to them why there may be other choices that make sense.
So, #2 is kind of a nod to the signal-processing community (especially the communication section).
<sigh> Fair enough. I relent. You implement it; I'll document the correct^Wcov() alternative. :-)
But, there is also merit to me in #3 (although it may be harder to explain why the variance returns a complex number --- if that is what you meant).
Yes.
--
Robert Kern
Robert Kern wrote:
Travis E. Oliphant wrote:
Robert Kern wrote:
2. Take a slightly less naive formula for the variance which seems to show up in some texts:
mean(absolute(z - mean(z)) ** 2)
This estimates the single parameter of a circular Gaussian over RR^2 (interpreted as CC). It is also the trace of the covariance matrix above.
I tend to favor this interpretation because it is used quite heavily in signal processing applications where "circular" Gaussian random variables show up quite a bit --- so much so, in fact, that most EE folks would expect this as the output and you would have to explain to them why there may be other choices that make sense.
So, #2 is kind of a nod to the signal-processing community (especially the communication section).
<sigh> Fair enough. I relent. You implement it; I'll document the correct^Wcov() alternative. :-)
Not that I find the argument pertinent most of the time, but if there is no clear argument in favor of one formula, would following MATLAB conventions be OK?
To me, definition 2 makes more sense, as a particular case of the correlation between two different complex random variables, \mathbb{E}[X \bar{Y}], so that it keeps the nice properties of a scalar product.
cheers,
David
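The scalar-product view can be sketched as follows. The helper `complex_cov` is hypothetical (it is not a NumPy function), and centering both variables before taking the product is an assumption added here so that the X = Y case reduces to definition 2; the test data and seed are likewise illustrative.

```python
import numpy as np

def complex_cov(x, y):
    """Sample estimate of E[(X - E[X]) * conj(Y - E[Y])] (hypothetical helper)."""
    x = np.asarray(x, dtype=complex)
    y = np.asarray(y, dtype=complex)
    return np.mean((x - x.mean()) * np.conj(y - y.mean()))

rng = np.random.default_rng(4)
x = rng.standard_normal(5000) + 1j * rng.standard_normal(5000)
y = 0.5 * x + (rng.standard_normal(5000) + 1j * rng.standard_normal(5000))

var_x = complex_cov(x, x)   # imaginary part vanishes; real part is definition 2
cov_xy = complex_cov(x, y)
cov_yx = complex_cov(y, x)  # complex conjugate of cov_xy

print(var_x, cov_xy, cov_yx)
```

Taking X = Y gives a real, non-negative number, and swapping the arguments conjugates the result, which are the scalar-product properties referred to above.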
Neal Becker wrote:
I noticed that if I generate complex rv i.i.d. with var=1, that numpy says:
var (<real part>) -> (close to 1.0) var (<imag part>) -> (close to 1.0)
but
var (complex array) -> (close to complex 0)
Is that not a strange definition?
I don't think there is any ambiguity about the definition of the variance of a complex variable.
Var(x) = E{(x-E[x])^2} = E{x^2} - E{x}^2
An estimator for this:
E[x^n] \approx (1/M)\sum(x^n)
Neal Becker wrote:
I don't think there is any ambiguity about the definition of the variance of a complex variable.
Var(x) = E{(x-E[x])^2} = E{x^2} - E{x}^2
That's currently what's implemented, but there is simply no evidence that anyone ever uses this definition for complex random variables. Note that for real variables,
E{(x-E[x])^2} = E{|x-E[x]|^2}
but for complex variables, there is a large difference. Since the || are superfluous with real variables, probability texts rarely include them unless they are then going on to talk about complex variables. If you want to extend the definition for real variables to complex variables, that is an ambiguity you have to resolve.
There is, apparently, a large body of statistical signal processing literature that defines the complex variance as
E{|z-E[z]|^2}
so, I (now) believe that this is what should be implemented.
--
Robert Kern
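One way to see why the two expressions differ for complex variables, and why the first one came out near zero in the original report, is to write z - E[z] = u + iv with real u and v. The symbols u and v are introduced here for illustration and do not appear in the thread.

```latex
\begin{align*}
E\{(z - E[z])^2\} &= E\{(u + iv)^2\} = E\{u^2\} - E\{v^2\} + 2i\,E\{uv\} \\
  &= \operatorname{Var}(\operatorname{Re} z) - \operatorname{Var}(\operatorname{Im} z)
     + 2i\,\operatorname{Cov}(\operatorname{Re} z, \operatorname{Im} z), \\
E\{|z - E[z]|^2\} &= E\{u^2 + v^2\}
  = \operatorname{Var}(\operatorname{Re} z) + \operatorname{Var}(\operatorname{Im} z).
\end{align*}
```

With uncorrelated real and imaginary parts of unit variance, the first expression is 0 and the second is 2. For real z (v = 0) both reduce to the usual Var(z).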
participants (5)
- Charles R Harris
- David Cournapeau
- Neal Becker
- Robert Kern
- Travis E. Oliphant