Mailman 3 Re: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg) - NumPy-Discussion

Re: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg)

David Goldsmith

5 Mar 2014

6:21 p.m.

Date: Wed, 05 Mar 2014 17:45:47 +0100

...

From: Sebastian Berg <sebastian@sipsolutions.net>
Subject: [Numpy-discussion] Adding weights to cov and corrcoef
To: numpy-discussion@scipy.org
Message-ID: <1394037947.21356.20.camel@sebastian-t440>
Content-Type: text/plain; charset="UTF-8"

Hi all,

in Pull Request https://github.com/numpy/numpy/pull/3864 Noel Dawe suggested adding new parameters to our `cov` and `corrcoef` functions to implement weights, which already exist for `average` (the PR still needs to be adapted).

Do you mean adopted?

...

However, we may have missed something obvious, or maybe it is already getting too statistical for NumPy, or the keyword arguments might be better named `uncertainties` and `frequencies`. So comments and insights are very welcome :).

+1 for it being "too baroque" for NumPy--should go in SciPy (if it isn't already there): IMHO, NumPy should be kept as "lean and mean" as possible; embellishments are what SciPy is for. (Again, IMO.)

DG
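For context on the "already exists for `average`" remark, here is a minimal sketch of the weights that `np.average` accepts. The sample values are made up for illustration; only `np.average` and its `weights` and `returned` arguments come from NumPy itself.

```python
import numpy as np

# Made-up sample values and weights, purely for illustration.
x = np.array([1.0, 2.0, 4.0])
w = np.array([3.0, 1.0, 1.0])

# Weighted mean through the existing `weights` argument of np.average;
# returned=True also hands back the sum of the weights.
weighted_mean, weight_sum = np.average(x, weights=w, returned=True)

# The same quantity written out by hand.
manual_mean = np.sum(w * x) / np.sum(w)
```

The proposal in the thread is about giving `cov` and `corrcoef` a comparable argument, where the open question is how the normalization should treat the weights.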


Sebastian Berg

6 Mar 2014

12:40 p.m.

New subject: Adding weights to cov and corrcoef (Sebastian Berg)

On Wed, 2014-03-05 at 10:21 -0800, David Goldsmith wrote:

...

in Pull Request https://github.com/numpy/numpy/pull/3864 Noel Dawe suggested adding new parameters to our `cov` and `corrcoef` functions to implement weights, which already exist for `average` (the PR still needs to be adapted).

Do you mean adopted?

What I meant was that the suggestion isn't actually implemented in the PR at this time. So you can't pull it in to try things out.

...

However, we may have missed something obvious, or maybe it is already getting too statistical for NumPy, or the keyword arguments might be better named `uncertainties` and `frequencies`. So comments and insights are very welcome :).

+1 for it being "too baroque" for NumPy--should go in SciPy (if it isn't already there): IMHO, NumPy should be kept as "lean and mean" as possible; embellishments are what SciPy is for. (Again, IMO.)

Well, on the other hand, scipy does not actually have a `std` function of its own, I think. So if it is quite useful I think this may be an option (I don't think I ever used weights with std, so I can't argue strongly for inclusion myself). Unless adding new functions to `scipy.stats` (or just statsmodels) which implement different types of weights is the longer term plan, then things might bite...

...

DG
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Ralf Gommers

8:49 p.m.

On Thu, Mar 6, 2014 at 1:40 PM, Sebastian Berg <sebastian@sipsolutions.net> wrote:

Well, on the other hand, scipy does not actually have a `std` function of its own, I think. So if it is quite useful I think this may be an option (I don't think I ever used weights with std, so I can't argue strongly for inclusion myself). Unless adding new functions to `scipy.stats` (or just statsmodels) which implement different types of weights is the longer term plan, then things might bite...

AFAIK there's currently no such plan. Ralf

josef.pktd@gmail.com

9:30 p.m.

On Thu, Mar 6, 2014 at 3:49 PM, Ralf Gommers <ralf.gommers@gmail.com> wrote:

...

On Thu, Mar 6, 2014 at 1:40 PM, Sebastian Berg <sebastian@sipsolutions.net> wrote:

Well, on the other hand, scipy does not actually have a `std` function of its own, I think. So if it is quite useful I think this may be an option (I don't think I ever used weights with std, so I can't argue strongly for inclusion myself). Unless adding new functions to `scipy.stats` (or just statsmodels) which implement different types of weights is the longer term plan, then things might bite...

AFAIK there's currently no such plan.

since numpy has taken over all the basic statistics, var, std, cov, corrcoef, and scipy.stats dropped those, I don't see any reason to resurrect them.

The only question IMO is which ddof for weighted std, ...

statsmodels has the basic statistics with frequency weights, but they are largely in support of t-test and similar hypothesis tests.

Josef


Sturla Molden

7 Mar 2014

12:20 a.m.

<josef.pktd@gmail.com> wrote:

...

The only question IMO is which ddof for weighted std, ...

Something like this?

sum_weights - (ddof/float(n))*sum_weights

Sturla

Sturla Molden

1:50 a.m.

Sturla Molden <sturla.molden@gmail.com> wrote:

...

<josef.pktd@gmail.com> wrote:

...
The only question IMO is which ddof for weighted std, ...

Something like this?

sum_weights - (ddof/float(n))*sum_weights

Please ignore.
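For reference, a small sketch of how the quoted expression behaves as a normalization factor. It assumes the weights are integer repeat counts, which the thread does not specify. The helper `weighted_var` and the sample data are hypothetical and only serve the comparison.

```python
import numpy as np

def weighted_var(x, w, norm):
    """Weighted variance with an explicit normalization factor `norm`."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    mean = np.average(x, weights=w)
    return np.sum(w * (x - mean) ** 2) / norm

x = np.array([1.0, 2.0, 4.0])
freq = np.array([3, 1, 2])      # hypothetical integer repeat counts
ddof = 1
n = len(x)
sum_weights = freq.sum()

# The expression quoted above; algebraically sum_weights * (1 - ddof / n).
norm_quoted = sum_weights - (ddof / float(n)) * sum_weights
# The usual normalization for repeat counts: total count minus ddof.
norm_freq = sum_weights - ddof

var_quoted = weighted_var(x, freq, norm_quoted)
var_freq = weighted_var(x, freq, norm_freq)

# Reference value: physically repeat each observation `freq` times.
var_repeated = np.var(np.repeat(x, freq), ddof=ddof)

# With unit weights the quoted expression reduces to n - ddof.
unit = np.ones(n)
var_unit = weighted_var(x, unit, unit.sum() - (ddof / float(n)) * unit.sum())
```

Under these assumptions the quoted factor matches the unweighted `ddof` correction when all weights are one, but it does not reproduce the result of repeating observations when the weights are counts.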

Sebastian Berg

12:43 a.m.

On Thu, 2014-03-06 at 16:30 -0500, josef.pktd@gmail.com wrote:

since numpy has taken over all the basic statistics, var, std, cov, corrcoef, and scipy.stats dropped those, I don't see any reason to resurrect them.

The only question IMO is which ddof for weighted std, ...

I am right now a bit unsure about whether or not the "weights" would be "aweights" or different... R seems to not care about the scale of the weights which seems a bit odd to me for an unbiased estimator? I always assumed that we can do the statistics behind using the ddof... But even if we can figure out the right way, what I am doubting a bit is that if we add weights, their names should be clear enough to not clash with possibly different kind of (interesting) weights in other functions.
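To make the scale question concrete, here is a sketch using the two keyword arguments that later NumPy releases (1.10 onwards) provide on `np.cov`, namely `fweights` and `aweights`. That API postdates this thread and is not what the PR contained at the time. The data and weight values below are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 6))                       # 2 variables, 6 observations
counts = np.array([1, 2, 1, 3, 1, 2])             # hypothetical integer frequencies
rel = np.array([0.5, 1.0, 2.0, 1.0, 0.25, 1.5])   # hypothetical reliability weights

# Frequency weights: same result as physically repeating the observations.
cov_f = np.cov(x, fweights=counts)
cov_repeated = np.cov(np.repeat(x, counts, axis=1))

# Reliability ("a") weights: rescaling them leaves the estimate unchanged.
cov_a = np.cov(x, aweights=rel)
cov_a_scaled = np.cov(x, aweights=10.0 * rel)

# Frequency weights are not scale invariant: doubling every count changes
# the ddof-corrected normalization.
cov_f_doubled = np.cov(x, fweights=2 * counts)
```

So whether the scale of the weights matters depends on which kind of weight is meant, which is one argument for keyword names that say so explicitly.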

...

statsmodels has the basic statistics with frequency weights, but they are largely in support of t-test and similar hypothesis tests.

Josef


Sturla Molden

1:38 a.m.

Sebastian Berg <sebastian@sipsolutions.net> wrote:

...

I am right now a bit unsure about whether or not the "weights" would be "aweights" or different... R seems to not care about the scale of the weights which seems a bit odd to me for an unbiased estimator? I always assumed that we can do the statistics behind using the ddof... But even if we can figure out the right way, what I am doubting a bit is that if we add weights, their names should be clear enough to not clash with possibly different kind of (interesting) weights in other functions.

http://en.wikipedia.org/wiki/Weighted_arithmetic_mean#Weighted_sample_covari...
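For reference, the unbiased weighted sample covariance for reliability-type weights is commonly written as below. The symbols $w_i$, $\mu^*$, $V_1$ and $V_2$ are notation chosen here, not something defined in the thread, and the weights are assumed to be non-negative.

```latex
\mu^* = \frac{\sum_i w_i x_i}{V_1}, \qquad
C = \frac{\sum_i w_i \,(x_i - \mu^*)(x_i - \mu^*)^{T}}{V_1 - V_2 / V_1},
\qquad V_1 = \sum_i w_i, \quad V_2 = \sum_i w_i^2 .
```

With all $w_i = 1$ this gives $V_1 = V_2 = n$, so the denominator reduces to the familiar $n - 1$. For weights that are repeat counts, the corresponding denominator is $V_1 - 1$ instead.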

josef.pktd@gmail.com

4:32 a.m.

On Thu, Mar 6, 2014 at 8:38 PM, Sturla Molden <sturla.molden@gmail.com> wrote:

...

Sebastian Berg <sebastian@sipsolutions.net> wrote:

...
I am right now a bit unsure about whether or not the "weights" would be "aweights" or different... R seems to not care about the scale of the weights which seems a bit odd to me for an unbiased estimator? I always assumed that we can do the statistics behind using the ddof... But even if we can figure out the right way, what I am doubting a bit is that if we add weights, their names should be clear enough to not clash with possibly different kind of (interesting) weights in other functions.

http://en.wikipedia.org/wiki/Weighted_arithmetic_mean#Weighted_sample_covari...

Just as additional motivation (I'm not into the definition of weights right now :) ): I was just reading a chapter on robust covariance estimation, and one of the steps in many of the procedures requires weighted covariances and weighted variances. There the weights just reduce the influence of outlying observations.

Josef
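As a rough illustration of that use case, here is a sketch of a single reweighting step. It is not the procedure from the chapter mentioned above. The function `reweighted_cov`, the `cutoff` parameter, the hard 0/1 weights and the synthetic data are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(size=(200, 2))
data[:5] += 8.0   # a few gross outliers (synthetic)

def reweighted_cov(data, cutoff=3.0):
    """One reweighting step: down-weight points far from the centre."""
    mean = data.mean(axis=0)
    cov = np.cov(data, rowvar=False)
    diff = data - mean
    # Squared Mahalanobis distance of every observation.
    d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    # Hard 0/1 weights; a hypothetical choice, smooth weights work too.
    weights = (d2 <= cutoff ** 2).astype(float)
    w_mean = np.average(data, axis=0, weights=weights)
    centred = data - w_mean
    # Weighted covariance, normalized by the sum of weights (no ddof correction).
    w_cov = (weights[:, None] * centred).T @ centred / weights.sum()
    return w_mean, w_cov, weights

w_mean, w_cov, weights = reweighted_cov(data)
plain_cov = np.cov(data, rowvar=False)
```

The weighted covariance in the last step of the function is exactly the kind of building block that a `weights` argument on `cov` would provide directly.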

