[issue38490] statistics: add covariance and Pearson's correlation

Tymek Wołodźko report at bugs.python.org
Thu Oct 17 10:13:24 EDT 2019


Tymek Wołodźko <twolodzko at gmail.com> added the comment:

I expanded my PR to add simple linear regression. I also wrote
documentation for the new functionality.

As for covariance, we could simply not expose it to users, but I'm not
convinced that there is any gain in keeping it hidden from them.
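
As a rough sketch of how the pieces fit together (this is not the code in
the PR; the function names and the sample data are only illustrative), the
two regression formulas quoted below can be built on the existing mean() and
stdev(), with the sample covariance as the shared helper:

```python
from statistics import mean, stdev


def covariance(x, y):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) < 2:
        raise ValueError("at least two data points are required")
    xbar, ybar = mean(x), mean(y)
    total = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return total / (len(x) - 1)


def correlation(x, y):
    """Pearson's r: covariance scaled by both sample standard deviations."""
    return covariance(x, y) / (stdev(x) * stdev(y))


def linear_regression(x, y):
    """Slope and intercept, using the two formulas from the quoted message."""
    slope = correlation(x, y) * (stdev(y) / stdev(x))
    intercept = mean(y) - slope * mean(x)
    return slope, intercept


# Made-up data lying exactly on the line y = 2x.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
slope, intercept = linear_regression(x, y)
print(covariance(x, y), correlation(x, y), slope, intercept)
```

Since covariance() is already needed to get the correlation, exposing it
costs nothing extra beyond documenting it.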

Tim

On Wed, Oct 16, 2019 at 11:25 AM Tymek Wołodźko <report at bugs.python.org>
wrote:

>
> Tymek Wołodźko <twolodzko at gmail.com> added the comment:
>
> In case there is agreement with Steven, I will add simple linear regression
> ( https://en.wikipedia.org/wiki/Simple_linear_regression ) in the same PR,
> since it is just:
>
> slope = correlation(x, y) * ( stdev(y) / stdev(x) )
> intercept = mean(y) - slope * mean(x)
>
> As for covariance, I see your points, but why not keep it "because we
> can"? It can be useful for some users, and the functionality still needs to
> be implemented in order to have the correlation coefficient.
>
> On Wed, Oct 16, 2019 at 10:47 AM Steven D'Aprano <report at bugs.python.org>
> wrote:
>
> >
> > Steven D'Aprano <steve+python at pearwood.info> added the comment:
> >
> > I can't speak for other countries, but in Australia, secondary school
> > mathematics teaches correlation coefficient and linear regression from
> > Year 11 onwards (typically ages 16 or 17). Covariance is not itself
> > taught, and as far as I can tell neither the TI-83 nor NSpire
> > provides a built-in covariance command.
> >
> > On the other hand, other calculators such as the HP-48GX do.
> >
> > Oddly, Excel provides the population (not sample) covariance:
> >
> >
> >
> https://support.office.com/en-us/article/COVARIANCE-P-function-6F0E1E6D-956D-4E4B-9943-CFEF0BF9EDFC
> >
> > OpenOffice and LibreOffice also provide a covariance function.
> >
> > I think that supporting correlation coefficient `r` and linear
> > regression would be clear wins, from the perspective of secondary school
> > maths. But as far as covariance goes, it would help convince me if you
> > had either:
> >
> > - evidence that covariance is taught in secondary schools, or at
> >   least first year undergraduate statistics;
> >
> > - that it has use-cases beyond "helper for calculating r";
> >
> > - or that there is demand for it from people who want covariance
> >   but can't, or don't want to, use numpy/scipy.
> >
> > ----------
> >
> > _______________________________________
> > Python tracker <report at bugs.python.org>
> > <https://bugs.python.org/issue38490>
> > _______________________________________
> >
>

