Basic spherical statistics in scipy

Hi everyone,
questions recently came up regarding statistics on the unit sphere in scipy.
Issue 12041 <https://github.com/scipy/scipy/issues/12041> raises the question of whether spherical mean/variance should be added as descriptive statistics, similar to circmean/circvar <https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.circvar.htm...> . The canonical reference for statistics on the unit sphere is "Directional Statistics" by Mardia & Jupp.
Another important basic functionality is the generation of random samples on the unit sphere; discussion was started in this issue <https://github.com/scipy/scipy/issues/16205> .
Are there any strong opinions against these? I think at least the random number generation is reimplemented every day around the world (I have done so myself at least twice) and would strongly benefit from being part of scipy.
Best

Yes, I think both have a place in scipy.stats. Thanks!
On Sun, Jun 5, 2022 at 5:55 AM Daniel Schmitz <danielschmitzsiegen@googlemail.com> wrote:
[...]
-- Robert Kern

Hi Daniel,
I also think these would be interesting.
I am very interested in the sampling part myself, as we recently added Poisson disk sampling, which internally uses hypersphere sampling. I would be happy to help with this work.
Cheers, Pamphile (@tupui)
On 5 Jun 2022, at 17:04, Robert Kern <robert.kern@gmail.com> wrote:
[...]

Hi team,
Based on positive feedback in this thread and approval by two maintainers, gh-16435 <https://github.com/scipy/scipy/pull/16435>, which adds `scipy.stats.directionalmean`, was recently merged. However, I see that the original proposal did not link to that PR or mention the name or interface of `directionalmean` in particular, so I thought I'd invite those interested to continue the discussion at https://github.com/scipy/scipy/pull/16435 or, for more general thoughts, at https://github.com/scipy/scipy/issues/12041
Matt
On Sun, Jun 5, 2022 at 11:10 AM Pamphile Roy <roy.pamphile@gmail.com> wrote:
[...]
-- Matt Haberland Assistant Professor BioResource and Agricultural Engineering 08A-3K, Cal Poly

Hi again everyone!
So far we have implemented *scipy.stats.directionalmean* (https://github.com/scipy/scipy/pull/16435), which was straightforward. For circular data, the result agrees with circmean. Take a look at the docs at: https://scipy.github.io/devdocs/reference/generated/scipy.stats.directionalm...
API-wise, it was decided that input data must be shaped as (..., n), where n denotes the dimensionality of the vector data. The statistic is then computed over the last axis. By default, input data are also normalized to unit vectors.
Implementing the directional variance/dispersion is a bit trickier. Unfortunately, there does not seem to be a universal definition of it in the literature. The most common definition, by Mardia and Jupp, is 2(1-R), where R is the mean resultant length of the input vectors. This is twice the circular variance, 1-R, whose definition is unambiguous.
There are two options: *not implementing* directional variance, or naming the function directional*_dispersion*. We would love to hear your thoughts on this here or in https://github.com/scipy/scipy/pull/16785 .
Best
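
To make the quantities above concrete, here is a minimal NumPy sketch of the mean direction, the mean resultant length R and the 2(1-R) dispersion. It is not the code from the PR: the function name `directional_summary`, the `normalize` flag and the choice of stacking observations along the first axis (with vector components on the last axis) are assumptions made for illustration.

```python
import numpy as np

def directional_summary(samples, normalize=True):
    """Mean direction, mean resultant length R, and the 2(1 - R) dispersion.

    `samples` has shape (m, n): m observations along the first axis (an
    assumption of this sketch) of n-dimensional vectors on the last axis.
    """
    samples = np.asarray(samples, dtype=float)
    if normalize:
        # Project every observation onto the unit sphere.
        samples = samples / np.linalg.norm(samples, axis=-1, keepdims=True)
    resultant = samples.mean(axis=0)   # mean vector of the observations
    R = np.linalg.norm(resultant)      # mean resultant length, 0 <= R <= 1
    mean_direction = resultant / R     # undefined when R == 0
    return mean_direction, R, 2.0 * (1.0 - R)

# Circular data written as unit vectors in the plane.
angles = np.array([0.1, 0.2, 0.3])
vectors = np.column_stack([np.cos(angles), np.sin(angles)])
mean_direction, R, dispersion = directional_summary(vectors)

# For 2-D input the mean direction corresponds to the circular mean angle.
mean_angle = np.arctan2(mean_direction[1], mean_direction[0])
```

For these symmetric angles the recovered `mean_angle` is the middle angle, which is the sense in which the 2-D case agrees with a circular mean.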

Hi again,
Yesterday a PR was opened to sample uniformly from the surface of the unit sphere: ENH: Random direction distribution by dschmitz89 · Pull Request #17277 · scipy/scipy (github.com) <https://github.com/scipy/scipy/pull/17277>
The API is the following: a multivariate distribution *scipy.stats.random_direction(dim)* with a *.rvs()* method. Setting the dimension is required. The *.rvs(size)* method follows the convention of numpy's multivariate_normal: for *size=(m, n)* it generates samples of shape *(m, n, dim)*.
If this sounds interesting to you, please join the discussion on GitHub.
Best
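
The shape convention described above can be illustrated with a small NumPy sketch. This is the commonly hand-rolled approach of normalizing standard normal vectors, not necessarily what the PR does; the helper name `sample_unit_vectors` and its `rng` argument are hypothetical.

```python
import numpy as np

def sample_unit_vectors(dim, size=None, rng=None):
    """Draw points uniformly from the unit sphere embedded in `dim` dimensions."""
    rng = np.random.default_rng(rng)
    if size is None:
        size = ()
    elif np.isscalar(size):
        size = (size,)
    # Standard normal vectors are rotationally symmetric, so normalizing
    # them gives directions that are uniform on the sphere.
    x = rng.standard_normal(tuple(size) + (dim,))
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

samples = sample_unit_vectors(dim=3, size=(4, 5), rng=0)
print(samples.shape)  # (4, 5, 3)
```

The vector dimension is appended as the last axis, so `size=(m, n)` yields an array of shape `(m, n, dim)`, matching the convention in the message.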

Hi again everyone,
the first milestones proposed here have been implemented:
- sampling from the hypersphere
- directional sample statistics (mean direction and mean resultant length)
I would like to propose further adding the most commonly used analogue of the normal distribution on the hypersphere: the von Mises-Fisher distribution <https://en.wikipedia.org/wiki/Von_Mises%E2%80%93Fisher_distribution> (vMF). A reference implementation for sampling from it is available in geomstats <https://github.com/geomstats/geomstats/blob/f30c491a6da8cab38be48029d09eda2b...>, and fitting and evaluating the pdf/logpdf should not be too difficult to implement ourselves.
Having worked with directional data a lot, I have seen many people struggle with these distributions. I do not think that all kinds of spherical distributions should become part of SciPy, but the vMF is so fundamental that it would be very valuable to the general community.
Best
On Mon, Oct 24, 2022 at 12:12 PM Daniel Schmitz <danielschmitzsiegen@googlemail.com> wrote:
[...]
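
As a rough illustration of why evaluating the logpdf is manageable, here is a sketch for the special case of the ordinary sphere in three dimensions, where the vMF normalizing constant reduces to kappa / (4 pi sinh(kappa)). The function name `vmf_logpdf_3d` is hypothetical, and the general-dimension case (which needs modified Bessel functions) is deliberately left out.

```python
import numpy as np

def vmf_logpdf_3d(x, mu, kappa):
    """Log-density of the von Mises-Fisher distribution on the sphere in 3-D.

    x: unit vectors with shape (..., 3); mu: unit mean direction, shape (3,);
    kappa: concentration parameter, kappa > 0.
    """
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    # log of kappa / (4 pi sinh(kappa)), with sinh rewritten to avoid overflow:
    # log sinh(kappa) = kappa + log1p(-exp(-2 kappa)) - log 2
    log_norm = (np.log(kappa) - np.log(2.0 * np.pi) - kappa
                - np.log1p(-np.exp(-2.0 * kappa)))
    return log_norm + kappa * (x @ mu)

mu = np.array([0.0, 0.0, 1.0])
log_density_at_mode = vmf_logpdf_3d(mu, mu, kappa=5.0)
log_density_opposite = vmf_logpdf_3d(-mu, mu, kappa=5.0)
```

The density is largest at `mu` and smallest at `-mu`; the two log-densities differ by exactly `2 * kappa`.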

On Sat, Nov 19, 2022 at 3:50 AM Daniel Schmitz <danielschmitzsiegen@googlemail.com> wrote:
[...]
I think that's reasonable.
-- Robert Kern

Hi SciPy,
I opened a PR for the von Mises-Fisher distribution: https://github.com/scipy/scipy/pull/17624
One thing needs more thorough discussion: I added a fit method. So far, no other multivariate distribution has a fit method, so the API potentially sets a precedent.
Currently it is implemented as fit(data) and returns the two distribution parameters `mu` and `kappa`. In principle, it would also be possible to let the user fix one of the parameters, similar to what can be done with univariate distributions. If you have any thoughts on this, please join the discussion in the PR.
Thanks!
On Sat, Nov 19, 2022 at 5:17 PM Robert Kern <robert.kern@gmail.com> wrote:
[...]
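
To show the shape of a fit(data) call that returns `mu` and `kappa`, here is a NumPy sketch. It is not the method from the PR: the function name `vmf_fit_approx` is hypothetical, and `kappa` is computed with the well-known closed-form approximation of Banerjee et al. (2005) rather than by solving the exact likelihood equation.

```python
import numpy as np

def vmf_fit_approx(data):
    """Approximate maximum-likelihood estimates (mu, kappa) for vMF data.

    data: unit vectors with shape (m, d), one observation per row.
    """
    data = np.asarray(data, dtype=float)
    d = data.shape[-1]
    resultant = data.mean(axis=0)
    R = np.linalg.norm(resultant)   # mean resultant length
    mu = resultant / R              # estimated mean direction
    # Closed-form approximation (Banerjee et al., 2005); the exact estimate
    # requires solving a Bessel-function ratio equation numerically.
    kappa = R * (d - R**2) / (1.0 - R**2)
    return mu, kappa

data = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
mu, kappa = vmf_fit_approx(data)
```

Fixing one parameter, as mentioned in the message, would amount to skipping the corresponding estimate and returning the user-supplied value instead.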

Last call to review the PR for the von Mises-Fisher distribution, which will otherwise be merged in a week. PR link: https://github.com/scipy/scipy/pull/17624
On Sat, Feb 4, 2023 at 12:03 PM Daniel Schmitz <danielschmitzsiegen@googlemail.com> wrote:
[...]
participants (4)
- Daniel Schmitz
- Matt Haberland
- Pamphile Roy
- Robert Kern