[SciPy-Dev] SciPy-Dev Digest, Vol 191, Issue 15

Edouard Goudenhoofdt egouden at gmail.com
Mon Sep 23 05:12:02 EDT 2019


Dear Lucas,

I want the ability to reuse the bin numbers for a new input dataset.
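
As a rough illustration of the idea (not the proposed implementation), the bin numbers returned by a first call can already be reused by hand for a sum statistic with np.bincount. The variable names and array sizes below are made up for the example, and the sketch assumes the default expand_binnumbers=False, where the bin numbers are linearized over a grid with two extra outlier bins per dimension:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.random((1000, 2))   # fixed sample points, shape (N, D)
values_t0 = rng.random(1000)     # values available at a first time
values_t1 = rng.random(1000)     # values available at a later time

# First call: computes bin edges and bin numbers for the sample points.
first = stats.binned_statistic_dd(sample, values_t0, statistic="sum", bins=4)

# Grid size per dimension, including the two outlier bins on each side.
nbin = tuple(len(edges) + 1 for edges in first.bin_edges)

# Reuse the bin numbers for the new values without binning the sample again.
flat = np.bincount(first.binnumber, weights=values_t1,
                   minlength=int(np.prod(nbin)))
reused = flat.reshape(nbin)[1:-1, 1:-1]   # drop the outlier bins

# Reference result from a full second call, for comparison only.
direct = stats.binned_statistic_dd(sample, values_t1, statistic="sum", bins=4)
assert np.allclose(reused, direct.statistic)
```

This only covers the sum; other statistics would need their own handling, which is what an optional argument on the function itself would avoid.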

Indeed one should already be able to compute several statistics at once
(and also for several datasets available at the same time).
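
For the case of several datasets available at the same time, a minimal sketch, assuming the documented behaviour that `values` may be a list of arrays with one statistic computed per array (names and sizes are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.random((500, 2))   # sample points shared by both datasets
values_a = rng.random(500)
values_b = rng.random(500)

# One call, two value arrays: the sample points are binned only once.
both = stats.binned_statistic_dd(sample, [values_a, values_b],
                                 statistic="mean", bins=3)

# Separate call for the second dataset, used here only as a cross-check.
only_b = stats.binned_statistic_dd(sample, values_b,
                                   statistic="mean", bins=3)

# The leading axis of both.statistic indexes the value arrays: (2, 3, 3).
assert both.statistic.shape == (2, 3, 3)
assert np.allclose(both.statistic[1], only_b.statistic, equal_nan=True)
```

This does not help when the values arrive at different times, which is the case the reuse of bin numbers is meant to address.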

I have a PR ready to submit.
Thank you for proposing to review it.

Best regards,

Edouard

On Wed, Sep 18, 2019 at 9:59 PM <rlucas7 at vt.edu> wrote:

>
> > On Sep 18, 2019, at 9:45 AM, scipy-dev-request at python.org wrote:
> >
> > Today's Topics:
> >
> >   1. Re: improvement to binned statistic (Ralf Gommers)
> >   2. Adding alpha complexes/filtrations to scipy.spatial?
> >      (Hamilton, Wesley)
> >   3. Re: Improvement to regular grid interpolation (Simon S. Clift)
> >
> >
> > ----------------------------------------------------------------------
> >
> > Message: 1
> > Date: Wed, 18 Sep 2019 15:02:17 +0200
> > From: Ralf Gommers <ralf.gommers at gmail.com>
> > To: SciPy Developers List <scipy-dev at python.org>
> > Subject: Re: [SciPy-Dev] improvement to binned statistic
> > Message-ID:
> >    <CABL7CQhHJ-qJmbNnmJeGYATLKZQZCc6z9EB-RivXxKBUo8pscA at mail.gmail.com>
> > Content-Type: text/plain; charset="utf-8"
> >
> > Hi Edouard,
> >
> >
> > On Wed, Sep 18, 2019 at 11:29 AM Edouard Goudenhoofdt <egouden at gmail.com> wrote:
> >
> >> Dear scipy developers,
> >>
> >> One could use scipy.stats.binned_statistic_dd for the same sample points
> >> but for values available at different times.
> >> Currently this involves the computation of the bin numbers every time the
> >> function is called.
> >> Therefore I would like to add an optional argument "binnumbers" to skip
> >> this step when calling the function again.
> >>
> >
> > That seems sensible. Could you check that creating the bin numbers really
> > takes the majority of the time? There's also a fair amount of input
> > validation that shouldn't be skipped even when a new `binnumbers` is passed
> > in. If that is the case, sending a PR with a benchmark would be very
> > welcome.
> >
> > Cheers,
> > Ralf
>
> IIUC, Edouard, what you’d like to do is take input data, run
> binned_statistic_dd(), and then do the same thing with the bin edges
> calculated from this first call, either on a new input dataset or on the
> same data (perhaps calculating a new statistic?).
>
> AFAIK the binned_statistic_dd() function isn’t able to take binedges as an
> argument. If you want multiple stats for the same data I think you can
> achieve that via a custom callable() that returns multiple statistics
> rather than a single scalar, but I haven’t done this so you should confirm
> that the approach would work fine.
>
> If you want to take that up I’m happy to review the PR.
>
> If not, and this is something others agree is useful and should be
> implemented, it seems reasonable to do. I can implement if you don’t have
> time or are otherwise unable to open a PR.
>
> Let me know either way.
>
> -Lucas Roberts
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>