Dear Lucas,
I want the ability to reuse the bin numbers for a new input dataset. Indeed one should already be able to compute several statistics at once (and also for several datasets available at the same time).
I have a PR ready to submit. Thank you for proposing to review it.
Best regards,
Edouard
On Wed, Sep 18, 2019 at 9:59 PM <rlucas7@vt.edu> wrote:
On Sep 18, 2019, at 9:45 AM, scipy-dev-request@python.org wrote:
Today's Topics:
1. Re: improvement to binned statistic (Ralf Gommers)
2. Adding alpha complexes/filtrations to scipy.spatial? (Hamilton, Wesley)
3. Re: Improvement to regular grid interpolation (Simon S. Clift)
----------------------------------------------------------------------
Message: 1
Date: Wed, 18 Sep 2019 15:02:17 +0200
From: Ralf Gommers <ralf.gommers@gmail.com>
To: SciPy Developers List <scipy-dev@python.org>
Subject: Re: [SciPy-Dev] improvement to binned statistic
Message-ID: <CABL7CQhHJ-qJmbNnmJeGYATLKZQZCc6z9EB-RivXxKBUo8pscA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi Edouard,
On Wed, Sep 18, 2019 at 11:29 AM Edouard Goudenhoofdt <egouden@gmail.com> wrote:
Dear scipy developers,
One could use scipy.stats.binned_statistic_dd for the same sample points but for values available at different times. Currently this involves the computation of the bin numbers every time the function is called. Therefore I would like to add an optional argument "binnumbers" to skip this step when calling the function again.
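To make the use case concrete, here is a minimal sketch of what reusing the bin numbers could look like today without any new argument. It assumes the "sum" statistic and the default `expand_binnumbers=False`, so the returned `binnumber` is a flat index into a grid that carries one extra outlier bin on each side of every dimension. The variable names (`values_t0`, `values_t1`, `reused`) are illustrative only, and this is not the proposed implementation.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.random((1000, 2))   # fixed sample points, shape (N, D)
values_t0 = rng.random(1000)     # values available at a first time
values_t1 = rng.random(1000)     # values available at a later time

# First call: computes the bin edges and the bin numbers for the sample.
res0 = stats.binned_statistic_dd(sample, values_t0, statistic="sum", bins=4)

# Current behaviour: a second call recomputes the same bin numbers.
res1 = stats.binned_statistic_dd(sample, values_t1, statistic="sum", bins=4)

# Hand-rolled reuse: reduce the new values with the stored bin numbers.
# Each dimension has len(edges) + 1 bins once the two outlier bins are counted.
shape = tuple(len(edges) + 1 for edges in res0.bin_edges)
flat = np.bincount(res0.binnumber, weights=values_t1,
                   minlength=int(np.prod(shape)))
reused = flat.reshape(shape)[1:-1, 1:-1]   # drop the outlier bins

print(np.allclose(reused, res1.statistic))
```

The manual reduction only covers "sum"; other statistics would need their own reduction over the same bin numbers.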
That seems sensible. Could you check that creating the bin numbers really takes the majority of the time? There's also a fair amount of input validation that shouldn't be skipped even when a new `binnumbers` is passed in. If that is the case, sending a PR with a benchmark would be very welcome.
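A rough timing sketch along these lines might help answer that question. It compares a full call with only the reduction step applied to precomputed bin numbers, so the difference is an upper bound on what could be saved: any input validation that is kept would eat into it. The array sizes, the "sum" statistic and the helper names `full_call` and `reduction_only` are arbitrary choices made for illustration.

```python
import timeit

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.random((200_000, 3))
values = rng.random(200_000)


def full_call():
    # Bin-number computation, input validation and reduction together.
    return stats.binned_statistic_dd(sample, values, statistic="sum", bins=10)


res = full_call()
binnumber = res.binnumber
# Flat grid size, counting the two outlier bins per dimension.
size = int(np.prod([len(edges) + 1 for edges in res.bin_edges]))


def reduction_only():
    # Only the reduction, with the bin numbers already known.
    return np.bincount(binnumber, weights=values, minlength=size)


t_full = min(timeit.repeat(full_call, number=3, repeat=3)) / 3
t_reduce = min(timeit.repeat(reduction_only, number=3, repeat=3)) / 3
print(f"full call:      {t_full * 1e3:.2f} ms")
print(f"reduction only: {t_reduce * 1e3:.2f} ms")
```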
Cheers, Ralf
IIUC, Edouard, what you’d like to do is take input data, run binned_statistic_dd(), and then do the same thing with the bin edges calculated from this first call, either on a new input dataset or on the same data (perhaps calculating a new statistic?).
AFAIK the binned_statistic_dd() function isn’t able to take bin edges as an argument. If you want multiple stats for the same data, I think you can achieve that via a custom callable that returns multiple statistics rather than a single scalar, but I haven’t done this, so you should confirm that the approach would work fine.
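One thing that may be worth checking here: the `bins` parameter is documented as also accepting a sequence of arrays describing the bin edges along each dimension, so the `bin_edges` returned by a first call can seemingly be passed back in for a second dataset. The sketch below tries this under that assumption. Note that it reuses the edges only; the bin numbers are still recomputed on the second call. The names `sample_a`, `sample_b` and so on are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# First dataset: let the function choose the edges from the data range.
sample_a = rng.random((500, 2))
values_a = rng.random(500)
res_a = stats.binned_statistic_dd(sample_a, values_a, statistic="mean", bins=5)

# Second dataset: pass the edges from the first call as the bin specification.
sample_b = rng.random((300, 2))
values_b = rng.random(300)
res_b = stats.binned_statistic_dd(sample_b, values_b, statistic="count",
                                  bins=res_a.bin_edges)

# Both results should now share the same grid.
same_edges = all(np.array_equal(ea, eb)
                 for ea, eb in zip(res_a.bin_edges, res_b.bin_edges))
print(same_edges, res_b.statistic.shape)
```

Points of the second dataset that fall outside the first dataset's edges land in the outlier bins and are left out of the returned statistic.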
If you want to take that up I’m happy to review the PR.
If not, and this is something others agree is useful and should be implemented, it seems reasonable to do. I can implement it if you don’t have time or are otherwise unable to open a PR.
Let me know either way.
-Lucas Roberts
_______________________________________________
SciPy-Dev mailing list
SciPy-Dev@python.org
https://mail.python.org/mailman/listinfo/scipy-dev