[Numpy-discussion] PR to add a function to calculate histogram edges without calculating the histogram

Nathaniel Smith njs at pobox.com
Thu Mar 15 23:13:47 EDT 2018


Instead of an nobs argument, maybe we should have a version that accepts
multiple data sets, so that we have the full information and can improve
the algorithm over time.
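
As a rough sketch of the idea (not a proposed NumPy API), a multi-dataset
version could take the range from all data sets together while choosing the
bin count from the average number of observations per data set, which is the
concern raised further down the thread. The names `sturges_bin_count` and
`shared_bin_edges` are hypothetical, Sturges' rule stands in for whatever
estimator would really be used, and the code is kept to the standard library
so the arithmetic stays visible:

```python
import math


def sturges_bin_count(n_obs):
    """Sturges' rule: ceil(log2(n)) + 1 bins for n observations (n >= 1)."""
    return math.ceil(math.log2(n_obs)) + 1


def shared_bin_edges(datasets):
    """Hypothetical helper: one set of bin edges for several data sets.

    The range spans every data set, but the bin count is based on the
    average number of observations per data set, not the stacked total.
    """
    lo = min(min(d) for d in datasets)
    hi = max(max(d) for d in datasets)
    mean_n = sum(len(d) for d in datasets) / len(datasets)
    n_bins = sturges_bin_count(mean_n)
    width = (hi - lo) / n_bins
    # n_bins equal-width bins need n_bins + 1 edges; pin the last edge to hi.
    return [lo + i * width for i in range(n_bins)] + [hi]


a = list(range(64))
b = [x * 0.5 for x in range(64)]
edges = shared_bin_edges([a, b])
print(len(edges) - 1)                       # bins from the average n
print(sturges_bin_count(len(a) + len(b)))   # bins if the data were stacked
```

Because the function sees every data set rather than a single nobs number,
the rule inside it can later be swapped for something better without
changing the call signature.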

On Mar 15, 2018 7:57 PM, "Thomas Caswell" <tcaswell at gmail.com> wrote:

> Yes I like the name.
>
> The primary use-case for Matplotlib is that our `hist` method can take in
> a list of arrays and produce N histograms in one shot. Currently with
> 'auto' we only use the first data set to sort out what the bins should be
> and then re-use those for the rest of the data sets.  This will let us get
> the bins on the merged input, but I take Josef's point that this is not
> actually what we want....
>
> Tom
>
> On Mon, Mar 12, 2018 at 11:35 PM <josef.pktd at gmail.com> wrote:
>
>> On Mon, Mar 12, 2018 at 11:20 PM, Eric Wieser
>> <wieser.eric+numpy at gmail.com> wrote:
>> >> Given that the bin selection is data driven, transferring the bins
>> >> across datasets might not be so useful.
>> >
>> > The main application would be to compute bins across the union of all
>> > datasets. This is already possible by using `np.histogram` and
>> > discarding the first result, but that's super wasteful.
>>
>> assuming "union" means a combined dataset.
>>
>> If you stack datasets, then the number of observations will not be
>> correct for individual datasets.
>>
>> In that case an additional keyword like nobs, or whatever name would
>> be appropriate for numpy, would be useful, e.g. use the average number
>> of observations across datasets.
>> Auxiliary statistics like std could then be computed on the total
>> dataset (if that makes sense, which would not be the case if the
>> variance across datasets is larger than the variance within datasets).
>>
>> Josef
>>
>> > _______________________________________________
>> > NumPy-Discussion mailing list
>> > NumPy-Discussion at python.org
>> > https://mail.python.org/mailman/listinfo/numpy-discussion