<div dir="auto">Oh sure, I'm not suggesting it be impossible to calculate for a single data set. If nothing else, if we had a version that accepted a list of data sets, then you could always pass in a single-element list :-).</div><div class="gmail_extra"><br><div class="gmail_quote">On Mar 15, 2018 22:10, "Eric Wieser" <<a href="mailto:wieser.eric%2Bnumpy@gmail.com">wieser.eric+numpy@gmail.com</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>That sounds like a reasonable extension - but I think there still exist cases where you want to treat the data as one uniform set when computing bins (toggling between orthogonal subsets of data) so isn't really a useful replacement.</div><div><br></div><div>I suppose this becomes relevant when `density` is passed to the individual histogram invocations. Does matplotlib handle that correctly for stacked histograms?</div><div><br><div class="gmail_quote"><div dir="ltr">On Thu, Mar 15, 2018, 20:14 Nathaniel Smith <<a href="mailto:njs@pobox.com" target="_blank">njs@pobox.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="auto">Instead of an nobs argument, maybe we should have a version that accepts multiple data sets, so that we have the full information and can improve the algorithm over time.</div><div class="gmail_extra"><br><div class="gmail_quote">On Mar 15, 2018 7:57 PM, "Thomas Caswell" <<a href="mailto:tcaswell@gmail.com" target="_blank">tcaswell@gmail.com</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Yes I like the name.<div><br></div><div>The primary use-case for Matplotlib is that our `hist` method can take in a list of arrays and produces N histograms in one shot. 
Currently with 'auto' we only use the first data set to sort out what the bins should be and then re-use those for the rest of the data sets. This will let us get the bins on the merged input, but I take Josef's point that this is not actually what we want....</div><div><br></div><div>Tom</div></div><br><div class="gmail_quote"><div dir="ltr">On Mon, Mar 12, 2018 at 11:35 PM <<a href="mailto:josef.pktd@gmail.com" target="_blank">josef.pktd@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Mon, Mar 12, 2018 at 11:20 PM, Eric Wieser<br>
<<a href="mailto:wieser.eric%2Bnumpy@gmail.com" target="_blank">wieser.eric+numpy@gmail.com</a>> wrote:<br>
>> Given that the bin selections are data driven, transferring them across datasets might not be so useful.<br>
><br>
> The main application would be to compute bins across the union of all<br>
> datasets. This is already possible by using `np.histogram` and<br>
> discarding the first result, but that's super wasteful.<br>
<br>
assuming "union" means a combined dataset.<br>
<br>
If you stack datasets, then the number of observations will not be<br>
correct for individual datasets.<br>
<br>
In that case an additional keyword like nobs, or whatever name would<br>
be appropriate for numpy, would be useful, e.g. use the average number<br>
of observations across datasets.<br>
Auxiliary statistics like std could then be computed on the total<br>
dataset (if that makes sense, which would not be the case if the<br>
variance across datasets is larger than the variance within datasets).<br>
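<br>
A rough sketch of the nobs idea above, not NumPy's implementation: the range is taken from the combined data, while the bin count is driven either by the total or by the average number of observations per dataset.<br>
Sturges' rule stands in here for whatever estimator would actually be used, and `sturges_edges` and `shared_edges` are hypothetical helper names.<br>

```python
import math


def sturges_edges(lo, hi, nobs):
    """Equal-width bin edges over [lo, hi], with the bin count from Sturges' rule.

    Sturges' rule is only a stand-in estimator for this illustration.
    """
    n_bins = math.ceil(math.log2(nobs)) + 1
    # Compute each edge directly so the last edge lands exactly on `hi`.
    return [lo + (hi - lo) * i / n_bins for i in range(n_bins + 1)]


def shared_edges(datasets, use_average_nobs=True):
    """Bin edges shared by all datasets (hypothetical helper).

    The range always comes from the merged data. The number of observations
    fed to the estimator is either the average per dataset (the "nobs"
    suggestion) or the total size of the merged data.
    """
    merged = [x for d in datasets for x in d]
    lo, hi = min(merged), max(merged)
    if use_average_nobs:
        nobs = len(merged) / len(datasets)
    else:
        nobs = len(merged)
    return sturges_edges(lo, hi, nobs)


# Two overlapping datasets of 64 points each; merged range is 0..95.
datasets = [list(range(0, 64)), list(range(32, 96))]

edges_total = shared_edges(datasets, use_average_nobs=False)  # nobs = 128
edges_avg = shared_edges(datasets, use_average_nobs=True)     # nobs = 64

print(len(edges_total) - 1, "bins using the total number of observations")
print(len(edges_avg) - 1, "bins using the average number of observations")
```

With these two 64-point datasets, the total count gives 8 bins and the average count gives 7, both over the same combined range.<br>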
<br>
Josef<br>
<br>
> ______________________________<wbr>_________________<br>
> NumPy-Discussion mailing list<br>
> <a href="mailto:NumPy-Discussion@python.org" target="_blank">NumPy-Discussion@python.org</a><br>
> <a href="https://mail.python.org/mailman/listinfo/numpy-discussion" rel="noreferrer" target="_blank">https://mail.python.org/<wbr>mailman/listinfo/numpy-<wbr>discussion</a><br>
</blockquote></div>
<br></blockquote></div></div>
</blockquote></div></div>
<br></blockquote></div></div>