[Numpy-discussion] Changing the return type of np.histogramdd

Ralf Gommers ralf.gommers at gmail.com
Thu Apr 26 00:56:25 EDT 2018


On Mon, Apr 9, 2018 at 10:24 PM, Eric Wieser <wieser.eric+numpy at gmail.com>
wrote:

> Numpy has three histogram functions - histogram, histogram2d, and
> histogramdd.
>
> histogram is by far the most widely used, and in the absence of weights
> and normalization, returns an np.intp count for each bin.
>
> histogramdd (for which histogram2d is a wrapper) returns np.float64 in
> all circumstances.
>
> As a contrived comparison
>
> >>> x = np.linspace(0, 1)
> >>> h, e = np.histogram(x*x, bins=4); h
> array([25, 10,  8,  7], dtype=int64)
> >>> h, e = np.histogramdd((x*x,), bins=4); h
> array([25., 10.,  8.,  7.])
>
> https://github.com/numpy/numpy/issues/7845 tracks this inconsistency.
>
> The fix is now trivial: the question is, will changing the return type
> break people’s code?
>
> Either we should:
>
>    1. Just change it, and hope no one is broken by it
>    2. Add a dtype argument:
>       - If dtype=None, behave like np.histogram
>       - If dtype is not specified, emit a future warning recommending to
>       use dtype=None or dtype=float
>       - In future, change the default to None
>    3. Create a new better-named function histogram_nd, which can also be
>    created without the mistake that is
>    https://github.com/numpy/numpy/issues/10864.
>
> Thoughts?
>

(1) seems like a no-go; taking such risks isn't justified by a minor
inconsistency.

(2) is still fairly intrusive: you're emitting warnings for everyone and
still forcing people to change their code (and if they don't, they may run
into a backwards-compat break).
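
To make the cost of (2) concrete, here is a rough sketch of what such a
deprecation path might look like when written as a wrapper. The wrapper name
histogramdd_compat and the _NOT_GIVEN sentinel are hypothetical and are not
NumPy API; the dtype=None branch only approximates np.histogram's rule.

```python
import warnings

import numpy as np

_NOT_GIVEN = object()  # hypothetical sentinel meaning "dtype was not passed"


def histogramdd_compat(sample, bins=10, range=None, weights=None,
                       dtype=_NOT_GIVEN):
    """Hypothetical wrapper sketching option (2); not part of NumPy."""
    hist, edges = np.histogramdd(sample, bins=bins, range=range,
                                 weights=weights)
    if dtype is _NOT_GIVEN:
        # Every caller that does not pass dtype sees this warning.
        warnings.warn(
            "the default dtype will change; pass dtype=None or dtype=float",
            FutureWarning, stacklevel=2)
        dtype = np.float64  # keep the old float64 result for now
    elif dtype is None:
        # Simplified stand-in for np.histogram: integer counts when
        # unweighted, float64 otherwise (an assumption of this sketch).
        dtype = np.intp if weights is None else np.float64
    return hist.astype(dtype, copy=False), edges
```

In this sketch, any existing call such as histogramdd_compat((x*x,), bins=4)
warns until the caller adds dtype=None or dtype=float.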

(3) is the best of these options, but is this really worth a new
function? My vote would be "do nothing".
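
For reference, under "do nothing" a caller who wants integer counts from an
unweighted histogramdd call can cast the result explicitly. A minimal sketch,
reusing the contrived example quoted above (the variable names are
illustrative only):

```python
import numpy as np

x = np.linspace(0, 1)

# Unweighted bin counts are whole numbers, so the cast loses no information.
h, edges = np.histogramdd((x * x,), bins=4)
counts = h.astype(np.intp)

# Sanity check that the cast did not change any value.
lossless = np.array_equal(counts, h)
```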

Ralf