[Numpy-discussion] Optimized np.digitize for equidistant bins

Joseph Fox-Rabinovitz jfoxrabinovitz at gmail.com
Fri Dec 18 12:15:43 EST 2020


The bin index is just the value, offset by the first bin edge, floor-divided
by the bin width.
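
Something along these lines, as a rough sketch rather than a drop-in
replacement. The function name digitize_uniform and its start/width
parameters are made up for illustration and are not part of NumPy. The +1
is there only to match np.digitize's default convention (0 for values below
the first edge, len(bins) for values at or beyond the last one).

```python
import numpy as np


def digitize_uniform(x, start, width):
    """Hypothetical helper: 1-based bin indices for equal-width bins.

    `start` is the left edge of the first bin and `width` is the common
    bin width. Neither name comes from NumPy.
    """
    x = np.asarray(x, dtype=float)
    # Floor division gives the 0-based bin number; add 1 so the result
    # lines up with np.digitize(x, bins) for increasing bins.
    return np.floor_divide(x - start, width).astype(np.intp) + 1


# 8 bins of width 0.125 on [0, 1]; the width is exactly representable,
# which keeps this small check free of rounding surprises.
bins = np.linspace(0, 1, 9)
x = np.array([0.1, 0.3, 0.55, 0.9])
idx = digitize_uniform(x, bins[0], bins[1] - bins[0])
```

For these inputs the result agrees with np.digitize(x, bins). For widths
that are not exactly representable in floating point (0.05, say), values
sitting right on a bin edge may land one bin off compared to searchsorted,
so it is worth checking the edges if that matters for your use case.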

On Fri, Dec 18, 2020, 09:59 Martín Chalela <tinchochalela at gmail.com> wrote:

> Hi all! I was wondering if there is a way around using np.digitize when
> dealing with equidistant bins. For example:
> bins = np.linspace(0, 1, 20)
>
> The main problem I encountered is that digitize calls np.searchsorted.
> This is the correct way, I think, for generic bins, i.e. bins that have
> different widths. However, in the special, but not uncommon, case of
> equidistant bins, the searchsorted call can be very expensive and
> unnecessary. One can perform a simple calculation like the following:
>
> def digitize_eqbins(x, bins):
>     """
>     Return the indices of the bins to which each value in input array belongs.
>     Assumes equidistant bins.
>     """
>     nbins = len(bins) - 1
>     # np.int was a deprecated alias for the builtin int; use int directly
>     digit = (nbins * (x - bins[0]) / (bins[-1] - bins[0])).astype(int)
>     return digit + 1
>
> Is there a better way of computing this for equidistant bins?
>
> Thank you!
> Martin.