[Numpy-discussion] Optimized np.digitize for equidistant bins

Martín Chalela tinchochalela at gmail.com
Fri Dec 18 14:36:57 EST 2020


Right! I just thought there would/should be a "digitize" function that did
this.
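
For reference, here is a minimal sketch of the floor-division approach
suggested in the quoted reply below, checked against np.digitize. The helper
name digitize_uniform and its parameters (start, width, nbins) are made up
for illustration and are not part of NumPy. It assumes equal-width,
increasing bins and the default right=False convention. Values that fall
exactly on a bin edge may land in a neighbouring bin because of
floating-point rounding, so treat it as an approximation of np.digitize
rather than a drop-in replacement.

```python
import numpy as np


def digitize_uniform(x, start, width, nbins):
    """Hypothetical helper: np.digitize-style indices for equal-width bins.

    Bin i (1-based) covers [start + (i - 1) * width, start + i * width).
    Values below `start` map to 0; values past the last edge map to nbins + 1.
    """
    x = np.asarray(x, dtype=float)
    # Floor (not truncate) so values just below `start` get a negative index.
    idx = np.floor((x - start) / width).astype(np.intp) + 1
    # Collapse everything outside the bin range to 0 or nbins + 1.
    return np.clip(idx, 0, nbins + 1)


bins = np.linspace(0, 1, 20)              # 20 edges -> 19 bins
nbins = len(bins) - 1
width = (bins[-1] - bins[0]) / nbins

# Sample values chosen away from interior bin edges.
x = np.array([-0.3, 0.0, 0.26, 0.51, 0.99, 1.7])

fast = digitize_uniform(x, bins[0], width, nbins)
ref = np.digitize(x, bins)
print(fast, ref, np.array_equal(fast, ref))
```

The cost is one subtraction, one division and one floor per element, so it
does not depend on the number of bins, whereas searchsorted does a binary
search over the edges for every value.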

On Fri, Dec 18, 2020 at 14:16, Joseph Fox-Rabinovitz (<
jfoxrabinovitz at gmail.com>) wrote:

> The bin index is just the value floor-divided by the bin size.
>
> On Fri, Dec 18, 2020, 09:59 Martín Chalela <tinchochalela at gmail.com>
> wrote:
>
>> Hi all! I was wondering if there is a way around using np.digitize
>> when dealing with equidistant bins. For example:
>> bins = np.linspace(0, 1, 20)
>>
>> The main problem I encountered is that digitize calls np.searchsorted.
>> This is the correct way, I think, for generic bins, i.e. bins that have
>> different widths. However, in the special, but not uncommon, case of
>> equidistant bins, the searchsorted call can be very expensive and
>> unnecessary. One can perform a simple calculation like the following:
>>
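>> import numpy as np
>>
>> def digitize_eqbins(x, bins):
>>     """
>>     Return the indices of the bins to which each value in the input array
>>     belongs. Assumes equidistant bins.
>>     """
>>     nbins = len(bins) - 1
>>     # The builtin int replaces the deprecated np.int alias.
>>     # Note: astype(int) truncates toward zero rather than flooring.
>>     digit = (nbins * (x - bins[0]) / (bins[-1] - bins[0])).astype(int)
>>     return digit + 1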
>>
>> Is there a better way of computing this for equidistant bins?
>>
>> Thank you!
>> Martin.
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>

