[Numpy-discussion] Rewrite np.histogram in c?

Robert McGibbon rmcgibbo at gmail.com
Mon Mar 16 02:19:59 EDT 2015


My apologies for the typo: 'implements' -> 'implementations'

-Robert

On Sun, Mar 15, 2015 at 11:06 PM, Robert McGibbon <rmcgibbo at gmail.com>
wrote:

> It might make sense to dispatch to different C implements if the bins are
> equally spaced (as created by passing an integer for the np.histogram bins
> argument) vs. non-equally-spaced bins.
>
> In that case, getting the bigger speedup may be easier, at least for one
> common use case.
>
> -Robert
>
> On Sun, Mar 15, 2015 at 11:00 PM, Jaime Fernández del Río <
> jaime.frio at gmail.com> wrote:
>
>> On Sun, Mar 15, 2015 at 9:32 PM, Robert McGibbon <rmcgibbo at gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> numpy.histogram is implemented in Python, and is a little sluggish. This
>>> has been discussed previously on the mailing list [1, 2]. It came up in a
>>> project that I maintain, where a new feature is bottlenecked by
>>> numpy.histogram, and one developer suggested a faster implementation in
>>> Cython [3].
>>>
>>> Would it make sense to reimplement this function in C? Or Cython? Is
>>> moving functions like this from Python to C to improve performance within
>>> the scope of the development roadmap for NumPy? I started implementing this
>>> a little bit in C [4], but I figured I should check in here first.
>>>
>>
>> Where do you think the performance gains will come from? The PR in your
>> project that claims a 10x speed-up uses a method that is only fit for
>> equally spaced bins. I want to think that implementing that exact same
>> algorithm in Python with NumPy would be comparably fast, say within 2x.
>>
>> For the general case, NumPy is already doing most of the heavy lifting
>> (the sorting and the searching) in C: simply replicating the same
>> algorithmic approach entirely in C is unlikely to provide any major
>> speed-up. And if the change is to the algorithm, then we should first try
>> it out in Python.
>>
>> That said, if you can speed things up 10x, I don't think there is going
>> to be much opposition to moving it to C!
>>
>> Jaime
>>
>> --
>> (\__/)
>> ( O.o)
>> ( > <) This is Bunny. Copy Bunny into your signature and help him with his
>> plans for world domination.
>>
>
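To make the two cases discussed above concrete, here is a minimal pure-NumPy sketch of the dispatch idea: a direct index computation when the bins are equally spaced, and a binary search against the edges otherwise. The function names (histogram_uniform, histogram_edges, histogram_sketch) are hypothetical and chosen only for illustration. This is not the code from the linked PR and not necessarily how np.histogram works internally. It assumes hi > lo, ignores weights and density, and may disagree with np.histogram for samples that fall within floating-point rounding of a bin edge.

```python
import numpy as np


def histogram_uniform(a, bins, range_):
    """Counts for `bins` equal-width bins spanning range_ = (lo, hi), hi > lo."""
    lo, hi = range_
    a = np.asarray(a, dtype=float).ravel()
    a = a[(a >= lo) & (a <= hi)]            # drop out-of-range samples
    # Map each sample straight to a bin index: O(n), no sort and no search.
    idx = ((a - lo) * (bins / (hi - lo))).astype(np.intp)
    idx[idx == bins] = bins - 1             # right edge belongs to the last bin
    return np.bincount(idx, minlength=bins)


def histogram_edges(a, edges):
    """Counts for arbitrary monotonically increasing bin edges."""
    edges = np.asarray(edges, dtype=float)
    a = np.asarray(a, dtype=float).ravel()
    a = a[(a >= edges[0]) & (a <= edges[-1])]
    # Binary search of every sample in the edges: O(n log m) for m edges.
    idx = np.searchsorted(edges, a, side="right") - 1
    idx[idx == len(edges) - 1] = len(edges) - 2   # include the rightmost edge
    return np.bincount(idx, minlength=len(edges) - 1)


def histogram_sketch(a, bins, range_=None):
    """Dispatch on the type of `bins`: integer -> uniform path, else edges."""
    if np.ndim(bins) == 0:
        a = np.asarray(a, dtype=float)
        if range_ is None:
            range_ = (a.min(), a.max())     # assumes a is non-empty
        return histogram_uniform(a, int(bins), range_)
    return histogram_edges(a, bins)
```

Timing histogram_uniform against np.histogram on the same data would be one way to check whether the equal-width algorithm in plain NumPy lands within the 2x suggested above, before committing to a C implementation.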