[Numpy-discussion] Rewrite np.histogram in c?

Robert McGibbon rmcgibbo at gmail.com
Mon Mar 16 02:06:43 EDT 2015


It might make sense to dispatch to different C implementations depending
on whether the bins are equally spaced (as created by passing an integer
for the np.histogram bins argument) or not.

In that case, getting the bigger speedup may be easier, at least for one
common use case.
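
A minimal sketch of that dispatch, written in Python with NumPy rather than
C so the idea can be tried out first. The names hist_uniform, hist_edges,
histogram_sketch and value_range are hypothetical, and this is not NumPy's
actual implementation. It assumes non-empty, finite input and hi > lo, and
the arithmetic path may place a value that lies within rounding error of a
bin edge in a different bin than np.histogram would.

```python
import numpy as np


def hist_uniform(a, nbins, lo, hi):
    """Counts for nbins equal-width bins on [lo, hi] (assumes hi > lo)."""
    a = np.asarray(a, dtype=float).ravel()
    a = a[(a >= lo) & (a <= hi)]  # drop values outside the range
    # Equal-width bins: the bin index is pure arithmetic, no sort or search.
    idx = ((a - lo) * (nbins / (hi - lo))).astype(np.intp)
    # Values equal to hi (or rounded up to it) belong to the last bin.
    idx = np.minimum(idx, nbins - 1)
    return np.bincount(idx, minlength=nbins)


def hist_edges(a, edges):
    """Counts for arbitrary increasing bin edges, via binary search."""
    a = np.asarray(a, dtype=float).ravel()
    edges = np.asarray(edges, dtype=float)
    a = a[(a >= edges[0]) & (a <= edges[-1])]
    # side="right" gives half-open bins [e_i, e_{i+1}); shift to 0-based.
    idx = np.searchsorted(edges, a, side="right") - 1
    # The last bin is closed on the right, so fold a == edges[-1] into it.
    idx = np.minimum(idx, len(edges) - 2)
    return np.bincount(idx, minlength=len(edges) - 1)


def histogram_sketch(a, bins=10, value_range=None):
    """Dispatch on the type of bins: integer -> uniform path, else search."""
    a = np.asarray(a, dtype=float).ravel()
    if np.ndim(bins) == 0:
        if value_range is None:
            lo, hi = a.min(), a.max()
        else:
            lo, hi = value_range
        return hist_uniform(a, int(bins), float(lo), float(hi))
    return hist_edges(a, bins)
```

Calling histogram_sketch(x, bins=20) takes the arithmetic path, while
histogram_sketch(x, bins=some_edge_array) falls back to the binary search.
Only the first avoids the per-value search, which is where the larger
speedup would be expected.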

-Robert

On Sun, Mar 15, 2015 at 11:00 PM, Jaime Fernández del Río <
jaime.frio at gmail.com> wrote:

> On Sun, Mar 15, 2015 at 9:32 PM, Robert McGibbon <rmcgibbo at gmail.com>
> wrote:
>
>> Hi,
>>
>> numpy.histogram is implemented in Python, and is a little sluggish. This
>> has been discussed previously on the mailing list [1, 2]. It came up in a
>> project that I maintain, where a new feature is bottlenecked by
>> numpy.histogram, and one developer suggested a faster implementation in
>> Cython [3].
>>
>> Would it make sense to reimplement this function in C? Or Cython? Is
>> moving functions like this from Python to C to improve performance within
>> the scope of the development roadmap for NumPy? I started implementing this
>> a little bit in C [4], but I figured I should check in here first.
>>
>
> Where do you think the performance gains will come from? The PR in your
> project that claims a 10x speed-up uses a method that is only fit for
> equally spaced bins. I want to think that implementing that exact same
> algorithm in Python with NumPy would be comparably fast, say within 2x.
>
> For the general case, NumPy is already doing most of the heavy lifting
> (the sorting and the searching) in C: simply replicating the same
> algorithmic approach entirely in C is unlikely to provide any major
> speed-up. And if the change is to the algorithm, then we should first try
> it out in Python.
>
> That said, if you can speed things up 10x, I don't think there is going to
> be much opposition to moving it to C!
>
> Jaime
>
> --
> (\__/)
> ( O.o)
> ( > <) This is Bunny. Copy Bunny into your signature and help him with his
> plans for world domination.